CoreWeave MCP for AI. Orchestrate full-stack AI infrastructure via conversation.

Works with every AI agent you already use: Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client.

CoreWeave (AI GPU Cloud) MCP lets you manage specialized, high-performance GPU infrastructure using natural language. You can provision entire clusters, set up secure network boundaries with VPCs, and orchestrate model inference gateways—all without leaving your AI agent environment.

What your AI can do

Provision AI Clusters

Create and manage dedicated GPU clusters optimized for intensive machine learning and large-scale AI workloads.

Isolate Network Segments

Build secure Virtual Private Clouds (VPCs) to keep your compute resources separated from other networks.

Deploy Model Endpoints

Set up and manage inference gateways that handle traffic routing and authentication for deployed AI models.

Monitor Infrastructure Health

Check the status of clusters, deployments, and network resources to ensure everything is running optimally.

Manage Resource Lifecycles

Perform full operations—creation, updating, and deletion—across all core compute and networking components.

CoreWeave (AI GPU Cloud) MCP: 24 Tools

This collection of tools allows you to perform every operation needed for AI cloud infrastructure, managing everything from core networking to model deployment endpoints.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using CoreWeave (AI GPU Cloud) on Vinkius

Get Vpc

Fetches the current configuration and status of a specified VPC.

Create Capacity Claim

Requests a new amount of capacity needed for running inference models.

Create Cluster

Builds and provisions a new Kubernetes cluster optimized for AI workloads.

Update Cluster

Changes configurations for a live Kubernetes cluster, like scaling node counts.

Update Deployment

Adjusts parameters of an existing inference deployment, maybe changing the model...

Update Gateway

Modifies rules or routing policies on an inference gateway.

Update Vpc

Changes network settings, like CIDR blocks, for a VPC.

Create Deployment

Sets up a new instance where an already trained model can receive traffic...

Create Gateway

Establishes a new entry point to route and verify access for AI services.

Create Vpc

Creates a brand new, secure network boundary (VPC) for your resources.

Delete Capacity Claim

Removes an existing request for inference capacity.

Delete Cluster

Decommissions and removes a Kubernetes cluster.

Delete Deployment

Takes down a deployed AI model endpoint.

Delete Gateway

Removes an inference gateway, stopping all traffic routing through it.

Delete Vpc

Permanently deletes a Virtual Private Cloud network boundary.

Get Cluster

Retrieves specific details about an existing Kubernetes cluster.

List Capacity Claims

Lists all active requests for inference capacity.

List Clusters

Retrieves a list of all managed Kubernetes clusters.

List Deployments

Shows a catalog of all currently running model deployments.

List Gateways

Lists every inference gateway configured for traffic routing.

List Vpcs

Retrieves a summary of all existing network VPCs.

Query Logs

Queries historical logs from the system for debugging purposes.

Query Metrics

Retrieves performance metrics data points (e.g., CPU, GPU utilization).

Update Capacity Claim

Modifies the size or parameters of an existing capacity claim.

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so there is no routing to configure yourself.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The CoreWeave integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "coreweave-ai-gpu-cloud": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.
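As a rough sketch, the same entry could be generated from a script instead of being edited by hand. The key names mcp_servers and serverUrl and the URL pattern are copied from the snippet above; the helper name antigravity_config and the token check are illustrative assumptions, not part of any Vinkius or Antigravity tooling.

```python
import json

ENDPOINT_TEMPLATE = "https://edge.vinkius.com/{token}/mcp"


def antigravity_config(token: str) -> dict:
    """Build the MCP connection entry shown above for a given token.

    Hypothetical helper: only the key names and the URL pattern come
    from this page; the validation rule is an assumption.
    """
    if not token or "/" in token:
        raise ValueError("token must be a non-empty string without '/'")
    return {
        "mcp_servers": {
            "coreweave-ai-gpu-cloud": {
                "serverUrl": ENDPOINT_TEMPLATE.format(token=token),
            }
        }
    }


if __name__ == "__main__":
    # Prints JSON that could be pasted into the agent's configuration file.
    print(json.dumps(antigravity_config("example-token"), indent=2))
```

Keeping the token out of the template until the last step makes it easier to load it from an environment variable rather than committing it to a file.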

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the CoreWeave tools with full Vinkius guardrails applied.

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"coreweave-ai-gpu-cloud": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

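Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius; the key "coreweave" is an arbitrary label for this server
client = MultiServerMCPClient({"coreweave": {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable_http",
}})
tools = asyncio.run(client.get_tools())  # get_tools() is a coroutine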

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

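from crewai import Agent
from crewai_tools import MCPServerAdapter  # pip install "crewai-tools[mcp]"

# Connect securely to Vinkius
server_params = {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable-http",
}

# Assign the discovered tools to an Agent (goal and backstory are placeholders)
with MCPServerAdapter(server_params) as vinkius_tools:
    researcher = Agent(
        role='Data Researcher',
        goal='Inspect CoreWeave infrastructure on request',
        backstory='An operations-minded assistant.',
        tools=vinkius_tools,
    )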

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on every call
Real time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with CoreWeave (AI GPU Cloud), then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by CoreWeave. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 24 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.
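For orientation, an MCP tool invocation travels as a JSON-RPC 2.0 request whose method is tools/call. The sketch below only builds such a payload and sends nothing. The tool name list_clusters comes from this page; the empty arguments object and the request id are assumptions, since the tool's input schema is not documented here.

```python
import json
from typing import Any


def tools_call_request(name: str, arguments: dict[str, Any], request_id: int = 1) -> dict[str, Any]:
    """Build the JSON-RPC 2.0 envelope an MCP client uses to call a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Assumption: list_clusters is called here without arguments.
payload = tools_call_request("list_clusters", {})
print(json.dumps(payload, indent=2))
```

In practice the AI client builds and sends this envelope for you; seeing it spelled out mainly helps when reading logs or debugging a connection.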

Managing AI infrastructure means juggling too many moving parts.

Right now, spinning up an ML environment is a nightmare of tabs. You have to jump into the cloud console to set up the network boundaries (the VPC), then switch to another toolset just for compute clusters, and finally configure traffic routing with dedicated gateway tools. It's slow, it means copy-pasting IDs everywhere, and if you miss one step, your whole deployment fails.

With this MCP, you talk to your agent like talking to a teammate. You simply tell it what the final goal is—say, 'I need production access for Model Y.' The system handles the sequence: creating the VPC, provisioning the cluster, setting up the gateway, and deploying the model. You just get the functional endpoint.

You Get Full Control Over Your Infrastructure With CoreWeave MCP

The tedious manual steps of creating a network boundary, setting up resource quotas, and then linking them to an inference deployment all disappear. You don't manage the state transitions; the agent does.

You gain reliable automation across your entire stack. It’s not just about running commands; it’s about guaranteeing that every component—from `create_vpc` to `update_gateway`—is configured correctly and in order.

Support: 24/7, support@vinkius.com

Security: Vinkius Trust Center

SLA: Service Level Agreement

What your AI can actually do with this

Need to run a big ML job or deploy a new model? This MCP lets you manage all the underlying hardware—the GPU compute power and networking—directly through conversation. You can tell your agent to set up an isolated network for your test models, then spin up a dedicated cluster optimized for training, and finally route traffic to the finished service using inference gateways.

It’s about making sure your entire AI stack runs reliably at scale. The process covers everything from creating a VPC to monitoring resource usage; you just talk to it. This capability is hosted on Vinkius, giving you access to this essential cloud control panel right alongside hundreds of other specialized MCP tools.

CoreWeave MCP - Orchestrate AI GPU Clusters. Built, hosted, and managed by Vinkius.

Server ID: 019e5d0c-b044-7059-bb3b-b8a390f1b41f

Vinkius Inspector: Compliance Grade A+, Score 100/100

What Changes When You Connect

You don't have to manually jump between consoles. You can ask your agent to list all clusters, then check metrics for a specific deployment, and finally create a new VPC—all in one chat session.

The toolset gives you full lifecycle control. Need to scale? Use update_cluster. Finished with a test environment? Use the delete tools (delete_deployment, delete_gateway, delete_cluster, delete_vpc) to clean up and prevent cloud waste.

Debugging is faster. If an endpoint fails, use query_logs and query_metrics immediately after listing the deployments via list_deployments. You get context instantly.

Security configuration becomes simple. Instead of manual firewall rules, you just tell your agent to create_vpc, guaranteeing network isolation for sensitive models.

Deployment complexity drops. From creating a new VPC to setting up a gateway and deploying the model, every step is handled with one set of commands.

See it in action

01

The Model Promotion Pipeline

A research team finishes training a model on an existing cluster. They need to move it into production. The agent first uses list_clusters to find the target environment, then calls create_deployment using the new model artifact, and finally establishes traffic via create_gateway. It’s fully automated.

02

Network Audit and Compliance

Compliance requires proving that all development environments are isolated. The agent uses list_vpcs to map out every network segment, then calls get_vpc on each one to confirm the required security rules are in place.
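The audit loop described here (list every VPC, then fetch each one) can be sketched as below. The call_tool callable is a stand-in for whatever MCP client you use, and the response fields id and isolated are hypothetical, since this page does not document what list_vpcs or get_vpc actually return.

```python
from typing import Any, Callable

# Stand-in signature for an MCP client call: (tool name, arguments) -> result.
ToolCaller = Callable[[str, dict[str, Any]], Any]


def audit_vpcs(call_tool: ToolCaller) -> list[str]:
    """Return the IDs of VPCs whose details do not report isolation."""
    failing = []
    for summary in call_tool("list_vpcs", {}):
        details = call_tool("get_vpc", {"id": summary["id"]})
        # "isolated" is an assumed field name used only for illustration.
        if not details.get("isolated", False):
            failing.append(summary["id"])
    return failing


def fake_call_tool(name: str, arguments: dict[str, Any]) -> Any:
    """In-memory stand-in so the sketch runs without a server."""
    vpcs = {"vpc-a": {"isolated": True}, "vpc-b": {"isolated": False}}
    if name == "list_vpcs":
        return [{"id": vpc_id} for vpc_id in vpcs]
    if name == "get_vpc":
        return vpcs[arguments["id"]]
    raise ValueError(f"unexpected tool: {name}")


print(audit_vpcs(fake_call_tool))  # ['vpc-b']
```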

03

Post-Mortem Failure Analysis

A live AI service suddenly slows down. The ops engineer uses list_deployments to find the service, then immediately runs query_metrics and get_cluster details to diagnose if the bottleneck is compute or network related.

04

Environment Teardown

A project concludes, leaving behind numerous resources. The agent systematically uses delete_vpc, delete_gateway, and delete_cluster to ensure zero lingering costs and a clean cloud slate.

The honest tradeoffs

Manual Scripting for State Changes

Anti-pattern

Writing long shell scripts that chain together commands like 'create VPC, then create cluster, then update gateway.' If one command fails halfway through, the whole thing breaks and requires manual rollback.

The Fix

Use your agent to orchestrate the entire process. Instead of writing a script, tell it: 'Set up a new production environment for Model X.' The MCP handles the complex sequencing, ensuring that if create_gateway fails, it can attempt cleanup.
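To make the idea of sequencing with cleanup concrete, here is a minimal sketch of the pattern: run the create steps in order and, if one fails, undo the completed steps in reverse. It illustrates the general technique and is not a description of how the MCP server is implemented. The step order follows the sequence described earlier on this page; call_tool and the empty argument dictionaries are placeholders.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], Any]

# (create tool, matching delete tool), in the order described on this page.
STEPS = [
    ("create_vpc", "delete_vpc"),
    ("create_cluster", "delete_cluster"),
    ("create_gateway", "delete_gateway"),
    ("create_deployment", "delete_deployment"),
]


def provision(call_tool: ToolCaller) -> list[str]:
    """Run each create step; on failure, undo completed steps in reverse."""
    undo: list[str] = []
    try:
        for create, delete in STEPS:
            call_tool(create, {})  # real calls would pass a specification
            undo.append(delete)
    except Exception:
        for delete in reversed(undo):
            call_tool(delete, {})
        raise
    return [create for create, _ in STEPS]


# Demo with a fake caller that fails at the gateway step.
calls: list[str] = []


def flaky_call_tool(name: str, arguments: dict[str, Any]) -> None:
    calls.append(name)
    if name == "create_gateway":
        raise RuntimeError("simulated gateway failure")


try:
    provision(flaky_call_tool)
except RuntimeError:
    pass

print(calls)
```

The failed step itself is not undone, since it never completed; only the VPC and cluster created before it are removed.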

Ignoring Dependencies

Anti-pattern

Trying to update an inference deployment (update_deployment) before confirming its underlying cluster is stable or available. You'll get vague errors and waste time.

The Fix

First, run get_cluster on the target resource. Verify its status is 'Active.' Then, proceed with update_deployment. Always check the state first.
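The check-then-update habit can be expressed as a small guard. This is a sketch under assumptions: call_tool stands in for your MCP client, and the argument names (id, replicas) and the status field are hypothetical. Only the tool names and the 'Active' status value come from this page.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], Any]


def update_if_cluster_active(
    call_tool: ToolCaller,
    cluster_id: str,
    deployment_id: str,
    changes: dict[str, Any],
) -> Any:
    """Call update_deployment only when get_cluster reports 'Active'."""
    cluster = call_tool("get_cluster", {"id": cluster_id})
    status = cluster.get("status")
    if status != "Active":
        raise RuntimeError(
            f"cluster {cluster_id} is {status!r}, not 'Active'; skipping update"
        )
    return call_tool("update_deployment", {"id": deployment_id, **changes})


def fake_call_tool(name: str, arguments: dict[str, Any]) -> Any:
    """In-memory stand-in so the sketch runs without a server."""
    clusters = {"cl-ready": {"status": "Active"}, "cl-busy": {"status": "Updating"}}
    if name == "get_cluster":
        return clusters[arguments["id"]]
    if name == "update_deployment":
        return {"updated": arguments}
    raise ValueError(f"unexpected tool: {name}")


result = update_if_cluster_active(fake_call_tool, "cl-ready", "dep-1", {"replicas": 3})
print(result)
```

Failing early with the observed status gives a clearer message than whatever error the update itself would have produced.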

Over-provisioning Resources

Anti-pattern

Creating a VPC and cluster just to test a single small model, leading to unnecessary cost and complexity.

The Fix

Always start small. Use create_capacity_claim for initial testing instead of immediately creating full clusters. Scale up only after verifying stability.

When It Fits, When It Doesn't

Use this MCP if your job requires managing the entire lifecycle of complex, interdependent AI infrastructure: VPCs must talk to Clusters which host Deployments accessed by Gateways. Don't use it if you just need a simple API call or want to manage a single resource in isolation (like checking one log entry). For simple tasks like fetching logs, use query_logs directly; don't initiate a full cluster setup process. If your workflow involves creating and managing multiple interconnected components, this MCP handles the sequencing for you.

Questions you might have

How do I list my active clusters using the list_clusters tool?

Use the agent to call list_clusters. This immediately returns a comprehensive catalog of all managed Kubernetes clusters, letting you know exactly what compute power you have available.

What is the difference between create_vpc and update_vpc?

Use create_vpc when you need a new network boundary. Use update_vpc only if you need to change an existing VPC's parameters, like widening its IP range.
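Before asking the agent to widen a range, it can help to confirm locally that the new CIDR block actually contains the old one. The sketch below uses only Python's standard ipaddress module; the example ranges are made up, and how update_vpc accepts a CIDR is not documented on this page.

```python
import ipaddress


def is_widening(old_cidr: str, new_cidr: str) -> bool:
    """True if new_cidr strictly contains old_cidr."""
    old = ipaddress.ip_network(old_cidr)
    new = ipaddress.ip_network(new_cidr)
    return old != new and old.subnet_of(new)


current = ipaddress.ip_network("10.0.0.0/24")
widened = current.supernet(new_prefix=16)

print(widened)                                       # 10.0.0.0/16
print(current.num_addresses, widened.num_addresses)  # 256 65536
print(is_widening("10.0.0.0/24", "10.0.0.0/16"))     # True
print(is_widening("10.0.0.0/24", "10.1.0.0/16"))     # False
```

A range that moves rather than widens (the last case) would strand existing addresses, which is the situation this check is meant to catch.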

Should I use list_gateways or get_gateway first?

You should always run list_gateways first. This gives you a high-level view of all your entry points; then, if you need details on one, you can ask the agent to fetch it using the specific ID.

Does create_deployment handle scaling?

The create_deployment tool sets up the endpoint. If you need to scale it later due to increased traffic, use update_deployment to adjust its capacity instead.

When should I use `create_vpc` versus just relying on my existing network setup?

You create a VPC when you need strict, isolated networking for specific compute resources. This ensures your GPU nodes and services are separated from other traffic by a defined CIDR block. It’s the first step if security isolation is your top priority.

I'm debugging performance issues; what kind of data do I get when running `query_metrics`?

The query returns detailed Prometheus metrics, allowing you to monitor real-time resource utilization and latency. You can check specific endpoints or aggregate usage across your cluster fleet to identify bottlenecks.

If a project finishes, what’s the proper sequence for cleanup using `delete_cluster`?

Delete resources in the reverse of the order in which they were created: first gateways and deployments, then the cluster, and finally the VPC that contained it. Working through dependents before the things they depend on prevents orphaned network rules or billing issues.

Do I need to run `list_capacity_claims` before attempting to update a deployment?

It's smart practice to list claims first. This lets you verify your current resource reservation status and ensure that the updated deployment still falls within an available, budgeted capacity claim.

Can I list all my active Kubernetes clusters across the CoreWeave infrastructure?

Yes. By using the list_clusters tool, your agent will retrieve a complete list of all bare-metal Kubernetes clusters (CKS) managed under your account.

How do I check the specific network configuration of a VPC?

You can use the get_vpc tool by providing the specific VPC ID. The agent will return detailed information about network isolation and configuration for that resource.

Is it possible to create a new inference gateway via the AI agent?

Absolutely. Use the create_gateway tool with the required specification JSON. This allows you to set up routing and authentication for your AI model traffic programmatically.
