CoreWeave MCP. Provision GPU Clusters & AI Networks

Works with every AI agent you already use

…and any MCP-compatible client

Just plug in your AI agents and start using Vinkius.

CoreWeave (AI GPU Cloud) MCP Server manages high-performance GPU infrastructure. Provision GPU clusters (CKS), configure Virtual Private Clouds (VPCs), and orchestrate inference gateways via your AI agent.

It lets you run full ML lifecycles—from cluster setup to deployment monitoring—without leaving your coding environment. Use it to scale demanding AI workloads reliably.

What your AI agents can do

Create capacity claim

Reserves a new set of necessary inference compute resources.

Create cluster

Creates a new bare-metal Kubernetes Service (CKS) cluster.

Create deployment

Creates a new Inference Deployment for an AI model.

+ 21 more capabilities included

Create Compute Clusters

The create_cluster tool builds and provisions a new CKS Kubernetes cluster.

Manage Network Isolation

The create_vpc tool establishes a new Virtual Private Cloud for secure, isolated compute networking.

Route AI Traffic

The create_gateway tool sets up a new Inference Gateway to manage and authenticate incoming traffic to your AI models.

Provision Capacity

The create_capacity_claim tool reserves necessary compute resources for future inference needs.

Automate Resource Cleanup

Tools like delete_cluster and delete_vpc allow you to reliably remove and decommission infrastructure components.

Query Operational Status

You can use list_clusters or list_vpcs to quickly check the current status and details of all deployed resources.

Supported MCP Clients

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Update deployment

Changes the settings for an existing Inference Deployment.

Update gateway

Modifies the rules or settings of an existing Inference Gateway.

Update VPC

Updates the parameters of a specified VPC.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with CoreWeave (AI GPU Cloud), then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 4,700+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

What you can do with this MCP connector

CoreWeave GPU Cloud MCP Server - Provisioning GPU Clusters

This server lets your AI client manage the whole shebang of high-performance GPU infrastructure. You can provision GPU clusters, set up Virtual Private Clouds, and orchestrate inference gateways, all without ever leaving your code editor. It lets you run a full ML lifecycle, from spinning up clusters to monitoring deployment status, and you can scale demanding AI workloads reliably.

Creating and Configuring Core Infrastructure

Use create_cluster to build a new bare-metal Kubernetes Service (CKS) cluster. Use list_clusters to see all CKS clusters on your account, and get_cluster to pull specific details on a cluster. You can modify a cluster's settings with update_cluster, or decommission an old one using delete_cluster. Similarly, use create_vpc to build a new Virtual Private Cloud for secure, isolated compute networking.

Check out your existing VPCs with list_vpcs or get_vpc, and you can modify a VPC's parameters using update_vpc, or wipe it out completely with delete_vpc.

Managing AI Services and Traffic

To get your AI models running, use create_deployment to set up a new Inference Deployment. You can check what's running with list_deployments, modify an existing deployment with update_deployment, or remove it entirely using delete_deployment. To manage incoming traffic, use create_gateway to set up a new Inference Gateway for routing and authenticating model traffic.

You can list all active gateways with list_gateways, and use update_gateway or delete_gateway to manage them.

Reserving and Monitoring Resources

Don't wait until you run out of juice. Use create_capacity_claim to reserve the necessary compute resources for future inference needs. You can check your current reservations with list_capacity_claims, modify a claim using update_capacity_claim, or remove it with delete_capacity_claim.

To see how your services are actually performing, run query_metrics against the Prometheus system, or pull operational logs from the Loki system using query_logs.

Listing Everything You Own

Need a quick overview? You can list all available resources: use list_clusters for CKS clusters, list_vpcs for networking, list_deployments for models, list_gateways for traffic, and list_capacity_claims for reserved compute.

How CoreWeave MCP Works

1. First, connect your CoreWeave API Token to the server. This authenticates your account and gives your agent access to your cloud resources.
2. Next, issue a command like 'Create a VPC with CIDR 10.0.0.0/16' or 'List all CKS clusters'. The agent runs the corresponding tool call.
3. Finally, the server returns the current state (e.g., VPC ID, cluster list, or metrics). Your agent processes this data, allowing you to continue the workflow or confirm success.

The bottom line is, you manage complex, multi-step infrastructure changes by just talking to your AI agent.
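The steps above can be sketched at the protocol level. MCP clients invoke a tool by sending a JSON-RPC 2.0 `tools/call` request, and the snippet below builds one for `create_vpc`. The helper `build_tool_call` and the argument names (`name`, `cidr`) are assumptions for illustration only, since this page does not document the tool's input schema.

```python
import json
from typing import Any


def build_tool_call(request_id: int, tool: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical arguments: the real create_vpc input schema may differ.
request = build_tool_call(1, "create_vpc", {"name": "ml-prod", "cidr": "10.0.0.0/16"})
print(json.dumps(request, indent=2))
```

In practice your MCP client builds and sends this message for you; the sketch only shows what "the agent runs the corresponding tool call" means on the wire.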

Who Is CoreWeave MCP For?

This is for the ML Engineer who needs to scale a prototype to production without leaving their IDE. It's for the DevOps team that needs to automate VPC setups and gateway routing for mission-critical AI services. If your job involves managing complex, interconnected, GPU-backed cloud resources, you need this.

ML Engineer

Provisions and scales GPU clusters using create_cluster and tests model performance by inspecting deployments with list_deployments.

DevOps Engineer

Automates the setup of secure networking by calling create_vpc and ensuring proper traffic flow using create_gateway.

AI Researcher

Quickly checks cluster health and deployment status using list_clusters and query_metrics during model training cycles.

What Changes When You Connect

Automate full resource provisioning. Need a new compute environment? Use create_cluster to build a CKS cluster, then create_vpc to secure its network, all in one sequence. You don't have to switch between console tabs.
Maintain a clean state. When a project ends, you don't want leftover resources. Run delete_cluster or delete_vpc to cleanly decommission the entire stack, preventing resource sprawl and unexpected billing.
Monitor performance on the fly. After deploying a model, use list_deployments and query_metrics to check the real-time health and throughput of your service. You get metrics, not just status messages.
Secure traffic routing. Use create_gateway to put an Inference Gateway in front of your model. This ensures all traffic is properly authenticated before it hits your compute resources.
Manage complex updates. Don't just recreate resources. Use update_cluster or update_vpc to make targeted changes to existing infrastructure, saving time and reducing downtime.
Handle resource spikes. If you expect a sudden load increase, use create_capacity_claim to reserve the necessary compute capacity ahead of time, preventing runtime failure.

Real-World Use Cases

Scaling a Model from Dev to Prod

A researcher finishes training a model on a small cluster. They tell their agent: 'I need a production setup for this model.' The agent first runs create_vpc for isolation, then create_cluster for the compute, create_deployment for the model itself, and finally create_gateway to route the live traffic to the deployment. The model goes live, and the researcher checks status with list_deployments and list_gateways.

Debugging Network Connectivity

The service is failing intermittently. The ops engineer asks the agent to check the network. The agent runs get_vpc and query_logs to pull the CIDR block and check the last 50 logs. The engineer sees a subnet issue and runs update_vpc to fix the configuration, solving the outage.

Cleaning Up Staging Environments

The team finished testing the staging environment. Instead of manually deleting everything, the engineer runs delete_vpc and delete_cluster. This ensures all resources are properly deprovisioned, stopping billing and freeing up IP space.

Checking for Resource Sprawl

The team suspects they've left old clusters running. The engineer runs list_clusters and list_capacity_claims. This immediately shows which resources are active and which need to be shut down or retired, preventing unnecessary costs.
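As a rough sketch of that triage, the helper below groups a cluster list by status. It assumes `list_clusters` returns objects with `id` and `status` fields (the page only says the response includes a cluster ID and a status); the field names and the status strings in the sample are hypothetical.

```python
from collections import defaultdict


def group_by_status(clusters: list[dict]) -> dict[str, list[str]]:
    """Map each status string to the IDs of the clusters reporting it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for cluster in clusters:
        groups[cluster["status"]].append(cluster["id"])
    return dict(groups)


# Hypothetical list_clusters output; real field names and status values may differ.
sample = [
    {"id": "cks-a", "status": "running"},
    {"id": "cks-b", "status": "stopped"},
    {"id": "cks-c", "status": "running"},
]
print(group_by_status(sample))
```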

The Tradeoffs

Manual State Checking

Trying to remember if the VPC was updated before the cluster was created. You end up running get_vpc 5 times and list_clusters 3 times, copying IDs and pasting them into a spreadsheet. It's slow and prone to copy/paste errors.

→ Define the entire stack in one prompt. For example: 'Create a VPC, then create a CKS cluster inside it, and attach a gateway.' The agent handles the necessary dependency calls (create_vpc -> create_cluster -> create_gateway) automatically.

Ignoring Resource Cleanup

Finishing a test run and just closing the terminal. The resources created by create_cluster and create_vpc keep running, incurring costs until someone finds and deletes them manually.

→ Always follow up provisioning with cleanup. Use delete_cluster and delete_vpc to ensure the entire stack is properly terminated. Check list_capacity_claims first to confirm nothing is reserved.

Over-relying on Raw Creation

Just running create_cluster and create_deployment separately. If the cluster needs an update (e.g., new GPU drivers), you have to manually remember to run update_cluster later. The state gets messy.

→ Use the update tools (update_cluster, update_deployment) when making changes. If you're starting fresh, always define the full desired state upfront, minimizing the need for raw create_ calls.

When It Fits, When It Doesn't

Use this server if your workflow requires provisioning and managing a full, interconnected stack: VPCs, CKS clusters, and inference gateways. You need to know the resource lifecycle: how to build it, how to update it, and how to tear it down. If you only need to read basic metrics, query_metrics alone covers it, and if you only need an inventory of running resources, the list_ tools (list_clusters, list_vpcs, list_deployments, list_gateways, list_capacity_claims) are enough. However, if you need to change the resources (create, delete, update), this is the tool. It handles the dependencies, which is the hard part.

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by CoreWeave. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on every call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction


Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This server provides 24 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.

Available Capabilities

create_capacity_claim, create_cluster, create_deployment, create_gateway, create_vpc, delete_capacity_claim, delete_cluster, delete_deployment, delete_gateway, delete_vpc, get_cluster, get_vpc, list_capacity_claims, list_clusters, list_deployments, list_gateways, list_vpcs, query_logs, query_metrics, update_capacity_claim, update_cluster, update_deployment, update_gateway, update_vpc

Setting up a new ML environment shouldn't feel like a civil engineering project.

Today, setting up a new compute environment means jumping through hoops. You open the cloud console to create a VPC, note down the CIDR block, then switch to the Kubernetes dashboard to create the cluster, and finally you have to manually configure the network peering and the ingress gateway. It’s a dozen clicks, multiple tabs, and a lot of copy-pasting IDs just to get a single service running.

With this MCP server, you just tell your agent the intent: 'I need a secure, GPU-backed environment for my new model.' The agent runs `create_vpc`, then `create_cluster`, and sets up the networking in the background. You get a fully provisioned, isolated stack, and you're done.

CoreWeave (AI GPU Cloud) MCP Server: Full Control Over Your Stack

Manual resource management involves separate commands for every component. You have to remember to call `create_gateway` after the VPC is ready, and `create_deployment` after the cluster is provisioned. Missing one step breaks the whole thing.

This server handles the sequence. You tell it the final state, and it manages the dependency calls, ensuring the VPC exists before it tries to create the cluster, and that the cluster is ready before it sets up the gateway. It's reliable orchestration.
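The ordering described here is a dependency graph, and one way to picture it is a topological sort. The sketch below uses Python's standard `graphlib` module; the edges mirror the sequence stated above and are an illustration, not the server's actual internal model.

```python
from graphlib import TopologicalSorter

# Each tool maps to the tools that must finish before it can run.
# These edges follow the ordering described in the text; they are
# illustrative, not the server's internal dependency model.
DEPENDS_ON = {
    "create_vpc": set(),
    "create_cluster": {"create_vpc"},
    "create_deployment": {"create_cluster"},
    "create_gateway": {"create_cluster"},
}


def plan(dependencies: dict[str, set[str]]) -> list[str]:
    """Return one valid execution order, prerequisites first."""
    return list(TopologicalSorter(dependencies).static_order())


order = plan(DEPENDS_ON)
print(order)
```

`create_vpc` always comes first and `create_cluster` second; the relative order of `create_deployment` and `create_gateway` is not fixed by these edges. A circular dependency would raise `graphlib.CycleError` instead of producing a plan.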

Common Questions About CoreWeave MCP

How do I check the status of my CKS clusters using `list_clusters`?

The list_clusters tool retrieves a list of all CKS clusters on your account. The response provides the cluster ID and current status, letting you quickly see which clusters are active and which might be stopped.

Can I use `create_vpc` to isolate my AI compute resources?

Yes. create_vpc builds a Virtual Private Cloud, which ensures your compute resources are isolated with a dedicated CIDR block. This is critical for maintaining secure, private networking for your models.

What is the difference between `list_deployments` and `list_gateways`?

list_deployments shows the deployed models and services, while list_gateways shows the ingress points. Gateways route traffic to the deployments, so you need both to see the full service path.

How do I check performance metrics using `query_metrics`?

The query_metrics tool connects to Prometheus and fetches performance data. You can use this to monitor resource utilization, latency, and throughput for your active AI services.
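Because the tool fronts Prometheus, the query itself is presumably a PromQL expression. The sketch below builds a standard 95th-percentile latency query from a histogram metric. The metric name `request_duration_seconds` and the `query` argument key are hypothetical; check which metrics your deployment exports and what input schema the tool accepts.

```python
def p95_latency_query(histogram: str, window: str = "5m") -> str:
    """Build a PromQL expression for 95th-percentile latency from a histogram metric."""
    return (
        "histogram_quantile(0.95, "
        f"sum(rate({histogram}_bucket[{window}])) by (le))"
    )


# Hypothetical metric name and argument key; adjust to your environment.
arguments = {"query": p95_latency_query("request_duration_seconds")}
print(arguments["query"])
```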

Do I need to run `update_vpc` if I change the IP range?

Yes. update_vpc is the tool you run to modify a VPC's parameters, including its CIDR block or subnet rules. Running it without the proper update mask will fail.
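This page does not define the update mask format. A common convention in cloud APIs is to send the list of changed field names alongside the new values, and the sketch below assumes that convention. The payload keys (`update_mask`, `vpc`) and the VPC field names are hypothetical.

```python
def build_update(current: dict, desired: dict) -> dict:
    """Return only the changed fields plus a mask naming them."""
    changed = {key: value for key, value in desired.items() if current.get(key) != value}
    return {"update_mask": sorted(changed), "vpc": changed}


# Hypothetical VPC fields; the real update_vpc schema may use other names.
current = {"name": "ml-prod", "cidr": "10.0.0.0/16"}
desired = {"name": "ml-prod", "cidr": "10.1.0.0/16"}
payload = build_update(current, desired)
print(payload)
```

Deriving the mask from a diff, rather than writing it by hand, keeps the mask and the changed values from drifting apart.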

How do I ensure my GPU resources are secure using `create_vpc`?

A VPC ensures your compute resources operate in a private, isolated network. You specify the CIDR block and associated subnet ranges when calling create_vpc, keeping your AI environment separated from public traffic.
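Before handing a CIDR block and subnet ranges to the agent, it can help to sanity-check them locally. The sketch below uses Python's standard `ipaddress` module to confirm that each subnet sits inside the VPC range and that no two subnets overlap. The helper name `check_subnets` and the example ranges are illustrative.

```python
import ipaddress
from itertools import combinations


def check_subnets(vpc_cidr: str, subnet_cidrs: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means the plan is consistent."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    problems = [f"{s} is outside {vpc}" for s in subnets if not s.subnet_of(vpc)]
    problems += [f"{a} overlaps {b}" for a, b in combinations(subnets, 2) if a.overlaps(b)]
    return problems


print(check_subnets("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24"]))
print(check_subnets("10.0.0.0/16", ["10.0.1.0/24", "10.0.1.128/25", "192.168.0.0/24"]))
```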

What steps are needed to manage my infrastructure lifecycle using `delete_cluster`?

Deleting a cluster requires calling delete_cluster with the cluster ID. Before deletion, always check related resources like VPCs or deployments to prevent dependency errors.
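A minimal sketch of that pre-deletion check, assuming each entry returned by `list_deployments` carries an `id` and a `cluster_id` field. Both field names are hypothetical; the page does not document the response shape.

```python
def blocking_deployments(cluster_id: str, deployments: list[dict]) -> list[str]:
    """Return IDs of deployments that still reference the cluster."""
    return [d["id"] for d in deployments if d.get("cluster_id") == cluster_id]


# Hypothetical list_deployments output; the `cluster_id` field is an assumption.
deployments = [
    {"id": "dep-1", "cluster_id": "cks-a"},
    {"id": "dep-2", "cluster_id": "cks-b"},
]
blockers = blocking_deployments("cks-a", deployments)
if blockers:
    print(f"Delete these first: {blockers}")
else:
    print("Safe to call delete_cluster")
```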

How can I modify an existing service using `update_deployment`?

You update a deployment by calling update_deployment and providing the specific configuration changes. This allows you to change model versions or scaling parameters without recreating the service.

Can I list all my active Kubernetes clusters across the CoreWeave infrastructure?

Yes. By using the list_clusters tool, your agent will retrieve a complete list of all bare-metal Kubernetes clusters (CKS) managed under your account.

How do I check the specific network configuration of a VPC?

You can use the get_vpc tool by providing the specific VPC ID. The agent will return detailed information about network isolation and configuration for that resource.

Is it possible to create a new inference gateway via the AI agent?

Absolutely. Use the create_gateway tool with the required specification JSON. This allows you to set up routing and authentication for your AI model traffic programmatically.

Use it with your favorite AI tools

Connect this server to Cursor, Claude, VS Code, and more.

OpenAI Agents SDK (Python)

Google ADK (Python)

Pydantic AI (Python)

Vercel AI SDK (TypeScript)