CoreWeave MCP. Provision GPU Clusters & AI Networks
Works with every AI agent you already use and any MCP-compatible client.
Just plug in your AI agents and start using Vinkius.
CoreWeave (AI GPU Cloud) MCP Server manages high-performance GPU infrastructure. Provision GPU clusters (CKS), configure Virtual Private Clouds (VPCs), and orchestrate inference gateways via your AI agent.
It lets you run full ML lifecycles—from cluster setup to deployment monitoring—without leaving your coding environment. Use it to scale demanding AI workloads reliably.
What your AI agents can do
The create_cluster tool builds and provisions a new CKS Kubernetes cluster.
The create_vpc tool establishes a new Virtual Private Cloud for secure, isolated compute networking.
The create_gateway tool sets up a new Inference Gateway to manage and authenticate incoming traffic to your AI models.
The create_capacity_claim tool reserves necessary compute resources for future inference needs.
Tools like delete_cluster and delete_vpc allow you to reliably remove and decommission infrastructure components.
You can use list_clusters or list_vpcs to quickly check the current status and details of all deployed resources.
CoreWeave (AI GPU Cloud) MCP Server: 24 Tools for Cloud Infrastructure
These 24 tools let your AI agent handle every part of your AI infrastructure lifecycle, from creating VPCs to querying performance metrics.
create capacity claim
Reserves a new set of necessary inference compute resources.
create cluster
Creates a new bare-metal Kubernetes Service (CKS) cluster.
create deployment
Creates a new Inference Deployment for an AI model.
create gateway
Sets up a new Inference Gateway to route and authenticate model traffic.
create vpc
Builds a new Virtual Private Cloud for network isolation.
delete capacity claim
Removes a reserved set of inference compute resources.
delete cluster
Decommissions an existing CKS cluster.
delete deployment
Removes an active Inference Deployment.
delete gateway
Decommissions an existing Inference Gateway.
delete vpc
Deletes a Virtual Private Cloud.
get cluster
Retrieves specific details for a given CKS cluster.
get vpc
Fetches the current details of a specific VPC.
list capacity claims
Shows a list of all reserved inference capacity claims.
list clusters
Lists all CoreWeave CKS clusters on your account.
list deployments
Lists all currently configured Inference Deployments.
list gateways
Lists all active Inference Gateways.
list vpcs
Shows a list of all Virtual Private Clouds (VPCs).
query logs
Queries the Loki log system for operational logs.
query metrics
Queries the Prometheus system for performance metrics.
update capacity claim
Modifies the parameters of an existing inference capacity claim.
update cluster
Updates the configuration settings of a CKS cluster.
update deployment
Changes the settings for an existing Inference Deployment.
update gateway
Modifies the rules or settings of an existing Inference Gateway.
update vpc
Updates the parameters of a specified VPC.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on every call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with CoreWeave (AI GPU Cloud), then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 4,700+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
What you can do with this MCP connector
CoreWeave GPU Cloud MCP Server - Provisioning GPU Clusters
This server lets your AI client manage high-performance GPU infrastructure end to end. You can provision GPU clusters, set up Virtual Private Clouds, and orchestrate inference gateways without leaving your code editor. It covers the full ML lifecycle, from spinning up clusters to monitoring deployment status, so you can scale demanding AI workloads reliably.
Creating and Configuring Core Infrastructure
Use create_cluster to build a new bare-metal Kubernetes Service (CKS) cluster. Use list_clusters to see all CKS clusters on your account, and get_cluster to pull specific details on a cluster. You can modify a cluster's settings with update_cluster, or decommission an old one using delete_cluster. Similarly, use create_vpc to build a new Virtual Private Cloud for secure, isolated compute networking.
Check out your existing VPCs with list_vpcs or get_vpc, and you can modify a VPC's parameters using update_vpc, or wipe it out completely with delete_vpc.
Managing AI Services and Traffic
To get your AI models running, use create_deployment to set up a new Inference Deployment. You can check what's running with list_deployments. You can modify an existing deployment with update_deployment or remove it entirely using delete_deployment. To manage incoming traffic, use create_gateway to set up a new Inference Gateway for routing and authenticating model traffic.
You can list all active gateways with list_gateways, and use update_gateway or delete_gateway to manage them.
Reserving and Monitoring Resources
Don't wait until you run out of capacity. Use create_capacity_claim to reserve the necessary compute resources for future inference needs. You can check your current reservations with list_capacity_claims, modify a claim using update_capacity_claim, or remove it with delete_capacity_claim.
To see how the reserved capacity is actually performing, check performance using query_metrics on the Prometheus system, or pull operational logs from the Loki system using query_logs.
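As a rough illustration of what an agent might pass to query_metrics and query_logs, the sketch below builds a PromQL expression and a LogQL expression as plain strings. The metric name `gpu_utilization`, the `namespace` label, and the helper names are assumptions made for this example; the metrics and labels actually available depend on your account.

```python
def _selector(labels: dict) -> str:
    """Render a label matcher set such as namespace="inference"."""
    return ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))


def promql_avg_over(metric: str, window: str, labels: dict) -> str:
    """PromQL: average of a metric over a time window (avg_over_time)."""
    return f"avg_over_time({metric}{{{_selector(labels)}}}[{window}])"


def logql_contains(labels: dict, needle: str) -> str:
    """LogQL: log lines from the selected streams that contain `needle`."""
    return f'{{{_selector(labels)}}} |= "{needle}"'


# Hypothetical metric and label names, for illustration only.
metrics_query = promql_avg_over("gpu_utilization", "5m", {"namespace": "inference"})
logs_query = logql_contains({"namespace": "inference"}, "error")
print(metrics_query)
print(logs_query)
```

The agent would then hand these strings to query_metrics and query_logs; the name of the argument that carries the query is not documented here.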
Listing Everything You Own
Need a quick overview? You can list all available resources: use list_clusters for CKS clusters, list_vpcs for networking, list_deployments for models, list_gateways for traffic, and list_capacity_claims for reserved compute.
How CoreWeave MCP Works
1. First, connect your CoreWeave API Token to the server. This authenticates your account and gives your agent access to your cloud resources.
2. Next, issue a command like 'Create a VPC with CIDR 10.0.0.0/16' or 'List all CKS clusters'. The agent runs the corresponding tool call.
3. Finally, the server returns the current state (e.g., VPC ID, cluster list, or metrics). Your agent processes this data, allowing you to continue the workflow or confirm success.
The bottom line is, you manage complex, multi-step infrastructure changes by just talking to your AI agent.
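Under the hood, an MCP client sends tool invocations as JSON-RPC 2.0 requests with the method `tools/call`. The sketch below shows roughly what the request for the VPC example could look like. The argument names `name` and `cidr` are assumptions; the real create_vpc input schema may differ, and your MCP client normally builds this message for you.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical argument names; check the tool's published schema.
request = tool_call("create_vpc", {"name": "demo-vpc", "cidr": "10.0.0.0/16"})
print(json.dumps(request, indent=2))
```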
Who Is CoreWeave MCP For?
This is for the ML Engineer who needs to scale a prototype to production without leaving their IDE. It's for the DevOps team that needs to automate VPC setups and gateway routing for mission-critical AI services. If your job involves managing complex, interconnected, GPU-backed cloud resources, you need this.
Provisions and scales GPU clusters using create_cluster and tests model performance by inspecting deployments with list_deployments.
Automates the setup of secure networking by calling create_vpc and ensuring proper traffic flow using create_gateway.
Quickly checks cluster health and deployment status using list_clusters and query_metrics during model training cycles.
What Changes When You Connect
- Automate full resource provisioning. Need a new compute environment? Use create_cluster to build a CKS cluster, then create_vpc to secure its network, all in one sequence. You don't have to switch between console tabs.
- Maintain a clean state. When a project ends, you don't want leftover resources. Run delete_cluster or delete_vpc to cleanly decommission the entire stack, preventing resource sprawl and unexpected billing.
- Monitor performance on the fly. After deploying a model, use list_deployments and query_metrics to check the real-time health and throughput of your service. You get metrics, not just status messages.
- Secure traffic routing. Use create_gateway to put an Inference Gateway in front of your model. This ensures all traffic is properly authenticated before it hits your compute resources.
- Manage complex updates. Don't just recreate resources. Use update_cluster or update_vpc to make targeted changes to existing infrastructure, saving time and reducing downtime.
- Handle resource spikes. If you expect a sudden load increase, use create_capacity_claim to reserve the necessary compute capacity ahead of time, preventing runtime failure.
Real-World Use Cases
Scaling a Model from Dev to Prod
A researcher finishes training a model on a small cluster. They tell their agent: 'I need a production setup for this model.' The agent first runs create_vpc for isolation, then create_cluster for the compute, then create_deployment for the model itself, and finally create_gateway to route the live traffic to the deployment. The model goes live, and the researcher checks status with list_deployments and list_gateways.
Debugging Network Connectivity
The service is failing intermittently. The ops engineer asks the agent to check the network. The agent runs get_vpc and query_logs to pull the CIDR block and check the last 50 logs. The engineer sees a subnet issue and runs update_vpc to fix the configuration, solving the outage.
Cleaning Up Staging Environments
The team finished testing the staging environment. Instead of manually deleting everything, the engineer runs delete_vpc and delete_cluster. This ensures all resources are properly deprovisioned, stopping billing and freeing up IP space.
Checking for Resource Sprawl
The team suspects they've left old clusters running. The engineer runs list_clusters and list_capacity_claims. This immediately shows which resources are active and which need to be shut down or retired, preventing unnecessary costs.
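One way to turn that check into a repeatable step is to compare the returned cluster list against the clusters the team knows it still needs. The sketch below assumes a list_clusters result shaped as a list of records with `id` and `status` fields. That shape, the IDs, and the status values are all hypothetical.

```python
def clusters_to_review(clusters: list, keep_ids: set) -> list:
    """Return the clusters that nobody has claimed as still needed."""
    return [cluster for cluster in clusters if cluster["id"] not in keep_ids]


# Hypothetical list_clusters result and IDs, for illustration only.
listed = [
    {"id": "cluster-prod", "status": "running"},
    {"id": "cluster-exp-1", "status": "running"},
    {"id": "cluster-exp-2", "status": "stopped"},
]
review = clusters_to_review(listed, keep_ids={"cluster-prod"})
for cluster in review:
    print(cluster["id"], cluster["status"])
```

Anything left in `review` is a candidate for delete_cluster once someone confirms it is no longer in use.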
The Tradeoffs
Manual State Checking
Trying to remember if the VPC was updated before the cluster was created. You end up running get_vpc 5 times and list_clusters 3 times, copying IDs and pasting them into a spreadsheet. It's slow and prone to copy/paste errors.
→
Define the entire stack in one prompt. For example: 'Create a VPC, then create a CKS cluster inside it, and attach a gateway.' The agent handles the necessary dependency calls (create_vpc -> create_cluster -> create_gateway) automatically.
Ignoring Resource Cleanup
Finishing a test run and just closing the terminal. The cluster and VPC created by create_cluster and create_vpc keep running, incurring costs until someone finds and deletes them manually.
→
Always follow up provisioning with cleanup. Use delete_cluster and delete_vpc to ensure the entire stack is properly terminated. Check list_capacity_claims first to confirm nothing is reserved.
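A simple way to make cleanup hard to forget is to record each create call as it happens and derive the matching delete calls in reverse order, so dependents go before the things they depend on. This is a minimal sketch of that bookkeeping, not something the server does for you; the resource IDs are made up.

```python
# Maps each create tool to the delete tool that undoes it.
CREATE_TO_DELETE = {
    "create_vpc": "delete_vpc",
    "create_cluster": "delete_cluster",
    "create_deployment": "delete_deployment",
    "create_gateway": "delete_gateway",
    "create_capacity_claim": "delete_capacity_claim",
}


def teardown_plan(created: list) -> list:
    """Given (create_tool, resource_id) pairs in creation order,
    return (delete_tool, resource_id) pairs, newest resource first."""
    return [(CREATE_TO_DELETE[tool], rid) for tool, rid in reversed(created)]


# Hypothetical resource IDs, in the order they were created.
created = [("create_vpc", "vpc-123"), ("create_cluster", "cluster-456")]
plan = teardown_plan(created)
print(plan)
```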
Over-relying on Raw Creation
Just running create_cluster and create_deployment separately. If the cluster needs an update (e.g., new GPU drivers), you have to manually remember to run update_cluster later. The state gets messy.
→
Use the update tools (update_cluster, update_deployment) when making changes. If you're starting fresh, always define the full desired state upfront, minimizing the need for raw create_ calls.
When It Fits, When It Doesn't
Use this server if your workflow requires provisioning and managing a full, interconnected stack: VPCs, CKS clusters, and inference gateways. You need to know the resource lifecycle: how to build it, how to update it, and how to tear it down. If you only need to read basic metrics, query_metrics alone is enough, and if your goal is just to see what is running, the list tools (list_clusters, list_vpcs, list_deployments, list_gateways) cover it. The server matters most when you need to change resources (create, update, delete). It handles the dependencies, which is the hard part.
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by CoreWeave. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on every call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 24 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Available Capabilities
Setting up a new ML environment shouldn't feel like a civil engineering project.
Today, setting up a new compute environment means jumping through hoops. You open the cloud console to create a VPC, download the CIDR block, then switch to the Kubernetes dashboard to create the cluster, and finally, you have to manually configure the network peering and the ingress gateway. It’s a dozen clicks, multiple tabs, and a lot of copy-pasting IDs just to get a single service running.
With this MCP server, you just tell your agent the intent: 'I need a secure, GPU-backed environment for my new model.' The agent runs `create_vpc`, then `create_cluster`, and sets up the networking in the background. You get a fully provisioned, isolated stack, and you're done.
CoreWeave (AI GPU Cloud) MCP Server: Full Control Over Your Stack
Manual resource management involves separate commands for every component. You have to remember to call `create_gateway` after the VPC is ready, and `create_deployment` after the cluster is provisioned. Missing one step breaks the whole thing.
This server handles the sequence. You tell it the final state, and it manages the dependency calls, ensuring the VPC exists before it tries to create the cluster, and that the cluster is ready before it sets up the gateway. It's reliable orchestration.
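The ordering described above is a dependency graph, and a topological sort is one standard way to derive a valid call sequence from it. The sketch below uses Python's standard `graphlib` module. The edges simply restate the ordering from this page (VPC before cluster, cluster before deployment and gateway); they are an assumption for illustration, not something read from the server.

```python
from graphlib import TopologicalSorter

# Each tool maps to the set of tools that must finish before it runs.
DEPENDS_ON = {
    "create_vpc": set(),
    "create_cluster": {"create_vpc"},
    "create_deployment": {"create_cluster"},
    "create_gateway": {"create_vpc", "create_cluster"},
}


def provisioning_order(depends_on: dict) -> list:
    """Return the tools in an order that satisfies every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


order = provisioning_order(DEPENDS_ON)
print(order)
```

The relative order of create_deployment and create_gateway is not fixed by these edges, so either may come first; a cycle in the graph raises `graphlib.CycleError`.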
Common Questions About CoreWeave MCP
How do I check the status of my CKS clusters using `list_clusters`?
The list_clusters tool retrieves a list of all CKS clusters on your account. The response provides the cluster ID and current status, letting you quickly see which clusters are active and which might be stopped.
Can I use `create_vpc` to isolate my AI compute resources?
Yes. create_vpc builds a Virtual Private Cloud, which ensures your compute resources are isolated with a dedicated CIDR block. This is critical for maintaining secure, private networking for your models.
What is the difference between `list_deployments` and `list_gateways`?
list_deployments shows the deployed models and services, while list_gateways shows the ingress points. Gateways route traffic to the deployments, so you need both to see the full service path.
How do I check performance metrics using `query_metrics`?
The query_metrics tool connects to Prometheus and fetches performance data. You can use this to monitor resource utilization, latency, and throughput for your active AI services.
Do I need to run `update_vpc` if I change the IP range?
Yes. update_vpc is the tool you run to modify a VPC's parameters, including its CIDR block or subnet rules. Running it without the proper update mask will fail.
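An update mask names the fields an update is allowed to change. One way to avoid sending the wrong mask is to compute it from the difference between the current and desired settings. The sketch below does that for flat dictionaries. The field names, the `id`, `vpc`, and `update_mask` argument names, and the comma-separated mask format are all assumptions; check the update_vpc schema for the real ones.

```python
def update_mask(current: dict, desired: dict) -> list:
    """Return the sorted names of top-level fields whose values differ."""
    return sorted(key for key in desired if current.get(key) != desired[key])


# Hypothetical VPC settings, e.g. as returned by get_vpc and then edited.
current = {"name": "demo-vpc", "cidr": "10.0.0.0/16"}
desired = {"name": "demo-vpc", "cidr": "10.0.0.0/15"}

mask = update_mask(current, desired)

# Hypothetical argument layout for an update_vpc call.
arguments = {"id": "vpc-123", "vpc": desired, "update_mask": ",".join(mask)}
print(arguments)
```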
How do I ensure my GPU resources are secure using `create_vpc`?
A VPC ensures your compute resources operate in a private, isolated network. You specify the CIDR block and associated subnet ranges when calling create_vpc, keeping your AI environment separated from public traffic.
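Before asking the agent to create the VPC, it can help to sanity-check the address plan locally. The sketch below uses Python's standard `ipaddress` module to confirm that each subnet sits inside the VPC CIDR block and that no two subnets overlap. The specific ranges are illustrative, and whether create_vpc runs the same checks is not stated here.

```python
import ipaddress


def check_vpc_layout(vpc_cidr: str, subnet_cidrs: list) -> list:
    """Return a list of problems found in the address plan (empty if none)."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    problems = []
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            problems.append(f"{subnet} is outside {vpc}")
    for index, first in enumerate(subnets):
        for second in subnets[index + 1:]:
            if first.overlaps(second):
                problems.append(f"{first} overlaps {second}")
    return problems


print(check_vpc_layout("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24"]))
print(check_vpc_layout("10.0.0.0/16", ["10.1.0.0/24"]))
```

`ip_network` raises `ValueError` for a malformed CIDR or one with host bits set (for example `10.0.0.1/16`), which catches typos before any tool call is made.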
What steps are needed to manage my infrastructure lifecycle using `delete_cluster`?
Deleting a cluster requires calling delete_cluster with the cluster ID. Before deletion, always check related resources like VPCs or deployments to prevent dependency errors.
How can I modify an existing service using `update_deployment`?
You update a deployment by calling update_deployment and providing the specific configuration changes. This allows you to change model versions or scaling parameters without recreating the service.
Can I list all my active Kubernetes clusters across the CoreWeave infrastructure?
Yes. By using the list_clusters tool, your agent will retrieve a complete list of all bare-metal Kubernetes clusters (CKS) managed under your account.
How do I check the specific network configuration of a VPC?
You can use the get_vpc tool by providing the specific VPC ID. The agent will return detailed information about network isolation and configuration for that resource.
Is it possible to create a new inference gateway via the AI agent?
Absolutely. Use the create_gateway tool with the required specification JSON. This allows you to set up routing and authentication for your AI model traffic programmatically.