CoreWeave MCP for AI. Orchestrate full-stack AI infrastructure via conversation.
Works with every AI agent you already use
…and any MCP-compatible client








Connect to your AI in seconds.
CoreWeave (AI GPU Cloud) MCP lets you manage specialized, high-performance GPU infrastructure using natural language. You can provision entire clusters, set up secure network boundaries with VPCs, and orchestrate model inference gateways—all without leaving your AI agent environment.
What your AI can do
Create and manage dedicated GPU clusters optimized for intensive machine learning and large-scale AI workloads.
Build secure Virtual Private Clouds (VPCs) to keep your compute resources separated from other networks.
Set up and manage inference gateways that handle traffic routing and authentication for deployed AI models.
Check the status of clusters, deployments, and network resources to ensure everything is running optimally.
Perform full operations—creation, updating, and deletion—across all core compute and networking components.
CoreWeave (AI GPU Cloud) MCP: 24 Tools
This collection of tools allows you to perform every operation needed for AI cloud infrastructure, managing everything from core networking to model deployment endpoints.
Make your AI actually useful.
Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.
Start using CoreWeave (AI GPU Cloud) on Vinkius
Get Vpc
Fetches the current configuration and status of a specified VPC.
Create Capacity Claim
Requests a new amount of capacity needed for running inference models.
Create Cluster
Builds and provisions a new Kubernetes cluster optimized for AI workloads.
Update Cluster
Changes configurations for a live Kubernetes cluster, like scaling node counts.
Update Deployment
Adjusts parameters of an existing inference deployment, maybe changing the model...
Update Gateway
Modifies rules or routing policies on an inference gateway.
Update Vpc
Changes network settings, like CIDR blocks, for a VPC.
Create Deployment
Sets up a new instance where an already trained model can receive traffic...
Create Gateway
Establishes a new entry point to route and verify access for AI services.
Create Vpc
Creates a brand new, secure network boundary (VPC) for your resources.
Delete Capacity Claim
Removes an existing request for inference capacity.
Delete Cluster
Decommissions and removes a Kubernetes cluster.
Delete Deployment
Takes down a deployed AI model endpoint.
Delete Gateway
Removes an inference gateway, stopping all traffic routing through it.
Delete Vpc
Permanently deletes a Virtual Private Cloud network boundary.
Get Cluster
Retrieves specific details about an existing Kubernetes cluster.
List Capacity Claims
Lists all active requests for inference capacity.
List Clusters
Retrieves a list of all managed Kubernetes clusters.
List Deployments
Shows a catalog of all currently running model deployments.
List Gateways
Lists every inference gateway configured for traffic routing.
List Vpcs
Retrieves a summary of all existing network VPCs.
Query Logs
Queries historical logs from the system for debugging purposes.
Query Metrics
Retrieves performance metrics data points (e.g., CPU, GPU utilization).
Update Capacity Claim
Modifies the size or parameters of an existing capacity claim.
Security and governance baked right in.
Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the entire backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so there is no routing for you to configure.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on every call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with CoreWeave (AI GPU Cloud), then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 5,100+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by CoreWeave. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on every call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This connection provides 24 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.
Managing AI infrastructure means juggling too many moving parts.
Right now, spinning up an ML environment is a nightmare of tabs. You have to jump into the cloud console to set up the network boundaries (the VPC), then switch to another toolset just for compute clusters, and finally configure traffic routing with dedicated gateway tools. It's slow, it means copy-pasting IDs between tools, and if you miss one step, your whole deployment fails.
With this MCP, you talk to your agent the way you would talk to a teammate. You simply tell it what the final goal is, say, 'I need production access for Model Y.' The system handles the sequence: creating the VPC, provisioning the cluster, setting up the gateway, and deploying the model. You just get the functional endpoint.
You Get Full Control Over Your Infrastructure With CoreWeave MCP
The tedious manual steps of creating a network boundary, setting up resource quotas, and then linking them to an inference deployment all disappear. You don't manage the state transitions; the agent does.
You gain reliable automation across your entire stack. It’s not just about running commands; it’s about guaranteeing that every component—from `create_vpc` to `update_gateway`—is configured correctly and in order.
What your AI can actually do with this
Need to run a big ML job or deploy a new model? This MCP lets you manage all the underlying hardware—the GPU compute power and networking—directly through conversation. You can tell your agent to set up an isolated network for your test models, then spin up a dedicated cluster optimized for training, and finally route traffic to the finished service using inference gateways.
It’s about making sure your entire AI stack runs reliably at scale. The process covers everything from creating a VPC to monitoring resource usage, all through conversation. This capability is hosted on Vinkius, giving you access to this cloud control panel right alongside the 5,100+ other MCP servers in the catalog.
Here's how it actually works
The bottom line is you get full control over complex cloud infrastructure by simply talking to it.
Connect your CoreWeave account credentials to the MCP via your AI client.
Use natural language prompts to define necessary resources, such as creating a VPC or listing existing clusters.
The agent executes the required sequence of calls and returns structured data detailing the deployed infrastructure.
Who is this actually for?
ML Engineers and DevOps teams use this when they need to move models from a local sandbox into production-grade, scalable cloud environments. They're the ones who get tired of clicking through four different dashboards just to verify network connectivity or check cluster status.
- Provision and scale GPU clusters for model training without leaving your coding environment.
- Automate the setup of VPCs and inference gateways so production AI services have solid networking.
- Quickly inspect cluster status and deployment health to diagnose failures during model testing.
What Changes When You Connect
You don't have to manually jump between consoles. You can ask your agent to list all clusters, then check metrics for a specific deployment, and finally create a new VPC—all in one chat session.
The toolset gives you full lifecycle control. Need to scale? Use update_cluster. Finished with a test environment? Run delete_vpc to clean up everything, preventing cloud waste.
Debugging is faster. If an endpoint fails, use query_logs and query_metrics immediately after listing the deployments via list_deployments. You get context instantly.
Security configuration becomes simple. Instead of manual firewall rules, you just tell your agent to create_vpc, guaranteeing network isolation for sensitive models.
Deployment complexity drops. From creating a new VPC to setting up a gateway and deploying the model, every step is handled with one set of commands.
See it in action
The Model Promotion Pipeline
A research team finishes training a model on an existing cluster. They need to move it into production. The agent first uses list_clusters to find the target environment, then calls create_deployment using the new model artifact, and finally establishes traffic via create_gateway. It’s fully automated.
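Here is a minimal Python sketch of that promotion flow. It is an illustration under stated assumptions, not the connector's actual interface: `call_tool` stands in for however your MCP client invokes a tool, and the argument and field names (`name`, `id`, `cluster_id`, `model_uri`, `deployment_id`) plus the `example://model-y` placeholder are hypothetical. Only the tool names come from the list above.

```python
from typing import Any, Callable, Dict

# Hypothetical hook: call_tool(tool_name, arguments) -> parsed tool result.
CallTool = Callable[[str, Dict[str, Any]], Any]


def promote_model(call_tool: CallTool, cluster_name: str, model_uri: str) -> Dict[str, Any]:
    """Find the target cluster, deploy the model on it, then front it with a gateway."""
    clusters = call_tool("list_clusters", {})
    # "name" and "id" are assumed field names, not confirmed by the tool list.
    target = next((c for c in clusters if c["name"] == cluster_name), None)
    if target is None:
        raise LookupError(f"no cluster named {cluster_name!r}")

    deployment = call_tool(
        "create_deployment", {"cluster_id": target["id"], "model_uri": model_uri}
    )
    gateway = call_tool("create_gateway", {"deployment_id": deployment["id"]})
    return {"cluster": target, "deployment": deployment, "gateway": gateway}


def fake_call_tool(tool: str, arguments: Dict[str, Any]) -> Any:
    """In-memory stand-in so the sketch runs without a real MCP connection."""
    if tool == "list_clusters":
        return [{"id": "c-1", "name": "prod"}, {"id": "c-2", "name": "staging"}]
    return {"id": f"{tool}-1", **arguments}


result = promote_model(fake_call_tool, "prod", "example://model-y")
print(result["gateway"])
```

Each step feeds the ID from the previous result into the next call, which is the copy-pasting the agent takes off your hands.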
Network Audit and Compliance
Compliance requires proving that all development environments are isolated. The agent uses list_vpcs to map out every network segment, then calls get_vpc on each one to confirm the required security rules are in place.
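A rough Python sketch of that audit loop, under stated assumptions: `call_tool` is a hypothetical hook into your MCP client, the `vpc_id` argument and the `id` and `isolated` fields are made up for illustration, and the compliance rule is whatever predicate you pass in.

```python
from typing import Any, Callable, Dict, List

# Hypothetical hook: call_tool(tool_name, arguments) -> parsed tool result.
CallTool = Callable[[str, Dict[str, Any]], Any]


def audit_vpcs(
    call_tool: CallTool, is_compliant: Callable[[Dict[str, Any]], bool]
) -> List[str]:
    """Return the IDs of VPCs whose detailed configuration fails the check."""
    failing = []
    for summary in call_tool("list_vpcs", {}):
        details = call_tool("get_vpc", {"vpc_id": summary["id"]})
        if not is_compliant(details):
            failing.append(summary["id"])
    return failing


# Made-up sample data so the sketch runs without a real connection.
SAMPLE_VPCS = {
    "vpc-dev": {"id": "vpc-dev", "isolated": True},
    "vpc-test": {"id": "vpc-test", "isolated": False},
}


def fake_call_tool(tool: str, arguments: Dict[str, Any]) -> Any:
    if tool == "list_vpcs":
        return [{"id": vpc_id} for vpc_id in SAMPLE_VPCS]
    if tool == "get_vpc":
        return SAMPLE_VPCS[arguments["vpc_id"]]
    raise ValueError(f"unexpected tool {tool!r}")


non_compliant = audit_vpcs(fake_call_tool, lambda vpc: vpc["isolated"])
print(non_compliant)
```

Keeping the rule as a separate predicate means the same loop works for whatever your compliance team actually checks.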
Post-Mortem Failure Analysis
A live AI service suddenly slows down. The ops engineer uses list_deployments to find the service, then immediately runs query_metrics and get_cluster to diagnose whether the bottleneck is compute or network related.
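The lookup chain in this scenario might look like the Python sketch below. Treat it as an illustration only: `call_tool` is a hypothetical hook into your MCP client, and every argument and field name (`name`, `id`, `cluster_id`, `deployment_id`, `gpu_utilization`, `status`) is an assumption rather than the connector's documented schema.

```python
from typing import Any, Callable, Dict

# Hypothetical hook: call_tool(tool_name, arguments) -> parsed tool result.
CallTool = Callable[[str, Dict[str, Any]], Any]


def collect_diagnostics(call_tool: CallTool, service_name: str) -> Dict[str, Any]:
    """Gather deployment, metrics, and cluster details for one service."""
    deployments = call_tool("list_deployments", {})
    deployment = next((d for d in deployments if d["name"] == service_name), None)
    if deployment is None:
        raise LookupError(f"no deployment named {service_name!r}")

    metrics = call_tool("query_metrics", {"deployment_id": deployment["id"]})
    cluster = call_tool("get_cluster", {"cluster_id": deployment["cluster_id"]})
    return {"deployment": deployment, "metrics": metrics, "cluster": cluster}


def fake_call_tool(tool: str, arguments: Dict[str, Any]) -> Any:
    """Made-up responses so the sketch runs without a real connection."""
    if tool == "list_deployments":
        return [{"id": "d-1", "name": "chat-api", "cluster_id": "c-1"}]
    if tool == "query_metrics":
        return {"gpu_utilization": 0.97}
    if tool == "get_cluster":
        return {"id": arguments["cluster_id"], "status": "Active"}
    raise ValueError(f"unexpected tool {tool!r}")


report = collect_diagnostics(fake_call_tool, "chat-api")
print(report["metrics"], report["cluster"]["status"])
```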
Environment Teardown
A project concludes, leaving behind numerous resources. The agent systematically uses delete_vpc, delete_gateway, and delete_cluster to ensure zero lingering costs and a clean cloud slate.
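One way to plan a teardown like this is to sort the resources so that dependents are deleted before the things they depend on. The Python sketch below assumes an order of gateways, deployments, clusters, then VPCs; that order and the `id` argument name are assumptions, so adjust `DELETE_ORDER` to match how your resources actually depend on each other.

```python
from typing import Any, Dict, List, Tuple

# Assumed dependency order: delete dependents first, the VPC last.
DELETE_ORDER = ["gateway", "deployment", "cluster", "vpc"]


def teardown_plan(resources: List[Tuple[str, str]]) -> List[Tuple[str, Dict[str, Any]]]:
    """Turn (kind, resource_id) pairs into an ordered list of delete tool calls."""
    rank = {kind: position for position, kind in enumerate(DELETE_ORDER)}
    # sorted() is stable, so resources of the same kind keep their input order.
    ordered = sorted(resources, key=lambda resource: rank[resource[0]])
    return [(f"delete_{kind}", {"id": resource_id}) for kind, resource_id in ordered]


leftovers = [
    ("vpc", "vpc-1"),
    ("cluster", "c-1"),
    ("deployment", "d-1"),
    ("gateway", "g-1"),
]
plan = teardown_plan(leftovers)
for tool, arguments in plan:
    print(tool, arguments)
```

The plan is only a list of calls; nothing is deleted until you or your agent actually runs them.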
The honest tradeoffs
Manual Scripting for State Changes
Writing long shell scripts that chain together commands like 'create VPC, then create cluster, then update gateway.' If one command fails halfway through, the whole thing breaks and requires manual rollback.
Use your agent to orchestrate the entire process. Instead of writing a script, tell it: 'Set up a new production environment for Model X.' The MCP handles the complex sequencing, ensuring that if create_gateway fails, it can attempt cleanup.
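To make "attempt cleanup" concrete, here is a client-side Python sketch of the general pattern: remember what was created and, if a later step fails, undo it in reverse. This is not a description of how the MCP itself behaves internally. `call_tool`, the `id` field, and the simulated failure are all assumptions for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical hook: call_tool(tool_name, arguments) -> parsed tool result.
CallTool = Callable[[str, Dict[str, Any]], Any]
Step = Tuple[str, str, Dict[str, Any]]  # (create tool, matching delete tool, arguments)


def provision_with_rollback(call_tool: CallTool, steps: List[Step]) -> List[str]:
    """Run create steps in order; on failure, delete what was created, newest first."""
    created: List[Tuple[str, str]] = []  # (delete tool, resource id)
    try:
        for create_tool, delete_tool, arguments in steps:
            resource = call_tool(create_tool, arguments)
            created.append((delete_tool, resource["id"]))
    except Exception:
        # Best-effort cleanup; a real version would also handle failed deletes.
        for delete_tool, resource_id in reversed(created):
            call_tool(delete_tool, {"id": resource_id})
        raise
    return [resource_id for _, resource_id in created]


calls: List[str] = []


def flaky_call_tool(tool: str, arguments: Dict[str, Any]) -> Any:
    """Stand-in that records calls and fails on create_gateway."""
    calls.append(tool)
    if tool == "create_gateway":
        raise RuntimeError("simulated gateway failure")
    return {"id": f"{tool}-1"}


steps: List[Step] = [
    ("create_vpc", "delete_vpc", {}),
    ("create_cluster", "delete_cluster", {}),
    ("create_gateway", "delete_gateway", {}),
]
try:
    provision_with_rollback(flaky_call_tool, steps)
except RuntimeError as error:
    print("rolled back after:", error)
print(calls)
```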
Ignoring Dependencies
Trying to update an inference deployment (update_deployment) before confirming its underlying cluster is stable or available. You'll get vague errors and waste time.
First, run get_cluster on the target resource. Verify its status is 'Active.' Then, proceed with update_deployment. Always check the state first.
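As a small Python sketch, the check-then-update habit could look like this. The 'Active' status comes from the advice above; `call_tool`, the `cluster_id` and `deployment_id` arguments, the `status` field, and the `replicas` setting are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict

# Hypothetical hook: call_tool(tool_name, arguments) -> parsed tool result.
CallTool = Callable[[str, Dict[str, Any]], Any]


def update_deployment_if_active(
    call_tool: CallTool, cluster_id: str, deployment_id: str, changes: Dict[str, Any]
) -> Any:
    """Call update_deployment only when get_cluster reports status 'Active'."""
    cluster = call_tool("get_cluster", {"cluster_id": cluster_id})
    status = cluster.get("status")
    if status != "Active":
        raise RuntimeError(f"cluster {cluster_id} is {status!r}, not 'Active'")
    return call_tool("update_deployment", {"deployment_id": deployment_id, **changes})


# Made-up sample data so the sketch runs without a real connection.
CLUSTER_STATUS = {"c-ready": "Active", "c-busy": "Provisioning"}


def fake_call_tool(tool: str, arguments: Dict[str, Any]) -> Any:
    if tool == "get_cluster":
        cluster_id = arguments["cluster_id"]
        return {"id": cluster_id, "status": CLUSTER_STATUS[cluster_id]}
    return {"tool": tool, **arguments}


updated = update_deployment_if_active(fake_call_tool, "c-ready", "d-1", {"replicas": 3})
try:
    update_deployment_if_active(fake_call_tool, "c-busy", "d-1", {"replicas": 3})
    blocked = False
except RuntimeError:
    blocked = True
print(updated, blocked)
```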
Over-provisioning Resources
Creating a VPC and cluster just to test a single small model, leading to unnecessary cost and complexity.
Always start small. Use create_capacity_claim for initial testing instead of immediately creating full clusters. Scale up only after verifying stability.
When It Fits, When It Doesn't
Use this MCP if your job requires managing the entire lifecycle of complex, interdependent AI infrastructure: VPCs connect to clusters, which host deployments, which are accessed through gateways. It is more than you need if you just want a single API call or to manage one resource in isolation, such as checking one log entry. For a simple task like fetching logs, call query_logs directly; don't start a full cluster setup process. If your workflow involves creating and managing multiple interconnected components, this MCP handles the sequencing for you.
Questions you might have
How do I list my active clusters using the list_clusters tool?
Use the agent to call list_clusters. This immediately returns a comprehensive catalog of all managed Kubernetes clusters, letting you know exactly what compute power you have available.
What is the difference between create_vpc and update_vpc?
Use create_vpc when you need a new network boundary. Use update_vpc only if you need to change an existing VPC's parameters, like widening its IP range.
Should I use list_gateways or get_gateway first?
Run list_gateways first. This gives you a high-level view of all your entry points. The 24 tools listed above include list_gateways but no separate get_gateway, so if you need details on one gateway, ask the agent to pick it out of the list results by its specific ID.
Does create_deployment handle scaling?
The create_deployment tool sets up the endpoint. If you need to scale it later due to increased traffic, use update_deployment to adjust its capacity instead.
When should I use `create_vpc` versus just relying on my existing network setup?
You create a VPC when you need strict, isolated networking for specific compute resources. This ensures your GPU nodes and services are separated from other traffic by a defined CIDR block. It’s the first step if security isolation is your top priority.
I'm debugging performance issues; what kind of data do I get when running `query_metrics`?
The query returns detailed Prometheus metrics, allowing you to monitor real-time resource utilization and latency. You can check specific endpoints or aggregate usage across your cluster fleet to identify bottlenecks.
If a project finishes, what’s the proper sequence for cleanup using `delete_cluster`?
Delete resources in the reverse of the order you created them: first gateways and deployments, then the cluster, and finally the VPC. Working through the dependencies systematically prevents orphaned network rules or billing issues.
Do I need to run `list_capacity_claims` before attempting to update a deployment?
It's smart practice to list claims first. This lets you verify your current resource reservation status and ensure that the updated deployment still falls within an available, budgeted capacity claim.
Can I list all my active Kubernetes clusters across the CoreWeave infrastructure?
Yes. By using the list_clusters tool, your agent will retrieve a complete list of all bare-metal Kubernetes clusters (CKS) managed under your account.
How do I check the specific network configuration of a VPC?
You can use the get_vpc tool by providing the specific VPC ID. The agent will return detailed information about network isolation and configuration for that resource.
Is it possible to create a new inference gateway via the AI agent?
Absolutely. Use the create_gateway tool with the required specification JSON. This allows you to set up routing and authentication for your AI model traffic programmatically.
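Under the hood, an MCP client sends tool invocations as JSON-RPC 2.0 requests with the method `tools/call`. The Python sketch below builds such a request for create_gateway. The envelope follows the Model Context Protocol; the `spec` argument name and the fields inside it are hypothetical placeholders, since the gateway schema is not shown on this page.

```python
import json
from typing import Any, Dict


def tool_call_request(request_id: int, tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical specification; the real fields depend on the tool's input schema.
gateway_spec = {"name": "model-y-gateway", "auth": "api-key"}

request = tool_call_request(1, "create_gateway", {"spec": gateway_spec})
print(json.dumps(request, indent=2))
```

In practice your AI client builds and sends this message for you; the sketch only shows what "the required specification JSON" travels inside.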
We've already built the connector for CoreWeave. Just plug in your AI agents and start using Vinkius.
No hosting. No infrastructure. No complex setup.
All 24 tools are live and waiting.
You're up and running in seconds.
Vinkius gives your AI agents access to the full catalog of app connectors, all fully managed, secure, and enterprise-ready. One subscription, every tool you need.
Built, hosted, and secured by Vinkius. You just connect and go.