4,500+ servers built on MCP Fusion
Vinkius

CoreWeave MCP. Provision GPU Clusters & AI Networks

Works with every AI agent you already use

Claude
ChatGPT
Cursor
Gemini
Windsurf
VS Code
JetBrains
Vercel

…and any MCP-compatible client

Just plug in your AI agents and start using Vinkius.

CoreWeave (AI GPU Cloud) MCP Server manages high-performance GPU infrastructure. Provision GPU clusters (CKS), configure Virtual Private Clouds (VPCs), and orchestrate inference gateways via your AI agent.

It lets you run full ML lifecycles—from cluster setup to deployment monitoring—without leaving your coding environment. Use it to scale demanding AI workloads reliably.

What your AI agents can do

Create capacity claim

Reserves a new set of necessary inference compute resources.

Create cluster

Creates a new bare-metal Kubernetes Service (CKS) cluster.

Create deployment

Creates a new Inference Deployment for an AI model.

+ 21 more capabilities included
Create Compute Clusters

The create_cluster tool builds and provisions a new CKS Kubernetes cluster.

Manage Network Isolation

The create_vpc tool establishes a new Virtual Private Cloud for secure, isolated compute networking.

Route AI Traffic

The create_gateway tool sets up a new Inference Gateway to manage and authenticate incoming traffic to your AI models.

Provision Capacity

The create_capacity_claim tool reserves necessary compute resources for future inference needs.

Automate Resource Cleanup

Tools like delete_cluster and delete_vpc allow you to reliably remove and decommission infrastructure components.

Query Operational Status

You can use list_clusters or list_vpcs to quickly check the current status and details of all deployed resources.

Free for Subscribers


CoreWeave (AI GPU Cloud) MCP Server: 24 Tools for Cloud Infrastructure

These 24 tools let your AI agent handle every part of your AI infrastructure lifecycle, from creating VPCs to querying performance metrics.

create_capacity_claim

Reserves a new set of necessary inference compute resources.

create_cluster

Creates a new bare-metal Kubernetes Service (CKS) cluster.

create_deployment

Creates a new Inference Deployment for an AI model.

create_gateway

Sets up a new Inference Gateway to route and authenticate model traffic.

create_vpc

Builds a new Virtual Private Cloud for network isolation.

delete_capacity_claim

Removes a reserved set of inference compute resources.

delete_cluster

Decommissions an existing CKS cluster.

delete_deployment

Removes an active Inference Deployment.

delete_gateway

Decommissions an existing Inference Gateway.

delete_vpc

Deletes a Virtual Private Cloud.

get_cluster

Retrieves specific details for a given CKS cluster.

get_vpc

Fetches the current details of a specific VPC.

list_capacity_claims

Shows a list of all reserved inference capacity claims.

list_clusters

Lists all CoreWeave CKS clusters on your account.

list_deployments

Lists all currently configured Inference Deployments.

list_gateways

Lists all active Inference Gateways.

list_vpcs

Shows a list of all Virtual Private Clouds (VPCs).

query_logs

Queries the Loki log system for operational logs.

query_metrics

Queries the Prometheus system for performance metrics.

update_capacity_claim

Modifies the parameters of an existing inference capacity claim.

update_cluster

Updates the configuration settings of a CKS cluster.

update_deployment

Changes the settings for an existing Inference Deployment.

update_gateway

Modifies the rules or settings of an existing Inference Gateway.

update_vpc

Updates the parameters of a specified VPC.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

  • Import from OpenAPI, Swagger, or YAML specs
  • Create Agent Skills with progressive disclosure
  • Deploy to edge with MCPFusion framework
  • Built in DLP, auth, and compliance on every call
  • Real time usage dashboard and cost metering
  • Publish to catalog or keep private

Make Your AI Do More

Start with CoreWeave (AI GPU Cloud), then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.

  • Use this MCP plus 4,700+ others, all in one place
  • Add new capabilities to your AI anytime you want
  • Every connection is secured and compliant automatically
  • Track usage and costs across all your servers
  • Works with Claude, ChatGPT, Cursor, and more
  • New servers added to the catalog every week

What you can do with this MCP connector

CoreWeave GPU Cloud MCP Server - Provisioning GPU Clusters

This server lets your AI client manage high-performance GPU infrastructure end to end. You can provision GPU clusters, set up Virtual Private Clouds, and orchestrate inference gateways without leaving your code editor. It covers the full ML lifecycle, from spinning up clusters to monitoring deployment status, so you can scale demanding AI workloads reliably.

Creating and Configuring Core Infrastructure

Use create_cluster to build a new bare-metal Kubernetes Service (CKS) cluster. Use list_clusters to see all CKS clusters on your account, and get_cluster to pull specific details on a cluster. You can modify a cluster's settings with update_cluster, or decommission an old one using delete_cluster.

Similarly, use create_vpc to build a new Virtual Private Cloud for secure, isolated compute networking. Check your existing VPCs with list_vpcs or get_vpc, modify a VPC's parameters using update_vpc, or remove it completely with delete_vpc.

Managing AI Services and Traffic

To get your AI models running, use create_deployment to set up a new Inference Deployment. You can check what's running with list_deployments, modify an existing deployment with update_deployment, or remove it entirely using delete_deployment.

To manage incoming traffic, use create_gateway to set up a new Inference Gateway for routing and authenticating model traffic. You can list all active gateways with list_gateways, and use update_gateway or delete_gateway to manage them.

Reserving and Monitoring Resources

Don't wait until you run out of capacity. Use create_capacity_claim to reserve the compute resources you expect to need for inference. You can check your current reservations with list_capacity_claims, modify a claim using update_capacity_claim, or remove it with delete_capacity_claim.

To monitor what you have running, use query_metrics to pull performance data from Prometheus, or query_logs to pull operational logs from Loki.
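As a rough illustration of what a query_logs request might carry, the sketch below builds a LogQL query (a stream selector plus a line filter) and wraps it in an arguments object. LogQL is Loki's query language; the argument names `query` and `limit` and the label names `namespace` and `app` are assumptions for illustration and are not taken from this server's schema.

```python
from typing import Optional


def logql_selector(labels: dict, contains: Optional[str] = None) -> str:
    """Build a LogQL query: a stream selector plus an optional line filter."""

    def quote(value: str) -> str:
        # Escape backslashes and double quotes inside a LogQL string literal.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    matchers = ", ".join(f"{key}={quote(val)}" for key, val in sorted(labels.items()))
    query = "{" + matchers + "}"
    if contains is not None:
        query += " |= " + quote(contains)  # keep only lines containing the text
    return query


# Hypothetical label names and argument names; check the tool's input schema.
arguments = {
    "query": logql_selector({"namespace": "inference", "app": "gateway"}, contains="error"),
    "limit": 50,
}
print(arguments["query"])
```

Building the selector from a dictionary keeps quoting in one place, which helps when the agent fills in label values from user input.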

Listing Everything You Own

Need a quick overview? You can list all available resources: use list_clusters for CKS clusters, list_vpcs for networking, list_deployments for models, list_gateways for traffic, and list_capacity_claims for reserved compute.

How CoreWeave MCP Works

  1. First, connect your CoreWeave API Token to the server. This authenticates your account and gives your agent access to your cloud resources.
  2. Next, issue a command like 'Create a VPC with CIDR 10.0.0.0/16' or 'List all CKS clusters'. The agent runs the corresponding tool call.
  3. Finally, the server returns the current state (e.g., VPC ID, cluster list, or metrics). Your agent processes this data, allowing you to continue the workflow or confirm success.

The bottom line is, you manage complex, multi-step infrastructure changes by just talking to your AI agent.
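To make step 2 concrete, here is a minimal sketch of the message an MCP client sends when the agent invokes a tool. The `tools/call` method and the `name`/`arguments` parameters come from the Model Context Protocol, which uses JSON-RPC 2.0. The argument names `name` and `cidr` are assumptions for illustration; the real input schema is whatever the server advertises for create_vpc.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per request


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "name" and "cidr" are hypothetical argument names, not the documented schema.
request = tool_call("create_vpc", {"name": "ml-prod", "cidr": "10.0.0.0/16"})
print(json.dumps(request, indent=2))
```

In practice the MCP client builds and sends this message for you; the sketch only shows what 'the agent runs the corresponding tool call' amounts to on the wire.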

Who Is CoreWeave MCP For?

This is for the ML Engineer who needs to scale a prototype to production without leaving their IDE. It's for the DevOps team that needs to automate VPC setups and gateway routing for mission-critical AI services. If your job involves managing complex, interconnected, GPU-backed cloud resources, you need this.

ML Engineer

Provisions and scales GPU clusters using create_cluster and tests model performance by inspecting deployments with list_deployments.

DevOps Engineer

Automates the setup of secure networking by calling create_vpc and ensuring proper traffic flow using create_gateway.

AI Researcher

Quickly checks cluster health and deployment status using list_clusters and query_metrics during model training cycles.

What Changes When You Connect

  • Automate full resource provisioning. Need a new compute environment? Use create_cluster to build a CKS cluster, then create_vpc to secure its network, all in one sequence. You don't have to switch between console tabs.
  • Maintain a clean state. When a project ends, you don't want leftover resources. Run delete_cluster or delete_vpc to cleanly decommission the entire stack, preventing resource sprawl and unexpected billing.
  • Monitor performance on the fly. After deploying a model, use list_deployments and query_metrics to check the real-time health and throughput of your service. You get metrics, not just status messages.
  • Secure traffic routing. Use create_gateway to put an Inference Gateway in front of your model. This ensures all traffic is properly authenticated before it hits your compute resources.
  • Manage complex updates. Don't just recreate resources. Use update_cluster or update_vpc to make targeted changes to existing infrastructure, saving time and reducing downtime.
  • Handle resource spikes. If you expect a sudden load increase, use create_capacity_claim to reserve the necessary compute capacity ahead of time, preventing runtime failure.

Real-World Use Cases

01

Scaling a Model from Dev to Prod

A researcher finishes training a model on a small cluster. They tell their agent: 'I need a production setup for this model.' The agent first runs create_vpc for isolation, then create_cluster for the compute, create_deployment for the model itself, and finally create_gateway to route the live traffic to the deployment. The model goes live, and the researcher checks status with list_deployments and list_gateways.

02

Debugging Network Connectivity

The service is failing intermittently. The ops engineer asks the agent to check the network. The agent runs get_vpc and query_logs to pull the CIDR block and check the last 50 logs. The engineer sees a subnet issue and runs update_vpc to fix the configuration, solving the outage.

03

Cleaning Up Staging Environments

The team finished testing the staging environment. Instead of manually deleting everything, the engineer runs delete_vpc and delete_cluster. This ensures all resources are properly deprovisioned, stopping billing and freeing up IP space.

04

Checking for Resource Sprawl

The team suspects they've left old clusters running. The engineer runs list_clusters and list_capacity_claims. This immediately shows which resources are active and which need to be shut down or retired, preventing unnecessary costs.

The Tradeoffs

Manual State Checking

Trying to remember if the VPC was updated before the cluster was created. You end up running get_vpc 5 times and list_clusters 3 times, copying IDs and pasting them into a spreadsheet. It's slow and prone to copy/paste errors.

Define the entire stack in one prompt. For example: 'Create a VPC, then create a CKS cluster inside it, and attach a gateway.' The agent handles the necessary dependency calls (create_vpc -> create_cluster -> create_gateway) automatically.

Ignoring Resource Cleanup

Finishing a test run and just closing the terminal. The resources created by create_cluster and create_vpc keep running and incur costs until someone finds and deletes them.

Always follow up provisioning with cleanup. Use delete_cluster and delete_vpc to ensure the entire stack is properly terminated. Check list_capacity_claims first to confirm nothing is reserved.
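One way to keep cleanup reliable is to record what was created and tear it down in reverse order, so dependents go before the things they depend on. The sketch below turns such a creation log into delete calls. The tool names are the ones listed on this page; the `id` argument name and the resource IDs are placeholders, not documented values.

```python
def teardown_plan(created: list) -> list:
    """Turn a creation log into delete calls, newest resource first."""
    plan = []
    for kind, resource_id in reversed(created):
        # "id" is a hypothetical argument name; check each tool's input schema.
        plan.append({"name": f"delete_{kind}", "arguments": {"id": resource_id}})
    return plan


# Creation order from a hypothetical session; the IDs are placeholders.
created = [("vpc", "vpc-123"), ("cluster", "cks-456"), ("gateway", "gw-789")]

for call in teardown_plan(created):
    print(call["name"], call["arguments"]["id"])
```

Reversing the log means the gateway is removed first and the VPC last, which matches the order in which the stack was built.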

Over-relying on Raw Creation

Just running create_cluster and create_deployment separately. If the cluster needs an update (e.g., new GPU drivers), you have to manually remember to run update_cluster later. The state gets messy.

Use the update tools (update_cluster, update_deployment) when making changes to existing resources. If you're starting fresh, describe the full desired state upfront so you need fewer one-off create calls later.

When It Fits, When It Doesn't

Use this server if your workflow requires provisioning and managing a full, interconnected stack: VPCs, CKS clusters, and inference gateways. You need to know the resource lifecycle: how to build it, how to update it, and how to tear it down. If you only need to read basic metrics, query_metrics alone is enough, and if you only need an inventory of running resources, the list tools (list_clusters, list_vpcs, and so on) cover it. However, if you need to change resources (create, delete, update), this is the tool. It handles the dependencies, which is the hard part.

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by CoreWeave. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

  • Cloud Hosted: managed infra
  • V8 Isolated: sandboxed per request
  • Zero-Trust Proxy: no stored credentials
  • DLP Enforced: policy on every call
  • GDPR Compliant: EU data residency
  • Token Compression: ~60% cost reduction

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This server provides 24 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.

Available Capabilities

create_capacity_claim create_cluster create_deployment create_gateway create_vpc delete_capacity_claim delete_cluster delete_deployment delete_gateway delete_vpc get_cluster get_vpc list_capacity_claims list_clusters list_deployments list_gateways list_vpcs query_logs query_metrics update_capacity_claim update_cluster update_deployment update_gateway update_vpc

Setting up a new ML environment shouldn't feel like a civil engineering project.

Today, setting up a new compute environment means jumping through hoops. You open the cloud console to create a VPC, note down the CIDR block, then switch to the Kubernetes dashboard to create the cluster, and finally configure the network peering and the ingress gateway by hand. It's a dozen clicks, multiple tabs, and a lot of copy-pasting IDs just to get a single service running.

With this MCP server, you just tell your agent the intent: 'I need a secure, GPU-backed environment for my new model.' The agent runs `create_vpc`, then `create_cluster`, and sets up the networking in the background. You get a fully provisioned, isolated stack, and you're done.

CoreWeave (AI GPU Cloud) MCP Server: Full Control Over Your Stack

Manual resource management involves separate commands for every component. You have to remember to call `create_gateway` after the VPC is ready, and `create_deployment` after the cluster is provisioned. Missing one step breaks the whole thing.

This server handles the sequence. You tell it the final state, and it manages the dependency calls, ensuring the VPC exists before it tries to create the cluster, and that the cluster is ready before it sets up the gateway. It's reliable orchestration.
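The ordering described above is a dependency graph, and a topological sort is a standard way to derive a valid call order from one. The sketch below uses Python's standard-library `graphlib` to illustrate the idea. The edges are an assumption drawn from the ordering described on this page, not a documented dependency graph, and this is not a description of how the server is implemented.

```python
from graphlib import TopologicalSorter

# Each tool maps to the tools that must finish before it can run.
# These edges are assumed from the prose above, for illustration only.
depends_on = {
    "create_vpc": set(),
    "create_cluster": {"create_vpc"},
    "create_deployment": {"create_cluster"},
    "create_gateway": {"create_vpc", "create_cluster"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())
print(order)
```

The relative order of create_deployment and create_gateway is not fixed by these edges, so either may come first; only the stated dependencies are guaranteed.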

Common Questions About CoreWeave MCP

How do I check the status of my CKS clusters using `list_clusters`?

The list_clusters tool retrieves a list of all CKS clusters on your account. The response provides the cluster ID and current status, letting you quickly see which clusters are active and which might be stopped.

Can I use `create_vpc` to isolate my AI compute resources?

Yes. create_vpc builds a Virtual Private Cloud, which ensures your compute resources are isolated with a dedicated CIDR block. This is critical for maintaining secure, private networking for your models.

What is the difference between `list_deployments` and `list_gateways`?

list_deployments shows the deployed models and services, while list_gateways shows the ingress points. Gateways route traffic to the deployments, so you need both to see the full service path.

How do I check performance metrics using `query_metrics`?

The query_metrics tool connects to Prometheus and fetches performance data. You can use this to monitor resource utilization, latency, and throughput for your active AI services.
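For a sense of what a metrics query can look like, the sketch below builds a PromQL expression for per-pod CPU usage rate. PromQL is Prometheus's query language. The metric `container_cpu_usage_seconds_total` is a common Kubernetes container metric, but whether it is exposed in your environment is an assumption, as are the `namespace` label value and the `query` argument name.

```python
def cpu_rate_query(namespace: str, window: str = "5m") -> str:
    """PromQL: per-pod CPU usage rate over a time window."""
    # Doubled braces in the f-string produce literal PromQL braces.
    selector = f'container_cpu_usage_seconds_total{{namespace="{namespace}"}}'
    return f"sum by (pod) (rate({selector}[{window}]))"


# "query" is a hypothetical argument name; check the tool's input schema.
arguments = {"query": cpu_rate_query("inference")}
print(arguments["query"])
```

Wrapping the counter in rate() converts a monotonically increasing total into a per-second rate, which is usually what you want for utilization.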

Do I need to run `update_vpc` if I change the IP range?

Yes. update_vpc is the tool you run to modify a VPC's parameters, including its CIDR block or subnet rules. Running it without the proper update mask will fail.
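An update mask typically names the fields that a request is allowed to change. As a sketch of the idea, the helper below compares the current and desired configuration and lists the fields that differ. The field names, the comma-separated mask format, and the `id`, `vpc`, and `update_mask` argument names are all assumptions for illustration, not the documented update_vpc schema.

```python
def update_mask(current: dict, desired: dict) -> list:
    """Return the sorted names of fields whose values differ in `desired`."""
    return sorted(key for key, value in desired.items() if current.get(key) != value)


# Hypothetical field names and values.
current = {"name": "ml-prod", "cidr": "10.0.0.0/16", "description": "prod network"}
desired = {"name": "ml-prod", "cidr": "10.1.0.0/16", "description": "prod network"}

mask = update_mask(current, desired)

# Hypothetical argument names; check the tool's input schema.
arguments = {"id": "vpc-123", "vpc": desired, "update_mask": ",".join(mask)}
print(arguments["update_mask"])
```

Deriving the mask from a diff avoids listing a field you did not mean to touch.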

How do I ensure my GPU resources are secure using `create_vpc`?

A VPC ensures your compute resources operate in a private, isolated network. You specify the CIDR block and associated subnet ranges when calling create_vpc, keeping your AI environment separated from public traffic.
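Before asking the agent to create a VPC, it can help to check the CIDR block locally. The sketch below uses Python's standard `ipaddress` module to reject malformed blocks and blocks that overlap ones you already use. The list of existing blocks is a placeholder, and whether the service itself rejects overlaps is not stated on this page.

```python
import ipaddress


def check_cidr(candidate: str, existing: list) -> ipaddress.IPv4Network:
    """Parse an IPv4 CIDR block and reject it if it overlaps an existing one."""
    # strict=True raises ValueError when host bits are set, e.g. "10.0.0.1/16".
    network = ipaddress.IPv4Network(candidate, strict=True)
    for other in existing:
        if network.overlaps(ipaddress.IPv4Network(other)):
            raise ValueError(f"{candidate} overlaps existing block {other}")
    return network


# Placeholder blocks standing in for VPCs you already have.
existing = ["10.0.0.0/16", "192.168.0.0/24"]

print(check_cidr("10.1.0.0/16", existing).num_addresses)
```

A /16 block holds 65,536 addresses, so the example prints 65536; passing "10.0.128.0/17" instead would raise a ValueError because it falls inside 10.0.0.0/16.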

What steps are needed to manage my infrastructure lifecycle using `delete_cluster`?

Deleting a cluster requires calling delete_cluster with the cluster ID. Before deletion, always check related resources like VPCs or deployments to prevent dependency errors.

How can I modify an existing service using `update_deployment`?

You update a deployment by calling update_deployment and providing the specific configuration changes. This allows you to change model versions or scaling parameters without recreating the service.

Can I list all my active Kubernetes clusters across the CoreWeave infrastructure?

Yes. By using the list_clusters tool, your agent will retrieve a complete list of all bare-metal Kubernetes clusters (CKS) managed under your account.

How do I check the specific network configuration of a VPC?

You can use the get_vpc tool by providing the specific VPC ID. The agent will return detailed information about network isolation and configuration for that resource.

Is it possible to create a new inference gateway via the AI agent?

Absolutely. Use the create_gateway tool with the required specification JSON. This allows you to set up routing and authentication for your AI model traffic programmatically.


Built & Managed by Vinkius 30s setup 24 tools

We've already built the connector for CoreWeave. Just plug in your AI agents and start using Vinkius.

No hosting. No infrastructure. No complex setup.
All 24 tools are live and waiting. You're up and running in seconds.


Vinkius gives your AI agents access to the full catalog of app connectors, all fully managed, secure, and enterprise-ready. One subscription, every tool you need.

  • Zero hosting required
  • Full MCP catalog included
  • Enterprise-grade security
  • Auto-updated by Vinkius

Built, hosted, and secured by Vinkius. You just connect and go.