# RunPod MCP

> RunPod MCP lets your AI agent act like a DevOps engineer right inside your chat window. You can provision GPU pods, check active instances, and review serverless endpoints for compute-intensive tasks without touching a dashboard. It’s instant infrastructure control.

## Overview
- **Category:** superpower
- **Price:** Free
- **Tags:** gpu-computing, serverless-deployment, cloud-instances, machine-learning-ops, container-orchestration, infrastructure-as-code

## Description

Need to run heavy machine learning models or complex computational workflows? This MCP connects your agent directly to RunPod, giving it control over scalable GPU computing resources. Instead of logging into a cloud console and clicking through menus, you just ask for what you need. Tell the agent to spin up a specific type of hardware or to check whether any current pods are sitting idle, and it handles the rest. Infrastructure management becomes another natural language task.

By connecting this RunPod MCP through Vinkius, your agent gains immediate access to professional-grade DevOps tools. You can audit all serverless endpoints and provision new hardware nodes from a simple conversation, so you get the power of cloud orchestration without leaving your chat interface.

## Tools

### create_pod
This tool builds a new GPU pod from the name, GPU hardware type, and Docker image you specify.
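
As a rough illustration of what the agent sends on your behalf, the sketch below builds an MCP `tools/call` request for `create_pod`. The `tools/call` method and the `name`/`arguments` envelope come from the MCP specification. The argument keys (`name`, `gpu_type`, `image`) and the placeholder values are assumptions for illustration, not the tool's published schema.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Assumed argument keys and placeholder values; check the tool's real input schema.
create_pod_request = build_tool_call(
    1,
    "create_pod",
    {
        "name": "inference-test",
        "gpu_type": "EXAMPLE_GPU_TYPE",
        "image": "example/image:latest",
    },
)

print(json.dumps(create_pod_request, indent=2))
```

In practice you never write this payload yourself; the agent assembles it from your prompt.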

### get_pod
It pulls up specific details for one particular GPU pod you want to check on.

### list_endpoints
The agent compiles a list of every registered serverless endpoint in the account.

### list_gpu_types
It shows you all the different GPU hardware types that are currently available for deployment.

### list_pods
This tool generates a comprehensive list of every GPU pod in your account.

### list_templates
It retrieves all the saved pod templates you've configured previously.

### stop_pod
Use this to halt a running GPU pod, which stops compute billing for that pod.

## Prompt Examples

**Prompt:** 
```
Show me our stopped GPU pods.
```

**Response:** 
```
I checked your RunPod account. You currently have 2 pods in a stopped state.
```

**Prompt:** 
```
Check what GPU templates are available to deploy a new Llama-3 inference instance.
```

**Response:** 
```
I have loaded the RunPod template catalog. There are several pre-built images with PyTorch and vLLM installed that are suited to Llama-3 text deployments. Would you like me to provision one on a specific GPU?
```

**Prompt:** 
```
Pause pod with ID 'pod_xyz_980' immediately to prevent recurring costs throughout the evening.
```

**Response:** 
```
Pod 'pod_xyz_980' has been stopped. Hourly compute billing for this pod is now halted.
```

## Capabilities

### Provisioning New Hardware
You instruct the system to build entirely new GPU pods using specified types and Docker images.

### Managing Running Instances
You check details for specific pods or halt running instances immediately to prevent unnecessary billing costs.

### Inventorying Resources
The agent lists every pod, available GPU hardware type, and saved deployment template in your account.

### Auditing Deployments
You review all registered serverless endpoints that serve your containerized inference applications.

## Use Cases

### Stopping an expensive test run
An ML developer finishes testing a model locally but has forgotten that the remote pod is still running. They prompt their agent: 'Pause pod with ID X immediately to prevent recurring costs.' The agent uses `stop_pod` and confirms that billing is halted.
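
The stop flow in this use case can be sketched as a check followed by a stop. This is a minimal illustration under assumptions: `call_tool` is a hypothetical stand-in for whatever your MCP client exposes, and the `pod_id` argument key and `status` field are assumed rather than taken from the tools' schemas. The in-memory fake only exercises the logic and makes no network calls.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical signature: (tool name, arguments) -> tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def stop_if_running(call_tool: ToolCaller, pod_id: str) -> bool:
    """Call stop_pod only if get_pod reports the pod as running.

    Returns True if stop_pod was called, False otherwise.
    """
    pod = call_tool("get_pod", {"pod_id": pod_id})
    if pod.get("status") != "RUNNING":  # assumed status value
        return False
    call_tool("stop_pod", {"pod_id": pod_id})
    return True


def make_fake_caller(pods: Dict[str, Dict[str, Any]]) -> Tuple[ToolCaller, List[str]]:
    """Build an in-memory fake tool caller that records which tools were called."""
    calls: List[str] = []

    def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
        calls.append(name)
        pod = pods[arguments["pod_id"]]
        if name == "stop_pod":
            pod["status"] = "STOPPED"
        return pod

    return call_tool, calls


fake_pods = {"pod_xyz_980": {"id": "pod_xyz_980", "status": "RUNNING"}}
caller, calls = make_fake_caller(fake_pods)
stopped = stop_if_running(caller, "pod_xyz_980")
print(stopped, calls)
```

Checking the status first avoids issuing a stop request for a pod that is already stopped.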

### Deploying a new service version
A DevOps engineer needs to test a containerized inference app. They ask the agent to list the available templates, which triggers `list_templates`. Once they pick the right one, the agent runs `create_pod` to provision it.
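
The template-then-provision flow can be sketched in a few lines. Everything about the data shape here is an assumption: the `name` and `image` fields on a template, the `create_pod` argument keys, and the sample values are hypothetical, chosen only to show how a `list_templates` result could feed a `create_pod` call.

```python
from typing import Any, Dict, List, Optional


def pick_template(templates: List[Dict[str, Any]], keyword: str) -> Optional[Dict[str, Any]]:
    """Return the first template whose name contains keyword (case-insensitive)."""
    needle = keyword.lower()
    for template in templates:
        if needle in template.get("name", "").lower():
            return template
    return None


def create_pod_arguments(template: Dict[str, Any], pod_name: str, gpu_type: str) -> Dict[str, Any]:
    """Assemble hypothetical create_pod arguments from a chosen template."""
    return {"name": pod_name, "gpu_type": gpu_type, "image": template["image"]}


# Sample data standing in for a list_templates result (assumed shape).
sample_templates = [
    {"name": "pytorch-base", "image": "example/pytorch:latest"},
    {"name": "vLLM-inference", "image": "example/vllm:latest"},
]

chosen = pick_template(sample_templates, "vllm")
if chosen is not None:
    arguments = create_pod_arguments(chosen, "inference-test", "EXAMPLE_GPU_TYPE")
    print(arguments)
```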

### Checking hardware capacity
A team lead needs to know which GPU hardware is available. They ask the agent, which calls `list_gpu_types`, so they can plan their next big deployment.

### Reviewing current deployed services
A developer wants to see every live API connection point for their app. They instruct the agent to check all endpoints, using `list_endpoints`, ensuring nothing critical has been forgotten or misconfigured.

## Benefits

- Control costs instantly: You can use the `stop_pod` tool to halt running instances immediately, preventing unexpected hourly charges.
- Provision hardware on demand: Instead of browsing menus, you simply ask your agent to create a new pod using `create_pod`, specifying exactly what GPU type and image you need.
- Audit infrastructure easily: Use `list_endpoints` to review every single serverless endpoint connected to your application without opening any management consoles.
- Understand options quickly: If you’re unsure what hardware to use, the agent can run `list_gpu_types` to show all available GPU architectures for deployment.
- Manage templates efficiently: Need a standard setup? Use `list_templates` to see your saved configurations and reference them when provisioning new resources.

## How It Works

Setup takes three steps, after which you have chat-based control over your RunPod resources.

1. First, enable the RunPod orchestration integration inside your core interface.
2. Next, sign into your RunPod cloud console and generate a new API Key with Read/Write permissions; insert this key into the secure connection module below.
3. Finally, ask your agent to perform an action, like 'List all active GPU pods and point out any that are sitting idle without active usage.'
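
The example request in step 3 amounts to filtering the pod list for running pods with little or no GPU activity. The sketch below shows one way that filter could look. Whether `list_pods` reports utilization at all is an assumption: the `status` and `gpu_util_percent` fields, the `RUNNING` value, and the 5% threshold are all hypothetical.

```python
from typing import Any, Dict, List


def find_idle_pods(pods: List[Dict[str, Any]], max_gpu_util: float = 5.0) -> List[str]:
    """Return IDs of running pods whose GPU utilization is at or below the threshold.

    Pods with no utilization reading are skipped rather than treated as idle.
    """
    idle = []
    for pod in pods:
        util = pod.get("gpu_util_percent")  # assumed field name
        if pod.get("status") == "RUNNING" and util is not None and util <= max_gpu_util:
            idle.append(pod["id"])
    return idle


# Sample data standing in for a list_pods result (assumed shape).
sample_pods = [
    {"id": "pod-a", "status": "RUNNING", "gpu_util_percent": 87.0},
    {"id": "pod-b", "status": "RUNNING", "gpu_util_percent": 0.0},
    {"id": "pod-c", "status": "STOPPED", "gpu_util_percent": 0.0},
    {"id": "pod-d", "status": "RUNNING"},
]

idle_pods = find_idle_pods(sample_pods)
print(idle_pods)  # ['pod-b']
```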

## Frequently Asked Questions

**How do I use the RunPod MCP to provision hardware?**
Ask your agent to create a pod, and it calls `create_pod`. You'll need to specify the name you want for the pod, the GPU type, and the Docker image. The agent handles building the instance for you.

**Can I use RunPod MCP to stop a running pod?**
Yes. If you need to halt an expensive running pod immediately, just ask your agent to run `stop_pod` with the specific ID. This is key for controlling costs.

**Does RunPod MCP list all my templates?**
Absolutely. Use the `list_templates` tool to see every saved pod template you have configured, which helps you reuse successful setups quickly.

**What if I need a different type of GPU? How do I find it using RunPod MCP?**
You can check all available options by running `list_gpu_types`. This gives you the definitive list of hardware types that your agent can use for provisioning.

**Is RunPod MCP only good for LLMs?**
No. While it is great for LLMs, it covers any GPU workload you run on RunPod, from general ML training to containerized inference applications, whose serverless endpoints you can review with `list_endpoints`.