# Vast.ai GPU MCP

> Vast.ai GPU Rental Cloud API connects your AI client to the world's largest marketplace for high-performance GPUs. Use this server to search hardware like RTX 4090 or A100, spin up Docker containers instantly, and manage the entire compute lifecycle—from deployment to termination—without leaving your chat window.

## Overview
- **Category:** cloud-infrastructure
- **Price:** Free
- **Tags:** gpu-rental, deep-learning, docker-deployment, cloud-computing, ai-infrastructure

## Description

**Vast.ai GPU Rental API - Manage Cloud Compute**

Forget navigating a clunky web console just to spin up a few GPUs for your deep learning model. This server connects your AI client straight into Vast.ai's massive marketplace. You can search for hardware like an RTX 4090 or an A100, deploy containers in seconds, and manage the whole compute lifecycle, from the moment billing starts to the moment you shut the instance down, all without leaving your chat window.

### What This Server Does

**`search_offers`** lets you query the marketplace. You give it hardware criteria as a JSON query, and it returns the matching GPUs along with their current pricing. You don't have to guess what's out there; you tell your agent what specs you need, and it checks the offers for you.

Need to know if that A100 you want is actually available today? Run `search_offers` with your criteria. It tells you exactly which models are listed right now and how much they're costing per hour. This keeps you from wasting time looking at dead ends or deals that went stale five minutes ago.
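As a rough illustration of the JSON criteria `search_offers` accepts, the sketch below builds the single-field filter shown in this page's FAQ. The helper name `build_gpu_query` is hypothetical; only the `{"gpu_name": {"eq": ...}}` shape comes from this page.

```python
import json


def build_gpu_query(gpu_name: str) -> dict:
    """Return a search filter that matches one exact GPU model name."""
    name = gpu_name.strip()
    if not name:
        raise ValueError("gpu_name must not be empty")
    # Shape taken from this page's FAQ: field name -> {operator: value}.
    return {"gpu_name": {"eq": name}}


query = build_gpu_query("RTX 4090")
print(json.dumps(query))  # {"gpu_name": {"eq": "RTX 4090"}}
```

In practice your agent writes this JSON for you; the sketch only shows what ends up in the tool call.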

When you find the perfect setup, **`rent_instance`** handles the deployment. You tell it the specific offer ID you want to use, what container image your code needs (like PyTorch or TensorFlow), and how much disk space you require for your datasets. It immediately starts renting a full GPU instance on Vast.ai. You don't click buttons; you just send the command, and your compute environment starts spinning up.
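A minimal sketch of assembling the arguments for `rent_instance`, assuming the parameter names `offer_id`, `image`, and `disk` listed in the FAQ on this page. The `RentRequest` class and its validation rules are hypothetical and only illustrate checking the inputs before the tool is called.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RentRequest:
    offer_id: int
    image: str
    disk: Optional[int] = None  # size in GB; optional according to this page

    def to_arguments(self) -> dict:
        """Validate the fields and return the tool-call arguments."""
        if self.offer_id <= 0:
            raise ValueError("offer_id must be a positive integer")
        if not self.image.strip():
            raise ValueError("image must be a non-empty registry path")
        args = {"offer_id": self.offer_id, "image": self.image}
        if self.disk is not None:
            if self.disk <= 0:
                raise ValueError("disk must be a positive number of GB")
            args["disk"] = self.disk
        return args


args = RentRequest(728391, "pytorch/pytorch", 50).to_arguments()
print(args)
```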

Once it's live, **`list_instances`** gives you the rundown. You can list every GPU instance you currently have rented, whether it's running right now or paused. This tool shows each instance's live status, its hourly cost, and the network details you need, like IP addresses. It's your central dashboard for keeping tabs on your whole fleet.
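To make the "dashboard" idea concrete, here is a small sketch that summarizes a `list_instances` result. The record fields (`id`, `status`, `ip`, `cost_per_hour`) and the second instance are assumptions for illustration; the server's actual response shape may differ.

```python
from collections import Counter

# Hypothetical records; field names are assumptions, not the server's schema.
instances = [
    {"id": 1029384, "status": "running", "ip": "123.45.67.89", "cost_per_hour": 0.42},
    {"id": 1029400, "status": "paused", "ip": None, "cost_per_hour": 0.35},
]


def summarize(records):
    """Count instances per status and total the hourly rate of running ones."""
    by_status = Counter(r["status"] for r in records)
    running_rate = sum(
        r["cost_per_hour"] for r in records if r["status"] == "running"
    )
    return dict(by_status), round(running_rate, 2)


counts, hourly = summarize(instances)
print(counts, f"${hourly:.2f}/hr for running instances")
```

The total only covers instances reported as running; whether a paused instance still incurs charges is not stated on this page, so the sketch makes no claim about it.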

When your job is done, **`delete_instance`** steps in. You run this command, specifying the exact GPU instance you want to kill, and it immediately terminates and deletes that resource. This isn't just a suggestion; it actually stops billing for that machine. It keeps your environment clean and makes sure you don't get charged by accident after you walk away from your laptop.

You use `search_offers` to find what you need, then you use `rent_instance` to bring it online with its specific container image and disk size requirements. Once everything is running, you check the status of all your machines using `list_instances`. When the work’s done, you hit it with `delete_instance` to shut down the whole thing immediately.

This system means you never have to switch contexts or log into a separate portal. You keep doing your research and coding right here, letting your agent manage the entire hardware lifecycle from finding an offer to powering down the machine. It’s fast. It's clean. And it saves you cash.

## Tools

### delete_instance
Stops and deletes a specific GPU instance you have rented on Vast.ai.

### list_instances
Retrieves a list of all your currently active or paused GPU instances on Vast.ai.

### rent_instance
Initiates the process of renting a new, live GPU instance on Vast.ai using defined parameters.

### search_offers
Queries the marketplace to find available GPUs and their current pricing by providing specific hardware criteria in JSON format.

## Prompt Examples

**Prompt:** 
```
Search for available RTX 4090 GPU offers on Vast.ai.
```

**Response:** 
```
I've found several RTX 4090 offers. The cheapest is Offer ID 728391 at $0.42/hr with 24GB VRAM. Would you like to see more details or rent this one?
```

**Prompt:** 
```
Rent offer 728391 using the 'pytorch/pytorch' image and 50GB of disk.
```

**Response:** 
```
Success! I have initiated the rental for Offer ID 728391. Your new Instance ID is 1029384. It is currently spinning up with the PyTorch image.
```

**Prompt:** 
```
List all my current active instances on Vast.ai.
```

**Response:** 
```
You have one active instance (ID: 1029384) running an RTX 4090. Status: 'running', IP: 123.45.67.89, Cost: $0.42/hr.
```

## Capabilities

### Discover GPU Offers
Search the Vast.ai marketplace using specific hardware names or criteria to find available pricing and offers.

### Provision Compute Instances
Rent a full GPU instance by selecting an offer ID, specifying a container image, and setting disk size.

### Manage Active Fleet Status
List all current instances you've rented to check their live status, cost per hour, and network details.

### Decommission Resources
Terminate and delete running GPU instances immediately when your work is finished.

## Use Cases

### Training a New LLM
The ML Engineer needs an A100 GPU for training. Instead of manually checking the Vast.ai dashboard, they tell their agent: 'Search for A100s and rent one using the latest PyTorch image.' The agent runs `search_offers`, picks the ID of the cheapest offer, and executes `rent_instance`. When done, it calls `delete_instance` to stop billing.

### Comparing GPU Performance
A Data Scientist needs to compare different hardware types (e.g., RTX 4090 vs. A100). They run `search_offers` once for each type, noting the price and VRAM of the offers. Then, they deploy a small container on each using `rent_instance` so they can benchmark them side-by-side before committing to a full deployment.
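The comparison step can be sketched as picking the cheapest offer per GPU type from the combined search results. The offer records below are hypothetical (including the field names `vram_gb` and `price_per_hour`); they stand in for whatever `search_offers` returns.

```python
# Hypothetical results collected from separate `search_offers` calls.
offers = [
    {"id": 728391, "gpu_name": "RTX 4090", "vram_gb": 24, "price_per_hour": 0.42},
    {"id": 728455, "gpu_name": "RTX 4090", "vram_gb": 24, "price_per_hour": 0.47},
    {"id": 731002, "gpu_name": "A100", "vram_gb": 80, "price_per_hour": 1.15},
    {"id": 731040, "gpu_name": "A100", "vram_gb": 40, "price_per_hour": 0.98},
]


def cheapest_per_gpu(results):
    """Return {gpu_name: offer}, keeping the lowest hourly price per GPU type."""
    best = {}
    for offer in results:
        current = best.get(offer["gpu_name"])
        if current is None or offer["price_per_hour"] < current["price_per_hour"]:
            best[offer["gpu_name"]] = offer
    return best


cheapest = cheapest_per_gpu(offers)
for name, offer in cheapest.items():
    print(f"{name}: offer {offer['id']} at ${offer['price_per_hour']:.2f}/hr")
```

Price alone ignores VRAM differences (the cheaper A100 here has half the memory), so a real comparison would likely weigh both.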

### Monitoring Stale Resources
The DevOps team finishes a weekend test run but forgets which instances are still active. They simply ask their agent: 'What GPUs am I currently running?' The agent runs `list_instances`, showing them the IDs and statuses, allowing them to immediately call `delete_instance` on anything forgotten.
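One way to picture this cleanup is to turn the `list_instances` output into a list of `delete_instance` calls for everything not explicitly kept. The record shape and the second instance ID are assumptions; `instance_id` is the argument name used in the FAQ on this page.

```python
def deletion_candidates(instances, keep_ids):
    """Build delete_instance arguments for every instance not in keep_ids."""
    keep = set(keep_ids)
    return [
        {"instance_id": inst["id"]}
        for inst in instances
        if inst["id"] not in keep
    ]


# Hypothetical fleet left over from a weekend test run.
fleet = [
    {"id": 1029384, "status": "running"},
    {"id": 1029400, "status": "running"},
]

to_delete = deletion_candidates(fleet, keep_ids=[1029384])
print(to_delete)
```

Using a keep list rather than a delete list means a forgotten instance is flagged by default instead of silently surviving.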

### Debugging a Deployment Failure
A service fails because its GPU environment is unstable. Instead of logging into multiple portals, the engineer asks their agent to list all rented instances via `list_instances`. This reveals the instance ID and status (e.g., 'failed'), allowing them to target that specific resource with `delete_instance`.

## Benefits

- Speed: Instead of navigating multiple web consoles to find the best price-to-performance hardware, `search_offers` finds it instantly. You get immediate GPU discovery right where you are working.
- Efficiency: Deploying a container is simple. Using `rent_instance`, you tell your agent exactly what image and disk size you need, spinning up a functional environment in seconds.
- Visibility: Never wonder if your job is still running or what it costs. `list_instances` gives you real-time status, IP addresses, and cost tracking for all active deployments.
- Cost Control: The biggest win is cleanup. When the task finishes, run `delete_instance`. This ensures you stop paying immediately, avoiding expensive orphaned resources.
- Focus: Your AI client handles the API calls—the complex JSON queries, ID management, and lifecycle commands—so you just focus on your code.

## How It Works

The bottom line is: your AI client handles all API calls—searching for hardware, deploying containers, and cleaning up after itself—all through natural conversation.

1. Subscribe to the server and provide your Vast.ai API Key.
2. Ask your AI client to execute a search using `search_offers` (e.g., 'Find RTX 4090 offers').
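3. Once you select an Offer ID, tell the agent to run `rent_instance` with the required Docker image and disk size.
4. Check on your machines at any time with `list_instances`.
5. When the work is done, run `delete_instance` to terminate the instance and stop billing.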
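The full lifecycle this page describes (search, rent, list, delete) can be sketched end to end with an in-memory stand-in. `FakeVastTools` is entirely hypothetical: it never contacts Vast.ai, and its method signatures and record fields are assumptions that only mirror the four tool names so the order of calls is visible.

```python
import itertools


class FakeVastTools:
    """In-memory stand-in for the four tools; it never contacts Vast.ai."""

    def __init__(self, offers):
        self._offers = {o["id"]: o for o in offers}
        self._instances = {}
        self._ids = itertools.count(1029384)

    def search_offers(self, query):
        wanted = query["gpu_name"]["eq"]
        return [o for o in self._offers.values() if o["gpu_name"] == wanted]

    def rent_instance(self, offer_id, image, disk):
        offer = self._offers[offer_id]  # KeyError if the offer does not exist
        instance_id = next(self._ids)
        self._instances[instance_id] = {
            "id": instance_id,
            "gpu_name": offer["gpu_name"],
            "image": image,
            "disk": disk,
            "status": "running",
            "cost_per_hour": offer["price_per_hour"],
        }
        return instance_id

    def list_instances(self):
        return list(self._instances.values())

    def delete_instance(self, instance_id):
        del self._instances[instance_id]


tools = FakeVastTools([{"id": 728391, "gpu_name": "RTX 4090", "price_per_hour": 0.42}])

offers = tools.search_offers({"gpu_name": {"eq": "RTX 4090"}})
cheapest = min(offers, key=lambda o: o["price_per_hour"])
instance_id = tools.rent_instance(cheapest["id"], "pytorch/pytorch", 50)
try:
    running = [i["id"] for i in tools.list_instances()]
finally:
    # Always clean up, even if the work in between fails, so billing stops.
    tools.delete_instance(instance_id)
remaining = tools.list_instances()
print(running, remaining)
```

Wrapping the work in `try`/`finally` is a design choice worth copying in any real script: the delete runs even when the job in the middle raises.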

## Frequently Asked Questions

**How do I find the best GPU price using search_offers?**
The agent uses `search_offers` with a JSON query like `{"gpu_name": {"eq": "RTX 4090"}}`. The results provide multiple offers, letting you compare prices and VRAM to pick the optimal one.

**What information does list_instances give me?**
`list_instances` gives you a summary of all your rented GPUs. You get the Instance ID, current status (running/paused), IP address, and hourly cost estimate for each one.

**Can I rent an instance without knowing the specific offer ID?**
No. You must first use `search_offers` to find a valid Offer ID from the marketplace before you can tell the agent to run `rent_instance`. The ID ties your deployment to a specific offer that is currently listed.

**Is calling delete_instance permanent?**
Yes. Calling `delete_instance` terminates and deletes the instance, which frees up the resource immediately and stops billing for that specific instance ID.

**How does the `search_offers` tool validate my API Key?**
The server uses your provided Vast.ai API Key for all operations. If the key is invalid or lacks permission to view offers, the tool immediately returns an authentication error code. Always verify your credentials first.

**When using `rent_instance`, what are the technical requirements for the Docker image?**
You must provide a valid container registry path for the Docker image (e.g., `pytorch/pytorch`). The system uses this path to pull and deploy your compute environment.

**Can I refine my GPU search using `search_offers` beyond just the hardware name?**
Yes, you can structure the JSON query to include multiple constraints, such as minimum VRAM or maximum hourly cost. You need to provide a detailed JSON object for advanced filtering.
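A sketch of what a multi-constraint query might look like, assuming the same `{field: {operator: value}}` shape as the `gpu_name` example. The field names `gpu_ram` and `dph_total` and the operators `gte` and `lte` are assumptions here; check them against the query schema the server actually accepts before relying on them.

```python
import json

# Only "eq" appears on this page; "gte" and "lte" are assumed operators.
ALLOWED_OPS = {"eq", "gte", "lte"}


def build_query(**constraints):
    """Turn field=(op, value) pairs into a {field: {op: value}} filter."""
    query = {}
    for field, (op, value) in constraints.items():
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        query[field] = {op: value}
    return query


query = build_query(
    gpu_name=("eq", "RTX 4090"),
    gpu_ram=("gte", 24),      # assumed field name and unit for minimum VRAM
    dph_total=("lte", 0.50),  # assumed field name for maximum hourly cost
)
print(json.dumps(query, indent=2))
```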

**If I see an old instance via `list_instances`, what is the fastest way to stop charges?**
Calling `delete_instance` terminates the compute resources instantly. This action stops all usage and prevents further billing from Vast.ai immediately.

**How can I find a specific GPU model like an RTX 4090?**
Use the `search_offers` tool with a query like `{"gpu_name": {"eq": "RTX 4090"}}`. The agent will return a list of available offers matching that hardware.

**What information do I need to rent a new GPU instance?**
You need an `offer_id` (from search results) and a Docker `image` name (e.g., 'pytorch/pytorch'). You can also optionally specify the `disk` size in GB using the `rent_instance` tool.

**How do I stop an instance to avoid further charges?**
Simply use the `delete_instance` tool with the specific `instance_id`. This will terminate the instance and release the GPU back to the marketplace.