# Baseten MCP for AI Agents

> Baseten MCP connects your AI agents directly to your machine learning infrastructure. Your agent can list managed models, inspect their deployments, run real-time predictions against deployed models, and audit which workspace secrets are configured.

## Overview
- **Category:** ai-frontier
- **Price:** Free
- **Tags:** model-deployment, inference-api, serverless-ml, model-scaling, mlops

## Description

This MCP lets you treat your AI client like a full Machine Learning Operator. Instead of jumping between dashboards or writing one-off scripts, your agent handles the process conversationally. You can ask it to list every model currently managed by Baseten, check the status of specific deployments, and run predictions with tensor or JSON inputs. Your ML workflow stays in one place, whether you're checking secrets or running inference on a new payload.

It gives you MLOps control right inside your chat window. When combined with Vinkius, you get access to this functionality alongside thousands of other services, letting your agent act as the single operational hub for your entire stack.

## Tools

### list_models
Retrieves a list of all machine learning models managed within the Baseten account.

### get_model
Fetches detailed configuration data for one specific Baseten model ID.

### predict
Runs a serverless inference prediction by passing an input payload (tensor data or a JSON dictionary) to a deployed model.

### list_deployments
Lists all active deployment instances associated with a specific machine learning model.

### get_deployment
Retrieves detailed operational information for a single, running deployment instance.

### list_secrets
Displays all environment secrets configured in the workspace without revealing their actual values.
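
For orientation, MCP clients invoke tools like these through JSON-RPC 2.0 `tools/call` requests. The sketch below builds two such requests in Python. The argument names (`model_id`, `payload`), the model ID, and the image URL are assumptions for illustration only; the real input schema is whatever the server reports when the client lists its tools.

```python
import itertools
import json

# Monotonic request ids, as JSON-RPC requires each request to carry an id.
_ids = itertools.count(1)


def build_tool_call(name, arguments=None):
    """Return a JSON-RPC 2.0 request dict for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Assumed here: list_models needs no arguments.
list_req = build_tool_call("list_models")

# Hypothetical argument names; this listing does not document the schema.
predict_req = build_tool_call(
    "predict",
    {"model_id": "ABC", "payload": {"image_url": "https://example.com/part.png"}},
)

print(json.dumps(predict_req, indent=2))
```

In practice your AI client constructs these requests for you; the sketch only shows what travels over the wire.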

## Prompt Examples

**Prompt:** 
```
I need to know what models we have deployed right now.
```

**Response:** 
```
Here's the current inventory of managed models:

*   **Llama-3-70B-Instruct:** Fine-tuned for conversational chat. (ID: XYZ)
*   **Defect-Detector-V2:** Computer vision model for quality control. (ID: ABC)

You can also check the deployment status for either of these models to see their current replica counts.
```

**Prompt:** 
```
Can you run a quick test prediction on the Defect Detector model?
```

**Response:** 
````
Sending payload to Defect-Detector-V2...

The inference pipeline returned successfully. Here is the result mapping:
```json
{
  "score": 0.98,
  "label": "PASS"
}
```
Prediction complete. This was executed against the latest stable deployment version.
````

## Capabilities

### List all deployed models
See a comprehensive list of every ML model currently managed within your Baseten account.

### Retrieve specific model details
Get full configuration information for any individual model ID you specify.

### Run serverless predictions
Execute real-time, low-latency inference by sending tensor data or JSON directly to a deployed model instance.

### Audit active deployment states
List and inspect the current replica counts and autoscaling configurations for specific models.

### Check workspace secrets
Enumerate the names of all environment variables and secrets configured in your workspace. Their values are never revealed.

## Use Cases

### Verifying model readiness before launch
An ML engineer needs to confirm that a new version of Defect-Detector-V2 serves predictions correctly. They use their agent to check all active deployments via `list_deployments` and then run a small test payload using `predict`, getting immediate confirmation that inference succeeds.
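
The readiness check described above can be sketched as a small Python function. This is a minimal illustration, not the MCP's actual client code: `call_tool` is a hypothetical callable standing in for your MCP client, the argument names (`model_id`, `payload`) are assumed, and the canned responses in the stub are invented.

```python
def check_readiness(call_tool, model_id, test_payload):
    """Return (ready, detail) after listing deployments and running one test prediction."""
    deployments = call_tool("list_deployments", {"model_id": model_id})
    if not deployments:
        return False, "no deployments found"

    result = call_tool("predict", {"model_id": model_id, "payload": test_payload})
    if "label" not in result:
        return False, "prediction returned no label"

    return True, f"{len(deployments)} deployment(s), label={result['label']}"


# Stub standing in for a real MCP client; the response shapes are assumptions.
def fake_call_tool(name, arguments):
    canned = {
        "list_deployments": [{"id": "dep-1", "status": "ACTIVE"}],
        "predict": {"score": 0.98, "label": "PASS"},
    }
    return canned[name]


print(check_readiness(fake_call_tool, "ABC", {"image_url": "https://example.com/part.png"}))
```

A single passing prediction shows the deployment responds, not that it is stable under load, so treat this as a smoke test.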

### Auditing infrastructure compliance
A DevOps engineer needs to confirm that required credentials are provisioned. They ask their agent to call `list_secrets` and verify that the expected API keys are present in the workspace, without any plaintext values being exposed.
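
One way to turn that audit into a repeatable check is to compare the secret names the agent reports against a required list. The sketch below assumes the names have already been extracted from the `list_secrets` response; the secret names themselves are invented examples.

```python
def missing_secrets(listed_names, required_names):
    """Return the required secret names that are absent, sorted for stable output."""
    return sorted(set(required_names) - set(listed_names))


# Invented example data: names only, since list_secrets does not reveal values.
listed = ["HF_ACCESS_TOKEN", "S3_BUCKET_KEY"]
required = ["HF_ACCESS_TOKEN", "OPENAI_API_KEY"]

print(missing_secrets(listed, required))
```

An empty list means every required secret is present. The check says nothing about whether the stored values are valid.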

### Debugging unexpected prediction failures
An AI researcher notices performance dips. Instead of guessing, they use the agent to pull deployment details via `get_deployment` to identify whether scaling parameters or version mismatches are causing the issue.

### Onboarding a new team member quickly
A manager needs an overview of all assets. They ask their agent to list all managed models using `list_models` and get basic details on each one via `get_model`, providing a complete inventory summary.

## Benefits

- Run live inference tests immediately. Use the `predict` tool to test payloads against deployed models without ever leaving your agent interface.
- Keep track of infrastructure status. The MCP lets you list active deployments and check replica states, so you always know if your model is running correctly.
- Manage complex resources in one place. You can view all managed models using `list_models` and audit their full configurations without switching tabs.
- Maintain security visibility. Use the `list_secrets` tool to confirm that critical environment variables are provisioned securely, without exposing plaintext values.
- Simplify troubleshooting. Instead of digging through logs, you get direct access to deployment details via `get_deployment`, making root cause analysis faster.

## How It Works

In short, your AI client becomes an integrated toolset for Baseten ML workflows.

1. Subscribe to this MCP on Vinkius and provide your Baseten API key.
2. Give your AI agent a command, like 'What models do we have?'
3. Your agent runs the necessary tool calls and responds with structured data, allowing you to take immediate action.
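
The round trip in step 3 can be illustrated with an in-memory stand-in for the server. This is a sketch under stated assumptions: the handler returns canned data taken from the prompt example above rather than calling Baseten, and the field names inside each model record (`id`, `name`) are hypothetical. The request and response envelopes follow the general MCP `tools/call` shape.

```python
import json

# Stub tool handlers. A real server would call the Baseten API with your API key.
TOOLS = {
    "list_models": lambda args: [
        {"id": "XYZ", "name": "Llama-3-70B-Instruct"},
        {"id": "ABC", "name": "Defect-Detector-V2"},
    ],
}


def handle_request(request):
    """Dispatch a JSON-RPC `tools/call` request and wrap the result as MCP text content."""
    params = request["params"]
    handler = TOOLS.get(params["name"])
    if handler is None:
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "error": {"code": -32602, "message": f"Unknown tool: {params['name']}"},
        }
    data = handler(params.get("arguments", {}))
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {
            "content": [{"type": "text", "text": json.dumps(data)}],
            "isError": False,
        },
    }


request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "list_models", "arguments": {}},
}
response = handle_request(request)

# The agent parses the structured text content before summarizing it for you.
models = json.loads(response["result"]["content"][0]["text"])
print([m["name"] for m in models])
```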

## Frequently Asked Questions

**How does Baseten MCP help me manage multiple AI models?**
It centralizes your entire ML model inventory. Instead of logging into separate dashboards for each service, you can ask the agent to list all deployed models and check their statuses from one place.

**Can I use Baseten MCP to test my model predictions?**
Yes, that’s a core function. You can run immediate, real-time inference tests by providing specific payloads directly to the deployed models without needing local code setup.

**What if I need to check sensitive API keys or secrets? Does Baseten MCP handle that?**
The MCP lets you list all active workspace secrets. It confirms which credentials are provisioned and accessible for your models without ever showing the actual plaintext values, keeping everything secure.

**Does Baseten MCP help DevOps teams audit ML infrastructure?**
Absolutely. You can check detailed deployment information, including replica counts and autoscaling configurations, allowing you to verify that your production environment is running exactly as designed.