Modal MCP for AI. Control your serverless AI infrastructure via chat.

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

Works with every AI agent you already use

…and any MCP-compatible client

Connect to your AI in seconds.

Modal MCP Server connects your AI agent directly to a high-performance serverless compute backend. It lets you audit active apps, check GPU deployments, and track persistent storage volumes using natural conversation.

Need to manage complex ML infrastructure without touching the CLI? This is it.

What your AI can do

List active/historical apps

Checks the status and context of all active and historical Modal apps.

Stop a live app execution

Terminates an actively running Modal App using its ID, preventing further billing charges.

List managed deployments

Retrieves a list of all promoted, long-running service deployments and their endpoint details.

View persistent volumes

Lists the disk network block volumes attached to your Modal account for storage visibility.

Audit secrets

Retrieves a list of secret dictionary references and associated environment variable mappings.

Get specific resource state

Pulls detailed JSON metadata for any single App or Deployment ID you reference.

Modal (Serverless AI Infrastructure) MCP Server: 7 Tools

These tools let you audit app status, manage deployments, and control resources on your Modal platform using structured function calls.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Modal (Serverless AI Infrastructure) on Vinkius

List Apps

Lists all active and historical Modal app contexts currently running or stopped.

Get App

Pulls specific details for one Modal App ID you provide.

Stop App

Forces the immediate termination of an active Modal App execution using its ID.

List Secrets

Lists every configured secret dictionary reference in your account for auditing...

List Volumes

Shows a list of all persistent disk network block volumes attached to your project.

List Deployments

Provides a list of all actively managed, promoted service deployments on the platform.

Get Deployment

Retrieves detailed status information for a single tracked deployment.

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the entire backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so there is no messy routing to configure.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The Modal integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "modal-serverless-ai-infrastructure": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Modal tools with full Vinkius guardrails applied.

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"modal-serverless-ai-infrastructure": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.
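If you'd rather generate that fragment than hand-edit it, here is a minimal sketch. The helper name build_server_entry is hypothetical and "example-token" is a dummy value; only the endpoint pattern and the server name come from the configuration above.

```python
import json

# Endpoint pattern taken from the configuration above; the token is the only variable part.
ENDPOINT_TEMPLATE = "https://edge.vinkius.com/{token}/mcp"


def build_server_entry(token: str, name: str = "modal-serverless-ai-infrastructure") -> dict:
    """Return a one-server config fragment pointing at the Vinkius endpoint."""
    # Reject an empty value or the unedited "[YOUR_TOKEN_HERE]" placeholder.
    if not token or "[" in token or "]" in token:
        raise ValueError("replace the placeholder with your real token")
    return {name: {"url": ENDPOINT_TEMPLATE.format(token=token)}}


# "example-token" is a dummy value for illustration only.
print(json.dumps(build_server_entry("example-token"), indent=2))
```

The placeholder check is there because pasting the config without swapping in a token is the easiest mistake to make.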

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

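Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint. The transport value below assumes the endpoint is served over Streamable HTTP:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius ("modal" is just a local label for this server)
client = MultiServerMCPClient({
    "modal": {"url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp", "transport": "streamable_http"},
})
# get_tools() is a coroutine, so run it in an event loop
tools = asyncio.run(client.get_tools())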

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

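from crewai import Agent
from crewai_tools import MCPServerAdapter

# Connect securely to Vinkius (assumes: pip install "crewai-tools[mcp]")
server_params = {"url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp", "transport": "streamable-http"}

# Assign the discovered tools to an agent; run your crew inside this block
# so the MCP connection stays open. The goal and backstory are placeholders.
with MCPServerAdapter(server_params) as vinkius_tools:
    researcher = Agent(
        role="Data Researcher",
        goal="Audit Modal apps, deployments, volumes, and secrets",
        backstory="An infrastructure analyst who inspects Modal resources.",
        tools=vinkius_tools,
    )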

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on every call
Real time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Modal (Serverless AI Infrastructure), then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Modal. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on every call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 7 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

Checking infrastructure status shouldn't require five different dashboards and three tabs of copy-pasting.

Today, checking if a job finished or what resources were used means logging into the platform, finding the App ID in one tab, opening the billing dashboard in another, and manually cross-referencing volumes on a third. It's slow, it's error-prone, and you always feel like you missed something.

With this MCP server, your agent handles all that complexity. You just ask: 'What's the status of my latest model run?' The agent runs `list_apps`, pulls details using `get_app`, and gives you a single, actionable answer right in the chat window.
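The list_apps then get_app sequence described above can be sketched in a few lines. This is an illustration under stated assumptions, not the server's documented contract: call_tool is a hypothetical hook standing in for whatever MCP client you use, and the field names app_id, created_at, and state are guesses at the response shape.

```python
from typing import Any, Callable, Optional

# Hypothetical transport hook: call_tool(tool_name, arguments) -> parsed JSON result.
ToolCaller = Callable[[str, dict], Any]


def latest_app_details(call_tool: ToolCaller) -> Optional[dict]:
    """Run list_apps, pick the newest app, then fetch its details with get_app."""
    apps = call_tool("list_apps", {})
    if not apps:
        return None
    # "created_at" and "app_id" are assumed field names.
    newest = max(apps, key=lambda app: app["created_at"])
    return call_tool("get_app", {"app_id": newest["app_id"]})


def fake_call_tool(name: str, arguments: dict) -> Any:
    """Stand-in for a real MCP client, returning canned data."""
    if name == "list_apps":
        return [
            {"app_id": "ap-111", "created_at": "2024-05-01T09:00:00+00:00"},
            {"app_id": "ap-222", "created_at": "2024-05-02T09:00:00+00:00"},
        ]
    if name == "get_app":
        return {"app_id": arguments["app_id"], "state": "running"}
    raise KeyError(name)


print(latest_app_details(fake_call_tool))
```

Passing the tool caller in as a parameter keeps the logic testable without a live connection; swap fake_call_tool for your real client when you wire it up.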

The Modal MCP Server makes resource control simple.

Before, if an experimental job ran wild, you had to find its specific App ID and then navigate a separate console page just to kill it. This was often confusing, leading to over-billing or missed endpoints.

Now, you tell your agent: 'Kill the app with ID ap-123.' It runs `stop_app` instantly. The job stops, the billing cycle ends, and you get confirmation—no clicking required.

Support: 24/7, support@vinkius.com

What your AI can actually do with this

Modal MCP Server - Audit GPU & Compute Infrastructure

Look, you want to manage complex ML infrastructure without opening a terminal and running twenty lines of ugly CLI commands. This server connects your AI client straight to Modal's backend. It lets your agent handle the heavy lifting: checking active apps, auditing GPU deployments, and tracking persistent storage volumes, all just by talking to it.

You'll get full control over high-performance, serverless compute resources through natural conversation. Here’s what you can do with these tools:

Checking Status: You can use list_apps to pull a rundown of every Modal application context—whether it’s running right now or if it just finished up. Need to know where your services are? Use list_deployments to get a list of all promoted, long-running service deployments on the platform.
Storage and Security: You gotta see what's attached to your account, so you can use list_volumes to show every persistent disk network block volume connected to your project. For security checks, run list_secrets; it gives you a list of all configured secret dictionary references in your account.
Deep Dives: If you need the nitty-gritty on one specific thing, you've got two ways to drill down. Use get_app and give it an App ID; it pulls precise JSON metadata about that single running application instance. Same deal with deployments: use get_deployment and reference a Deployment ID to get its detailed status information.
Controlling the Compute: Sometimes you gotta hit the kill switch. If an app’s running too long and racking up bills, use stop_app. You feed it the App ID, and it forces the immediate termination of that active Modal App execution, stopping billing charges right away.
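To keep the seven tools straight, here is a rough sketch of a lookup table that records which ones need an ID. The tool names come from the list above; the argument names app_id and deployment_id, and the check_call helper itself, are assumptions made for illustration.

```python
# Required arguments per tool. The tool names come from the list above;
# the argument names ("app_id", "deployment_id") are assumptions.
REQUIRED_ARGS = {
    "list_apps": (),
    "list_deployments": (),
    "list_volumes": (),
    "list_secrets": (),
    "get_app": ("app_id",),
    "get_deployment": ("deployment_id",),
    "stop_app": ("app_id",),
}


def check_call(tool: str, arguments: dict) -> None:
    """Raise if the tool is unknown or a required argument is missing."""
    if tool not in REQUIRED_ARGS:
        raise KeyError(f"unknown tool: {tool}")
    missing = [arg for arg in REQUIRED_ARGS[tool] if arg not in arguments]
    if missing:
        raise ValueError(f"{tool} is missing: {', '.join(missing)}")


check_call("list_apps", {})                   # passes: no arguments needed
check_call("stop_app", {"app_id": "ap-123"})  # passes: ID supplied
```

The four list tools take nothing, while the two get tools and stop_app each need an ID that you would normally pull from a list call first.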

Basically, if you need to audit resources or check a status without typing out a single command, this is your ride. It gives you visibility into what's running, where it's stored, and lets you shut down runaway processes instantly.

Built · Hosted · Managed by Vinkius

Server ID 019d75d6-a79a-70ea-9c23-361273b417a7

Vinkius Inspector

Compliance Grade A+

Score 100/100

Who is this actually for?

This is for ML Engineers and DevOps folks who are sick of spending hours clicking through multiple dashboards or running verbose modal commands just to check if a GPU job failed. If your job involves managing stateful, high-cost cloud compute resources, you need this.

ML Engineer

Checks the status of training jobs using list_apps and verifies endpoint details with get_deployment before merging code.

DevOps SRE

Manages resource cleanup by running stop_app when a test run is finished, or audits credentials using list_secrets.

Data Scientist

Checks persistent data integrity by listing volumes with list_volumes, ensuring datasets are mounted correctly before training begins.

What Changes When You Connect

Stop unexpected bills immediately. Instead of logging into the console to find a rogue job, use stop_app to force-terminate an active App execution with a single command.

Keep track of deployed services without manual lookups. Use list_deployments to get all web endpoints and serving configurations in one shot.

Audit your data security instantly. Running list_secrets lets you verify every stored secret reference, which is crucial before deploying sensitive models.

Manage massive datasets easily. list_volumes shows all persistent disk volumes, letting you know exactly where your training data lives across the cluster.

Get deep state info on demand. Use get_app or get_deployment to pull precise JSON metadata for any resource ID, bypassing vague status messages.

See everything at once. Running list_apps gives a clear snapshot of all running and historical compute contexts.

See it in action

01

The runaway GPU job

A data scientist kicks off an experimental model training run, forgets to monitor it, and gets hit with a massive bill. They ask their agent: 'What apps are running?' The agent uses list_apps to identify the rogue App ID, which the user then passes to stop_app, stopping the billing cycle immediately.
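Spotting the rogue app in a list_apps result can be as simple as filtering by age. The sketch below assumes each entry carries app_id, state, and an ISO 8601 created_at with a UTC offset; none of those field names are confirmed by the server, and long_running_apps is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def long_running_apps(apps: list, max_age: timedelta, now: Optional[datetime] = None) -> list:
    """Return the IDs of running apps that started more than max_age ago."""
    now = now or datetime.now(timezone.utc)
    flagged = []
    for app in apps:
        # Skip anything that has already stopped; it is no longer billing.
        if app.get("state") != "running":
            continue
        started = datetime.fromisoformat(app["created_at"])
        if now - started > max_age:
            flagged.append(app["app_id"])
    return flagged


sample = [
    {"app_id": "ap-old", "state": "running", "created_at": "2024-05-01T00:00:00+00:00"},
    {"app_id": "ap-new", "state": "running", "created_at": "2024-05-02T11:00:00+00:00"},
    {"app_id": "ap-done", "state": "stopped", "created_at": "2024-04-01T00:00:00+00:00"},
]
fixed_now = datetime(2024, 5, 2, 12, 0, tzinfo=timezone.utc)
print(long_running_apps(sample, timedelta(hours=6), now=fixed_now))
```

Each flagged ID is a candidate for stop_app, ideally after a human confirms it really is the runaway job.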

02

Checking pre-launch readiness

A DevOps engineer is deploying a new microservice. Before merging, they use their agent: 'List all deployments and check the credentials.' The agent runs list_deployments to verify endpoints and then list_secrets to confirm necessary keys are available.

03

Tracing dataset location

A new ML engineer needs to know where the historical data lives. They ask: 'Show me all stored datasets.' The agent runs list_volumes, providing a list of named persistent disks, allowing the user to confirm which volume ID holds the source files.

04

Debugging an app failure

An AI engineer finds that a specific application (app-xyz) is failing. They ask: 'What's wrong with this app?' The agent runs get_app and returns the full JSON metadata, letting the user pinpoint if the issue is resource allocation or configuration.

The honest tradeoffs

Treating it like a simple database query

Anti-pattern

The user assumes they can just ask, 'Give me all my app data.' The agent fails because the system needs explicit context about what state they want (running vs. historical).

The Fix

You need to guide your agent's scope. Start by running list_apps to see the available IDs, then use get_app [ID] for specific data points.

Over-relying on a single list call

Anti-pattern

The user runs list_volumes and sees 10 volumes. They assume this means all data is fine, but they don't know which volume holds the active model weights.

The Fix

Always cross-reference storage with deployment details. Use list_deployments first, then check the associated resource metadata via get_deployment [ID] to pinpoint critical volumes.
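As a sketch of that cross-referencing step: the code below maps each volume to the deployments that mention it. Everything about the response shape is assumed (a name field on volumes, a deployment_id field on deployments, and a volumes list inside get_deployment metadata), and call_tool is a hypothetical stand-in for your MCP client.

```python
from typing import Any, Callable

# Hypothetical transport hook: call_tool(tool_name, arguments) -> parsed JSON result.
ToolCaller = Callable[[str, dict], Any]


def volumes_in_use(call_tool: ToolCaller) -> dict:
    """Map each volume name to the deployment IDs whose metadata references it."""
    # Start with every known volume so unreferenced ones show up with an empty list.
    usage = {vol["name"]: [] for vol in call_tool("list_volumes", {})}
    for dep in call_tool("list_deployments", {}):
        dep_id = dep["deployment_id"]
        detail = call_tool("get_deployment", {"deployment_id": dep_id})
        for vol_name in detail.get("volumes", []):
            usage.setdefault(vol_name, []).append(dep_id)
    return usage


def fake_call_tool(name: str, arguments: dict) -> Any:
    """Canned responses for illustration only."""
    if name == "list_volumes":
        return [{"name": "weights"}, {"name": "scratch"}]
    if name == "list_deployments":
        return [{"deployment_id": "dep-1"}]
    if name == "get_deployment":
        return {"deployment_id": arguments["deployment_id"], "volumes": ["weights"]}
    raise KeyError(name)


print(volumes_in_use(fake_call_tool))
```

Volumes that come back with an empty list are the ones no deployment claims, which makes them the first place to look when deciding what is critical and what is leftover.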

Manually tracking state changes

Anti-pattern

The user sees an app is running and waits for it to finish, wasting time and accruing unnecessary costs while monitoring a dashboard.

The Fix

If you don't need the job to run, use stop_app [ID] immediately. This terminates the app and stops further billing for it.

When It Fits, When It Doesn't

Use this server if your infrastructure relies on complex resource management: GPU clusters, persistent storage (volumes), or ephemeral, high-cost compute runs. You need to monitor the lifecycle of applications and deployments over time.

Don't use it if you just need simple data—like checking a static API key or reading user profiles. If your needs are limited to basic CRUD operations on non-compute resources, a standard database connector is better.

If you have an active app that costs money and isn't doing useful work, run stop_app to save cash. If you need to know why it failed, use get_app [ID] for the full metadata dump. This tool is your primary operational check-up.

Questions you might have

How do I find out what apps are running using list_apps?

Run list_apps. This gives you a clear rundown of all active and historical Modal app contexts. It's the first place to check if something is running unexpectedly.

Can I stop an app with stop_app using just the name?

No, stop_app requires the specific App ID. Use list_apps first to find the exact identifier (get_app can then confirm you have the right app) before you terminate it.
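If you only know the name, the lookup can be sketched like this. It assumes list_apps entries expose name and app_id fields, which the server does not confirm, and resolve_app_id is a hypothetical helper rather than one of the seven tools.

```python
def resolve_app_id(apps: list, name: str) -> str:
    """Return the single App ID whose name matches, or raise if none or several do."""
    matches = [app["app_id"] for app in apps if app.get("name") == name]
    if not matches:
        raise LookupError(f"no app named {name!r}")
    if len(matches) > 1:
        # Refuse to guess: stopping the wrong app is worse than asking again.
        raise LookupError(f"{name!r} is ambiguous: {matches}")
    return matches[0]


apps = [
    {"app_id": "ap-111", "name": "trainer"},
    {"app_id": "ap-222", "name": "api"},
]
print(resolve_app_id(apps, "trainer"))
```

The returned ID is what you would then hand to stop_app.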

What is the difference between list_volumes and getting app details?

list_volumes shows all disk volumes attached to your account. get_app provides state-specific information about a single running application, which might reference those volumes.

Do I need list_secrets if I just want to check my app?

Not for a routine status check; list_apps and get_app cover that. Run list_secrets whenever you suspect credential issues. It lists every secret dictionary reference configured in your account.

What happens if I use list_secrets with expired or wrong credentials?

The server immediately fails and returns a specific authentication error code. You must ensure your Modal Token ID and Secret are current before running this tool.

If I run stop_app on an App ID that is already terminated, will it cause an error?

No, the system handles this gracefully. It returns a status message confirming the app is already inactive and takes no action, preventing unnecessary billing cycles.

Does list_volumes show real-time network usage or just static storage details?

It only lists the persistent disk block volumes. You get information on size and mount paths for the connected data stores, not active bandwidth metrics.

What if I try to use get_deployment with an ID that was never promoted?

The tool returns a clear error stating that no tracked deployment exists for that specific ID. This confirms whether the resource is managed by Modal's promotion system.

Can I stop a running Modal app through my agent to save costs?

Yes. Use the stop_app tool with an active App ID. Your agent sends a termination command to Modal, which stops the app's execution and prevents further billing for it.

How do I check which web endpoints are active for my deployments?

The list_deployments and get_deployment tools retrieve data on your promoted deployments. Your agent reports the public URL endpoints and serving metadata associated with your long-running Modal deployments.

Can my agent audit the secrets and persistent volumes in my workspace?

Absolutely. Use the list_secrets and list_volumes tools to monitor your infrastructure assets. Your agent will report the names and references for your stored secrets and network block storage mounts attached to your compute instances.
