How to Use the DeepInfra (Serverless LLM Inference) MCP in Claude Code

Q: What is the command to add DeepInfra (Serverless LLM Inference) to Claude Code?

Run claude mcp add --transport http deepinfra -- . Make sure all flags appear before the server name to ensure proper connection.

Run serverless LLMs and image generation straight from your terminal using Claude Code.

See Vinkius in Action

Works with every AI agent you already use

…and any MCP-compatible client

MCP Servers - Free for Subscribers

Connect DeepInfra (Serverless LLM Inference) MCP to Claude Code

Create your Vinkius account to connect DeepInfra (Serverless LLM Inference) to Claude Code and route execution through our secure gateway. The platform manages server hosting, runtime updates, and security layers. Configuration requires no manual server provisioning.

GDPR Free for Subscribers

Setup DeepInfra (Serverless LLM Inference) with Claude Code

Ask AI about this MCP

ChatGPT

Claude

Perplexity

Analyze CLI Outputs

The `create_chat_completion` tool feeds terminal outputs into external LLMs like DeepSeek-V3. You pipe a massive error log to your CLI agent, and it queries the serverless API to diagnose the stack trace. Working entirely headless means no browser tabs get in your way. The agent reads the JSON response from the model and prints the suggested shell commands directly to your standard output.

Vectorize Files from the Command Line

Calling `create_embedding` turns raw text files into vector data right from your terminal session. You point the agent at a directory of markdown files and tell it to generate embeddings for each one. A DevOps engineer can script a cron job that reads daily system reports, vectorizes them through the API, and pushes the arrays to a search cluster. The entire process runs autonomously in the background.

Access Private Deployments via Claude Code MCP Server

Executing `run_native_inference` gives your Claude Code MCP Server access to specialized endpoints like video generation or private deployments. You pass the model string and payload via standard input. Passing raw data to custom models usually requires writing boilerplate curl requests. Here, the agent handles the headless execution, checks the HTTP status, and parses the resulting data automatically.

Setup guide

Set up DeepInfra (Serverless LLM Inference) MCP in Claude Code

Prerequisites

Claude Code CLI installed (npm install -g @anthropic-ai/claude-code)
Active Vinkius subscription with a valid endpoint token

1

Run the add command

Open your terminal and run the command shown on the right. Replace [YOUR_TOKEN_HERE] with your endpoint token from cloud.vinkius.com. Use --scope user to make it available across all projects.
2

Verify the connection

Start a Claude Code session and type /mcp to list connected servers. You should see deepinfra-serverless-llm-inference-mcp with a green status indicator.
3

Start using tools

Ask Claude Code something like "Check my latest DeepInfra (Serverless LLM Inference) transactions." It will automatically discover and invoke the available DeepInfra (Serverless LLM Inference) tools.

Terminal

claude mcp add --transport http deepinfra-serverless-llm-inference-mcp https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Get your connection token →

Prerequisites

Claude Code CLI installed
Active Vinkius subscription with a valid endpoint token

1

Open the config file

Create or edit .mcp.json in your project root for project-level scope, or ~/.claude.json for user-level scope.
2

Add the DeepInfra (Serverless LLM Inference) MCP

Paste the JSON snippet shown on the right into the mcpServers object. Replace [YOUR_TOKEN_HERE] with your endpoint token from cloud.vinkius.com.
3

Restart Claude Code

Start a new Claude Code session. Type /mcp to confirm the server is connected. The tools will be automatically available in your conversation.

.mcp.json

{
  "mcpServers": {
    "deepinfra-serverless-llm-inference-mcp": {
      "type": "url",
      "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
    }
  }
}

Get your connection token →

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by DeepInfra. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

Why Choose Vinkius

Vinkius connects your tools to AI with real-time monitoring and automatic cost savings — all from one dashboard.

Connect DeepInfra (Serverless LLM Inference) now

Real-time monitoring

Live

visibility into every interaction

Connect your favorite tools to your AI and see exactly what's happening — every request, every response, in real time.

Built-in savings

60%

lower AI costs

Vinkius compresses data between your apps and your AI automatically. Lower bills every month — no configuration required.

Single dashboard

One

place for every integration

Every tool your AI connects to, managed from a single screen. One account, complete control.

Common questions about DeepInfra (Serverless LLM Inference) MCP in Claude Code

Run `claude mcp add --transport http deepinfra -- `. Make sure all flags appear before the server name to ensure proper connection.

The headless nature of the CLI means you can trigger inference tasks during automated builds. Scripts can query models or generate assets without human intervention.

You provide the exact model path, such as deepseek-ai/DeepSeek-V3, in your command prompt. The agent formats the request and routes it to the correct endpoint.

The agent receives the image data and saves it directly to your specified directory. You check the file system to verify the output.

System logs and command outputs pass through a zero-trust Vinkius proxy. The environment destroys the isolated container the millisecond the API returns a response, ensuring your infrastructure data remains private.

Use it with your favorite AI tools

Connect this server to Cursor, Claude, VS Code, and more.

OpenAI Agents SDK sdk-python

Google ADK sdk-python

Pydantic AI sdk-python

Vercel AI SDK sdk-typescript