How to Use the NVIDIA API Catalog MCP in Claude

Q: How do I configure the NVIDIA API Catalog MCP Server in Claude Desktop?

Add the server configuration to your local `claude_desktop_config.json` file. Provide your NVIDIA API key as an environment variable, then restart the Claude Desktop app to initialize the tools.

Q: Can I run Llama-Vision models via the NVIDIA API Catalog in Claude Desktop?

Yes, Claude Desktop uses the `nvidia_vision_inference` tool to pass image payloads directly to NVIDIA's hosted vision models and displays the analysis in your chat.

Q: How does Claude Desktop handle NVIDIA API Catalog token limits?

The client uses `nvidia_check_token_quota` to query your remaining API credits. You can ask Claude to check your balance anytime to prevent unexpected inference interruptions.

Q: Can I use custom LoRA adapters with this integration?

Yes. The server exposes the `nvidia_list_lora_adapters` tool, which lets Claude Desktop discover and apply your fine-tuned overrides directly to active chat sessions.

Q: Is my image and text data safe when using this server?

All your prompt text, images, and embeddings processed by `nvidia_chat_completion` go directly to NVIDIA's API endpoints. Vinkius runs the MCP server in an isolated sandbox, meaning your raw inputs are never logged or stored locally on our platform.

Get raw NVIDIA API Catalog model power straight inside Claude Desktop to run heavy LLM inference without writing code.

Works with every AI agent you already use, and any MCP-compatible client.

Connect NVIDIA API Catalog MCP to Claude Desktop

Create your Vinkius account to connect NVIDIA API Catalog to Claude Desktop and route execution through our secure gateway. The platform manages server hosting, runtime updates, and security layers. Configuration requires no manual server provisioning.

Run Llama 3 and Nemotron Models in Claude Desktop

The NVIDIA API Catalog MCP server connects Claude Desktop directly to NVIDIA's hosted models through `nvidia_chat_completion` and `nvidia_list_foundation_models`. You get immediate access to Nemotron-4 and Llama-3-70b-Instruct right within your Claude Desktop chat sidebar, bypassing the usual API setup. Just ask Claude to run a prompt against a specific NVIDIA model path. The Claude Desktop client calls the tool, executes the inference on NVIDIA's GPU cloud, and drops the output straight into your current workspace.
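As a rough illustration of what the client sends when Claude calls one of these tools, the sketch below builds an MCP `tools/call` request, which is a JSON-RPC 2.0 message. The tool name comes from this page; the argument names (`model`, `prompt`) and the model path are assumptions for illustration only, since the real input schema is defined by the server's tool listing.

```python
import json
from itertools import count

# Each JSON-RPC request carries its own id so responses can be matched to it.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical arguments: check the server's tool schema for the real field names.
request = build_tool_call(
    "nvidia_chat_completion",
    {
        "model": "llama-3-70b-instruct",  # assumed model path, not confirmed
        "prompt": "Summarize the tradeoffs of tensor parallelism.",
    },
)
print(json.dumps(request, indent=2))
```

In normal use you never write this message yourself. Claude Desktop assembles it from your chat prompt, and the sketch is only meant to show the shape of the call.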

Analyze Images Natively with Llama-Vision

The `nvidia_vision_inference` tool enables Claude Desktop to analyze image data using NVIDIA's hosted vision models. Drag a mockup or UI screenshot directly into your Claude chat to identify structural layout issues without running local PyTorch environments. Claude parses the visual layout, matches the elements against your design requirements, and writes the corrected layout code directly back into your chat history.
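To give a sense of how an image might travel inside a tool call, here is a minimal sketch that packs image bytes into a base64 data URI. The argument names `image` and `prompt` are hypothetical, and the page does not document whether `nvidia_vision_inference` expects a data URI, raw base64, or a URL, so treat the payload shape as an assumption.

```python
import base64
import mimetypes


def build_vision_arguments(image_bytes: bytes, filename: str, question: str) -> dict:
    """Pack an image and a question into hypothetical tool arguments."""
    # Guess the MIME type from the file extension and reject non-image files.
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"{filename} does not look like an image file")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "image": f"data:{mime_type};base64,{encoded}",  # assumed field name
        "prompt": question,  # assumed field name
    }
```

You could call it with the bytes of a screenshot, for example `build_vision_arguments(Path("mockup.png").read_bytes(), "mockup.png", "List layout issues")`.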

Track NVIDIA Token Quotas and Cloud Status

The `nvidia_check_token_quota` tool monitors your active developer account balance directly inside Claude Desktop via this MCP integration. You will know exactly when you are running low on credits before starting a massive inference run. If model responses seem sluggish, ask Claude to check endpoints using `nvidia_get_cloud_status`. It pings the NVIDIA endpoints to report real-time latencies right in your sidebar.
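The pre-flight check described above can be sketched as a simple budget comparison. The `QuotaReport` type and its `remaining_credits` field are hypothetical stand-ins, because the page does not show the response format of `nvidia_check_token_quota`.

```python
from dataclasses import dataclass


@dataclass
class QuotaReport:
    """Hypothetical shape of a quota response; the real fields may differ."""
    remaining_credits: int


def can_start_run(
    report: QuotaReport,
    estimated_requests: int,
    credits_per_request: int = 1,
    reserve: int = 0,
) -> bool:
    """Return True if the run fits in the balance and leaves `reserve` credits."""
    needed = estimated_requests * credits_per_request
    return report.remaining_credits - needed >= reserve


report = QuotaReport(remaining_credits=1000)
print(can_start_run(report, estimated_requests=400, credits_per_request=2, reserve=100))
```

Keeping a reserve is a design choice rather than a requirement: it leaves room for retries and follow-up prompts after a large batch.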

Setup guide

Set up NVIDIA API Catalog MCP in Claude Web or Desktop

1. Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

2. Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

3. Start a conversation

Open a new chat. The NVIDIA API Catalog MCP tools are available immediately, with no restart needed.

Endpoint URL

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

No configuration file needed — paste the URL directly in the Claude web interface.
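If you generate the endpoint URL in a script, a small helper can catch the common mistake of leaving the placeholder in place. This is a sketch under the assumption that the token is a single path segment, as the URL pattern on this page suggests; the helper name is illustrative.

```python
from urllib.parse import quote

ENDPOINT_TEMPLATE = "https://edge.vinkius.com/{token}/mcp"
PLACEHOLDER = "[YOUR_TOKEN_HERE]"


def build_endpoint_url(token: str) -> str:
    """Insert a Vinkius token into the endpoint URL pattern shown on this page."""
    token = token.strip()
    if not token or token == PLACEHOLDER:
        raise ValueError("replace the placeholder with your real Vinkius token")
    # Percent-encode the token so it stays a single, valid path segment.
    return ENDPOINT_TEMPLATE.format(token=quote(token, safe=""))


print(build_endpoint_url("abc123"))
```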

Available on Free (1 connector), Pro, Max, Team, and Enterprise plans.

Prerequisites

Claude Desktop installed (macOS or Windows)
Active Vinkius subscription with a valid endpoint token

1. Open Claude Desktop Settings

Click the menu icon at the top-left corner, go to Settings → Developer → Edit Config. This opens claude_desktop_config.json in your default text editor.

2. Paste the NVIDIA API Catalog MCP configuration

Copy the JSON snippet below into the mcpServers object. Replace [YOUR_TOKEN_HERE] with your endpoint token from cloud.vinkius.com.

3. Restart Claude Desktop

Close and reopen the application. Claude needs a full restart to load new MCP servers; refreshing a conversation is not enough.

4. Verify the connection

Open a new conversation. Click the 🔌 icon at the bottom of the message input. You should see tools listed under nvidia-api-catalog-mcp.

```json
{
  "mcpServers": {
    "nvidia-api-catalog-mcp": {
      "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
    }
  }
}
```
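If your config file already lists other MCP servers, pasting by hand risks overwriting them. The sketch below merges the entry into an existing file instead. The function name is illustrative, and the path to `claude_desktop_config.json` is left as a parameter because its location depends on your operating system.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, url: str) -> dict:
    """Add or update one entry under `mcpServers`, keeping existing servers."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        # Treat an empty file as an empty config rather than a parse error.
        config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"url": url}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `add_mcp_server(path, "nvidia-api-catalog-mcp", "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp")` with your real token substituted, followed by a full restart of Claude Desktop.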

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by NVIDIA API Catalog. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

Why Choose Vinkius

Vinkius connects your tools to AI with real-time monitoring and automatic cost savings — all from one dashboard.

Real-time monitoring: live visibility into every interaction

Connect your favorite tools to your AI and see exactly what's happening in real time: every request, every response.

Built-in savings: 60% lower AI costs

Vinkius compresses data between your apps and your AI automatically. Lower bills every month, with no configuration required.

Single dashboard: one place for every integration

Every tool your AI connects to, managed from a single screen. One account, complete control.

Use it with your favorite AI tools

Connect this server to Cursor, Claude, VS Code, and more.

OpenAI Agents SDK (Python)

Google ADK (Python)

Pydantic AI (Python)

Vercel AI SDK (TypeScript)