TrueFoundry MCP for AI. Route models and manage AI deployments from one hub.

Q: How do I check which LLMs are available using truefoundrylistgatewaymodels?

Running truefoundrylistgatewaymodels returns a dynamic list of all foundation models currently supported by the gateway. This tells you exactly what's accessible without needing to check vendor documentation.

Q: Is TrueFoundry better than using direct API calls for embeddings?

Yes, it is cleaner. Using truefoundrygenerateembeddings abstracts away the provider details. You send the text once and get a vector back, without worrying if you're hitting an OpenAI or Gemini rate limit.

Q: What should I use to monitor my deployed AI containers?

You use truefoundrygetdeploymentstatus for detailed metrics on specific instances. If you want a list of all running services, run truefoundrylistdeployments.

Q: Can I discover the schema of a custom tool before deploying it?

Yes. Use truefoundrygetmcpserverinfo. This retrieves the exact JSON metadata for any registered MCP tool, so you know precisely what inputs and outputs to expect.

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

See Vinkius in Action

Works with every AI agent you already use

…and any MCP-compatible client

Connect to your AI in seconds.

TrueFoundry is an LLM Gateway and ML deployment hub. It manages connections to over 1,000 proxy models (OpenAI, Anthropic, Gemini, Llama, etc.) and lets you run AI services without managing dozens of individual APIs or keys.

You deploy custom MCP containers, route chats securely, and monitor the entire system from one place.

What your AI can do

Truefoundry deploy mcp server

Spawns a new backend container logical process using TrueFoundry's service mesh for custom tools.

Truefoundry generate embeddings

Calculates semantic vectors securely using the unified abstraction layer for text data.

Truefoundry get deployment status

Provides detailed metrics and health status on all running AI deployment containers.

+ 5 more capabilities included

Route Queries Across LLMs

Your agent sends one chat request to the gateway, which then routes it securely to multiple supported models like OpenAI, Anthropic, or Gemini.

Deploy Custom AI Services

You deploy new backend containers (MCP servers) directly onto the infrastructure limits and manage them as isolated services.

Calculate Semantic Vectors

The system generates embedding vectors for any string using a secure, unified abstraction layer.

List Available Models

You pull an exact list of all foundation models currently accessible through the TrueFoundry gateway.

Monitor Service Status

The agent retrieves detailed metrics and status reports for every deployed AI service instance.

Introspect Tool Schemas

You pull the exact JSON metadata (schema) for any registered MCP tool, helping you understand what parameters it needs.

Ask an AI about this

Included with Plan

Waiting for input…

AI Agent

TrueFoundry MCP Server: 8 Tools for Model & Deployment Ops

This server gives you the tools to manage, route, and deploy every aspect of your LLM infrastructure using a single unified gateway.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using TrueFoundry on Vinkius

Truefoundry Deploy Mcp Server

Spawns a new backend container logical process using TrueFoundry's service mesh for custom tools.

Truefoundry Generate Embeddings

Calculates semantic vectors securely using the unified abstraction layer for text...

Truefoundry Get Deployment Status

Provides detailed metrics and health status on all running AI deployment containers.

Truefoundry Get Mcp Server Info

Extracts the precise JSON metadata and schema for any registered TrueFoundry tool.

Truefoundry List Deployments

Monitors the current array of running backend topologies mapped to your team's...

Truefoundry List Gateway Models

Lists all foundation models that are currently accessible via the TrueFoundry unified AI gateway.

Truefoundry List Mcp Servers

Retrieves a full registry map of every available logical MCP Tool within TrueFoundry.

Truefoundry Run Gateway Chat

Executes an inference chat stream by pushing model queries through the unified...

Security and governance baked right in.

Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, delivering out-of-the-box support for HTTPS Streamable, SSE, and OAuth2—zero messy routing required.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The TrueFoundry integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "truefoundry": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the TrueFoundry tools with full Vinkius guardrails applied.

VS Code Copilot

⚡

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"truefoundry": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

Use the SSEClient in LangChain to connect to the Vinkius managed endpoint:

from langchain_mcp_adapters.client import SSEClient

# Connect to Vinkius
client = SSEClient(url="https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp")
tools = client.get_tools()

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

from crewai import Agent
from mcp_crewai import MCPTool

# Connect securely to Vinkius
vinkius_tools = MCPTool(url="https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp")

# Assign to Agent
researcher = Agent(
    role='Data Researcher',
    tools=vinkius_tools.get_all()
)

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on every call
Real time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with TrueFoundry, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by TrueFoundry. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on every call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 8 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

Managing 10+ separate AI API integrations feels like a full-time job.

Today, if you want an agent to answer questions using both OpenAI's general knowledge and Anthropic's specialized writing style, you write complex code. You handle two sets of keys, two different request bodies, and you build in logic to decide which API call runs when—and what happens if one times out.

With TrueFoundry, the process is simple. Your agent just sends a prompt through the gateway. The system handles the routing between models like Claude and OpenAI internally. You get a single, unified chat stream, no matter how many underlying providers you use.

TrueFoundry MCP Server: Control your entire ML deployment lifecycle.

Without this gateway, deploying a new microservice means managing the networking, authentication, and resource limits for that specific container on top of your existing stack. You have to manually track if it's running or if its dependencies broke.

Using `truefoundry_deploy_mcp_server` makes deployment native. The service registers itself with TrueFoundry, giving you immediate visibility via `truefoundry_list_deployments`, and the system handles resource isolation automatically.

Support 24/7 support@vinkius.com ↗

Security Vinkius Trust Center ↗

SLA Service Level Agreement ↗

Report Listing Send Report ↗

What your AI can actually do with this

TrueFoundry acts as your central hub for AI services, managing connections to over a thousand proxy models like OpenAI, Anthropic, Gemini, and Llama. You connect your agent here; we handle routing and deployment management across every model provider. This server lets you run complex AI processes without needing separate API keys or container setups for each service.

When you're ready to chat, use truefoundry_run_gateway_chat to execute an inference stream. Your agent pushes one query through the unified gateway endpoint, and the system handles routing that request securely across multiple supported LLMs. Before running a chat, check what models are available by calling truefoundry_list_gateway_models; this pulls an exact list of every foundation model currently accessible via the gateway.

Need to know which custom tools your team has set up? You can retrieve a full registry map by using truefoundry_list_mcp_servers. To see what's running right now, run truefoundry_list_deployments, and that monitors the current array of backend topologies mapped to your account. For the deepest level of visibility, you get detailed metrics and health reports for every deployed AI service instance using truefoundry_get_deployment_status.

You can also check what services are running by calling truefoundry_list_deployments.

When you need to build a custom service, you deploy new backend containers—or MCP servers—directly onto the infrastructure limits. Use truefoundry_deploy_mcp_server to spawn these logical processes via TrueFoundry's service mesh for your unique tools. If you wanna know what parameters that tool needs when it runs, call truefoundry_get_mcp_server_info, and that pulls the precise JSON metadata schema for any registered MCP tool.

For data processing, use truefoundry_generate_embeddings. This function calculates semantic vectors for any string you provide using a secure, unified abstraction layer. The system manages this process regardless of which model provider's embedding capabilities it uses internally.

This hub lets you manage everything in one place. You pull the full registry map with truefoundry_list_mcp_servers and get specific info about each container using truefoundry_get_mcp_server_info. It’s your single point of control for running, monitoring, and developing against a vast array of LLMs.

Built · Hosted · Managed by Vinkius TrueFoundry MCP Server - LLM Gateway & Deployment Hub

Server ID 019d7616-4f6e-7239-bc09-2265cc645c48

Vinkius Inspector

Compliance Grade A+

Score 100/100

Report View Report ↗

Here's how it actually works

The bottom line is: it gives your agent a single brain that can talk to dozens of specialized AI APIs without needing separate credentials or complex routing logic in your code.

First, get your TrueFoundry Personal Access Token and identify your dedicated cluster URL.

Next, configure your agent to connect through this single endpoint. When the agent runs a task (like chat or embedding generation), TrueFoundry handles routing and isolation behind the scenes.

Finally, you use specific tools—like truefoundry_list_mcp_servers—to discover all available services and manage their lifecycle.

Who is this actually for?

This hub is essential for Platform Operations teams, AI Engineers, and Software Architects. If you spend more time managing API keys, dealing with vendor rate limits, or stitching together multiple service endpoints than actually building features, this is for you. You need to stop treating every LLM provider like a separate project.

Platform Operations Engineer

Manages the central connection point for all AI services. They use truefoundry_list_mcp_servers and truefoundry_get_deployment_status to maintain visibility across hundreds of running tools.

AI Software Architect

Designs multi-model pipelines, deciding which LLM (Gemini, GPT, etc.) should handle a specific part of the workflow. They use truefoundry_list_gateway_models to map out options.

ML Deployment Engineer

Deploys and governs custom AI tools by invoking truefoundry_deploy_mcp_server, ensuring new services run cleanly within the managed cluster.

What Changes When You Connect

Centralized Model Routing: Don't juggle multiple vendor APIs. When you use truefoundry_run_gateway_chat, your agent sends a query once, and TrueFoundry routes it to the best model (OpenAI, Anthropic, etc.) automatically.

Full Lifecycle Visibility: Need to know if your custom tool is running? Use truefoundry_list_deployments or truefoundry_get_deployment_status. You get a single dashboard view of every deployed MCP container.

Schema Discovery on Demand: Don't guess what tools exist. Call truefoundry_list_mcp_servers to see the entire registry, and use truefoundry_get_mcp_server_info to pull the exact JSON schema for any specific tool you plan to call.

Dedicated Embedding Service: Calculating vectors doesn't require a separate API key. Use truefoundry_generate_embeddings to keep your data processing clean and abstracted from the underlying model providers.

Zero-Friction Model Discovery: Before coding, run truefoundry_list_gateway_models. This instantly tells you exactly which foundation models are available through the gateway without hitting any vendor documentation first.

See it in action

01 01

Building a multi-model QA system

A data scientist needs to build an agent that answers questions using both general knowledge (GPT) and highly specialized, internal documents (Llama). Instead of writing two separate endpoints, they use TrueFoundry. They run truefoundry_list_gateway_models to confirm availability, then configure the chat flow: the initial query hits truefoundry_run_gateway_chat, routing through the necessary model mix, solving the multi-API headache instantly.

02 02

Monitoring a complex microservice mesh

An Ops team is running five different customer-facing AI tools (e.g., summarizer, classifier, chatbot). They don't want to log into five dashboards. They use truefoundry_list_deployments and truefoundry_get_deployment_status to pull all resource usage and health metrics for every MCP container in one place, solving the 'where did my service die?' problem.

03 03

Integrating a new internal tool

A development team builds a niche inventory management AI (an MCP server). They don't know how to expose it safely. Using truefoundry_deploy_mcp_server allows them to deploy the container directly into the TrueFoundry mesh, instantly making it discoverable and callable by other services via its defined schema.

04 04

Pre-flight check for new features

An architect needs to know what kind of embeddings are possible before writing any code. They run truefoundry_generate_embeddings with test text, confirming the vector format and security boundaries using the unified abstraction layer, avoiding time spent testing against a single vendor's specific API limits.

The honest tradeoffs

Using chat for infrastructure checks

Anti-pattern

Trying to ask your agent, 'What are my deployed services?' through truefoundry_run_gateway_chat.

The Fix

Chat is for conversation. For service status and deployment lists, always use the dedicated tools: truefoundry_list_deployments or truefoundry_get_deployment_status. They give you structured metrics, not conversational answers.

Hardcoding API keys

Anti-pattern

Writing code that calls OpenAI's endpoint directly and then calling Gemini's endpoint separately.

The Fix

Don't hardcode. Connect your agent to the TrueFoundry gateway. It handles routing for you, so you only need one unified credential set.

Assuming model availability

Anti-pattern

Writing code that fails because it calls a model name that was deprecated or rate-limited on a single vendor's platform.

The Fix

First, check what's available. Run truefoundry_list_gateway_models to get the definitive list of active models in your environment before you write a line of code.

When It Fits, When It Doesn't

Use TrueFoundry if your project involves running multiple distinct AI services or LLMs, and operational visibility is critical. You need this hub when you must balance feature velocity (rapidly connecting new APIs) against operational governance (knowing exactly what's deployed and how it runs). Specifically, if you plan to deploy custom MCP containers (truefoundry_deploy_mcp_server) or if your chat flow needs to switch between model providers—this is required.

Don't use this hub if all you need is a simple single-purpose API call (e.g., 'I just need GPT-4o embeddings, nothing else'). In that case, calling the vendor directly might be simpler. However, even for single tasks like embedding generation, using truefoundry_generate_embeddings keeps your system standardized and isolated, which is usually worth the minor overhead.

Questions you might have

How do I check which LLMs are available using truefoundry_list_gateway_models? +

Running truefoundry_list_gateway_models returns a dynamic list of all foundation models currently supported by the gateway. This tells you exactly what's accessible without needing to check vendor documentation.

Is TrueFoundry better than using direct API calls for embeddings? +

Yes, it is cleaner. Using truefoundry_generate_embeddings abstracts away the provider details. You send the text once and get a vector back, without worrying if you're hitting an OpenAI or Gemini rate limit.

What should I use to monitor my deployed AI containers? +

You use truefoundry_get_deployment_status for detailed metrics on specific instances. If you want a list of all running services, run truefoundry_list_deployments.

Can I discover the schema of a custom tool before deploying it? +

Yes. Use truefoundry_get_mcp_server_info. This retrieves the exact JSON metadata for any registered MCP tool, so you know precisely what inputs and outputs to expect.

How do I authenticate before running a chat query using truefoundry_run_gateway_chat? +

You must first generate your TrueFoundry credentials token from the settings. This token secures all calls, isolating your original vendor keys completely so you don't need to manage multiple API secrets.

What metrics does truefoundry_get_deployment_status provide regarding usage caps? +

It provides detailed metric states on the orchestration matrix bounds. You can track specific resource utilization, including status and isolation limits for active deployments.

Where can I see a registry of all available MCP Tools using truefoundry_list_mcp_servers? +

The command extracts the full registry mapping of every logical MCP Tool. This lets you audit all services that TrueFoundry supports without needing to know their endpoints.

How are my documents kept private when I calculate embeddings using truefoundry_generate_embeddings? +

The calculation happens via a unified abstraction layer, ensuring secure data transmission. Your original source vendor APIs never touch your codebase or the raw input strings you pass in.

Can I route conversational streams directly via the AI agent using the Universal Gateway? +

Yes! You can orchestrate inferences parsing run_gateway_chat providing dedicated string formats mapping natively any enabled model.

Is it possible to monitor crashed services or container states? +

Absolutely. Target the instance ID and emit get_deployment_status explicitly bounding execution limits and fetching live log matrices.

Are the deployment configuration variables isolated upon server launch? +

Yes, using deploy_mcp_server dynamically provisions encapsulated boundaries. You stringify environment tokens seamlessly obscuring values into active runtimes only.

Connect to your AI in seconds.

Truefoundry deploy mcp server

Truefoundry generate embeddings

Truefoundry get deployment status

TrueFoundry MCP Server: 8 Tools for Model & Deployment Ops

Make your AI actually useful.

Truefoundry Deploy Mcp Server

Truefoundry Generate Embeddings

Truefoundry Get Deployment Status

Truefoundry Get Mcp Server Info

Truefoundry List Deployments

Truefoundry List Gateway Models

Truefoundry List Mcp Servers

Truefoundry Run Gateway Chat

Security and governance baked right in.

Claude AI

Open Claude Settings

Add Custom Connector

Start a conversation

Claude Code

Open your terminal

Add the MCP Server

Start coding

Cursor

One-Click Install (Recommended)

Open Cursor Settings

Add New Server

Use in Composer

Antigravity

Configure Agent Environment

Bind the Endpoint

Execute

VS Code Copilot

One-Click Install (Recommended)

Open MCP Settings

Add Server Config

Windsurf

One-Click Install (Recommended)

Open Windsurf Settings

Add Server Endpoint

LangChain

Install Dependencies

Connect the Server

CrewAI

Define the Tool

Execute Task

Choose How to Get Started

Build Your Own

Make Your AI Do More

Works with Claude, ChatGPT, Cursor, and more

Managing 10+ separate AI API integrations feels like a full-time job.

TrueFoundry MCP Server: Control your entire ML deployment lifecycle.

What your AI can actually do with this

Here's how it actually works

Who is this actually for?

What Changes When You Connect

See it in action

Building a multi-model QA system

Monitoring a complex microservice mesh

Integrating a new internal tool

Pre-flight check for new features

The honest tradeoffs

Using chat for infrastructure checks

Hardcoding API keys

Assuming model availability

When It Fits, When It Doesn't

Questions you might have