TrueFoundry MCP. Route models and manage AI deployments from one hub.
Works with every AI agent you already use
…and any MCP-compatible client
Just plug in your AI agents and start using Vinkius.
TrueFoundry is an LLM Gateway and ML deployment hub. It manages connections to over 1,000 proxy models (OpenAI, Anthropic, Gemini, Llama, etc.) and lets you run AI services without managing dozens of individual APIs or keys.
You deploy custom MCP containers, route chats securely, and monitor the entire system from one place.
What your AI agents can do
Truefoundry deploy mcp server
Spawns a new backend container logical process using TrueFoundry's service mesh for custom tools.
Truefoundry generate embeddings
Calculates semantic vectors securely using the unified abstraction layer for text data.
Truefoundry get deployment status
Provides detailed metrics and health status on all running AI deployment containers.
Your agent sends one chat request to the gateway, which then routes it securely to multiple supported models like OpenAI, Anthropic, or Gemini.
You deploy new backend containers (MCP servers) directly onto the infrastructure limits and manage them as isolated services.
The system generates embedding vectors for any string using a secure, unified abstraction layer.
You pull an exact list of all foundation models currently accessible through the TrueFoundry gateway.
The agent retrieves detailed metrics and status reports for every deployed AI service instance.
You pull the exact JSON metadata (schema) for any registered MCP tool, helping you understand what parameters it needs.
Ask AI about this MCP
Supported MCP Clients
Waiting for input…
TrueFoundry MCP Server: 8 Tools for Model & Deployment Ops
This server gives you the tools to manage, route, and deploy every aspect of your LLM infrastructure using a single unified gateway.
019d7616truefoundry deploy mcp server
Spawns a new backend container logical process using TrueFoundry's service mesh for custom tools.
019d7616truefoundry generate embeddings
Calculates semantic vectors securely using the unified abstraction layer for text data.
019d7616truefoundry get deployment status
Provides detailed metrics and health status on all running AI deployment containers.
019d7616truefoundry get mcp server info
Extracts the precise JSON metadata and schema for any registered TrueFoundry tool.
019d7616truefoundry list deployments
Monitors the current array of running backend topologies mapped to your team's account.
019d7616truefoundry list gateway models
Lists all foundation models that are currently accessible via the TrueFoundry unified AI gateway.
019d7616truefoundry list mcp servers
Retrieves a full registry map of every available logical MCP Tool within TrueFoundry.
019d7616truefoundry run gateway chat
Executes an inference chat stream by pushing model queries through the unified gateway endpoint.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on every call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with TrueFoundry, then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 4,700+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
What you can do with this MCP connector
TrueFoundry acts as your central hub for AI services, managing connections to over a thousand proxy models like OpenAI, Anthropic, Gemini, and Llama. You connect your agent here; we handle routing and deployment management across every model provider. This server lets you run complex AI processes without needing separate API keys or container setups for each service.
When you're ready to chat, use truefoundry_run_gateway_chat to execute an inference stream. Your agent pushes one query through the unified gateway endpoint, and the system handles routing that request securely across multiple supported LLMs. Before running a chat, check what models are available by calling truefoundry_list_gateway_models; this pulls an exact list of every foundation model currently accessible via the gateway.
Need to know which custom tools your team has set up? You can retrieve a full registry map by using truefoundry_list_mcp_servers. To see what's running right now, run truefoundry_list_deployments, and that monitors the current array of backend topologies mapped to your account. For the deepest level of visibility, you get detailed metrics and health reports for every deployed AI service instance using truefoundry_get_deployment_status.
You can also check what services are running by calling truefoundry_list_deployments.
When you need to build a custom service, you deploy new backend containers—or MCP servers—directly onto the infrastructure limits. Use truefoundry_deploy_mcp_server to spawn these logical processes via TrueFoundry's service mesh for your unique tools. If you wanna know what parameters that tool needs when it runs, call truefoundry_get_mcp_server_info, and that pulls the precise JSON metadata schema for any registered MCP tool.
For data processing, use truefoundry_generate_embeddings. This function calculates semantic vectors for any string you provide using a secure, unified abstraction layer. The system manages this process regardless of which model provider's embedding capabilities it uses internally.
This hub lets you manage everything in one place. You pull the full registry map with truefoundry_list_mcp_servers and get specific info about each container using truefoundry_get_mcp_server_info. It’s your single point of control for running, monitoring, and developing against a vast array of LLMs.
How TrueFoundry MCP Works
- 1 First, get your TrueFoundry Personal Access Token and identify your dedicated cluster URL.
- 2 Next, configure your agent to connect through this single endpoint. When the agent runs a task (like chat or embedding generation), TrueFoundry handles routing and isolation behind the scenes.
- 3 Finally, you use specific tools—like
truefoundry_list_mcp_servers—to discover all available services and manage their lifecycle.
The bottom line is: it gives your agent a single brain that can talk to dozens of specialized AI APIs without needing separate credentials or complex routing logic in your code.
Who Is TrueFoundry MCP For?
This hub is essential for Platform Operations teams, AI Engineers, and Software Architects. If you spend more time managing API keys, dealing with vendor rate limits, or stitching together multiple service endpoints than actually building features, this is for you. You need to stop treating every LLM provider like a separate project.
Manages the central connection point for all AI services. They use truefoundry_list_mcp_servers and truefoundry_get_deployment_status to maintain visibility across hundreds of running tools.
Designs multi-model pipelines, deciding which LLM (Gemini, GPT, etc.) should handle a specific part of the workflow. They use truefoundry_list_gateway_models to map out options.
Deploys and governs custom AI tools by invoking truefoundry_deploy_mcp_server, ensuring new services run cleanly within the managed cluster.
What Changes When You Connect
- Centralized Model Routing: Don't juggle multiple vendor APIs. When you use
truefoundry_run_gateway_chat, your agent sends a query once, and TrueFoundry routes it to the best model (OpenAI, Anthropic, etc.) automatically. - Full Lifecycle Visibility: Need to know if your custom tool is running? Use
truefoundry_list_deploymentsortruefoundry_get_deployment_status. You get a single dashboard view of every deployed MCP container. - Schema Discovery on Demand: Don't guess what tools exist. Call
truefoundry_list_mcp_serversto see the entire registry, and usetruefoundry_get_mcp_server_infoto pull the exact JSON schema for any specific tool you plan to call. - Dedicated Embedding Service: Calculating vectors doesn't require a separate API key. Use
truefoundry_generate_embeddingsto keep your data processing clean and abstracted from the underlying model providers. - Zero-Friction Model Discovery: Before coding, run
truefoundry_list_gateway_models. This instantly tells you exactly which foundation models are available through the gateway without hitting any vendor documentation first.
Real-World Use Cases
Building a multi-model QA system
A data scientist needs to build an agent that answers questions using both general knowledge (GPT) and highly specialized, internal documents (Llama). Instead of writing two separate endpoints, they use TrueFoundry. They run truefoundry_list_gateway_models to confirm availability, then configure the chat flow: the initial query hits truefoundry_run_gateway_chat, routing through the necessary model mix, solving the multi-API headache instantly.
Monitoring a complex microservice mesh
An Ops team is running five different customer-facing AI tools (e.g., summarizer, classifier, chatbot). They don't want to log into five dashboards. They use truefoundry_list_deployments and truefoundry_get_deployment_status to pull all resource usage and health metrics for every MCP container in one place, solving the 'where did my service die?' problem.
Integrating a new internal tool
A development team builds a niche inventory management AI (an MCP server). They don't know how to expose it safely. Using truefoundry_deploy_mcp_server allows them to deploy the container directly into the TrueFoundry mesh, instantly making it discoverable and callable by other services via its defined schema.
Pre-flight check for new features
An architect needs to know what kind of embeddings are possible before writing any code. They run truefoundry_generate_embeddings with test text, confirming the vector format and security boundaries using the unified abstraction layer, avoiding time spent testing against a single vendor's specific API limits.
The Tradeoffs
Using chat for infrastructure checks
Trying to ask your agent, 'What are my deployed services?' through truefoundry_run_gateway_chat.
→
Chat is for conversation. For service status and deployment lists, always use the dedicated tools: truefoundry_list_deployments or truefoundry_get_deployment_status. They give you structured metrics, not conversational answers.
Hardcoding API keys
Writing code that calls OpenAI's endpoint directly and then calling Gemini's endpoint separately.
→ Don't hardcode. Connect your agent to the TrueFoundry gateway. It handles routing for you, so you only need one unified credential set.
Assuming model availability
Writing code that fails because it calls a model name that was deprecated or rate-limited on a single vendor's platform.
→
First, check what's available. Run truefoundry_list_gateway_models to get the definitive list of active models in your environment before you write a line of code.
When It Fits, When It Doesn't
Use TrueFoundry if your project involves running multiple distinct AI services or LLMs, and operational visibility is critical. You need this hub when you must balance feature velocity (rapidly connecting new APIs) against operational governance (knowing exactly what's deployed and how it runs). Specifically, if you plan to deploy custom MCP containers (truefoundry_deploy_mcp_server) or if your chat flow needs to switch between model providers—this is required.
Don't use this hub if all you need is a simple single-purpose API call (e.g., 'I just need GPT-4o embeddings, nothing else'). In that case, calling the vendor directly might be simpler. However, even for single tasks like embedding generation, using truefoundry_generate_embeddings keeps your system standardized and isolated, which is usually worth the minor overhead.
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by TrueFoundry. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on every call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 8 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Available Capabilities
Managing 10+ separate AI API integrations feels like a full-time job.
Today, if you want an agent to answer questions using both OpenAI's general knowledge and Anthropic's specialized writing style, you write complex code. You handle two sets of keys, two different request bodies, and you build in logic to decide which API call runs when—and what happens if one times out.
With TrueFoundry, the process is simple. Your agent just sends a prompt through the gateway. The system handles the routing between models like Claude and OpenAI internally. You get a single, unified chat stream, no matter how many underlying providers you use.
TrueFoundry MCP Server: Control your entire ML deployment lifecycle.
Without this gateway, deploying a new microservice means managing the networking, authentication, and resource limits for that specific container on top of your existing stack. You have to manually track if it's running or if its dependencies broke.
Using `truefoundry_deploy_mcp_server` makes deployment native. The service registers itself with TrueFoundry, giving you immediate visibility via `truefoundry_list_deployments`, and the system handles resource isolation automatically.
Common Questions About TrueFoundry MCP
How do I check which LLMs are available using truefoundry_list_gateway_models? +
Running truefoundry_list_gateway_models returns a dynamic list of all foundation models currently supported by the gateway. This tells you exactly what's accessible without needing to check vendor documentation.
Is TrueFoundry better than using direct API calls for embeddings? +
Yes, it is cleaner. Using truefoundry_generate_embeddings abstracts away the provider details. You send the text once and get a vector back, without worrying if you're hitting an OpenAI or Gemini rate limit.
What should I use to monitor my deployed AI containers? +
You use truefoundry_get_deployment_status for detailed metrics on specific instances. If you want a list of all running services, run truefoundry_list_deployments.
Can I discover the schema of a custom tool before deploying it? +
Yes. Use truefoundry_get_mcp_server_info. This retrieves the exact JSON metadata for any registered MCP tool, so you know precisely what inputs and outputs to expect.
How do I authenticate before running a chat query using truefoundry_run_gateway_chat? +
You must first generate your TrueFoundry credentials token from the settings. This token secures all calls, isolating your original vendor keys completely so you don't need to manage multiple API secrets.
What metrics does truefoundry_get_deployment_status provide regarding usage caps? +
It provides detailed metric states on the orchestration matrix bounds. You can track specific resource utilization, including status and isolation limits for active deployments.
Where can I see a registry of all available MCP Tools using truefoundry_list_mcp_servers? +
The command extracts the full registry mapping of every logical MCP Tool. This lets you audit all services that TrueFoundry supports without needing to know their endpoints.
How are my documents kept private when I calculate embeddings using truefoundry_generate_embeddings? +
The calculation happens via a unified abstraction layer, ensuring secure data transmission. Your original source vendor APIs never touch your codebase or the raw input strings you pass in.
Can I route conversational streams directly via the AI agent using the Universal Gateway? +
Yes! You can orchestrate inferences parsing run_gateway_chat providing dedicated string formats mapping natively any enabled model.
Is it possible to monitor crashed services or container states? +
Absolutely. Target the instance ID and emit get_deployment_status explicitly bounding execution limits and fetching live log matrices.
Are the deployment configuration variables isolated upon server launch? +
Yes, using deploy_mcp_server dynamically provisions encapsulated boundaries. You stringify environment tokens seamlessly obscuring values into active runtimes only.
Use it with your favorite AI tools
Connect this server to Cursor, Claude, VS Code, and more.
More in this category
Portkey
AI gateway observability: monitor logs, costs, and manage LLM configurations via agents.
Spellbook Legal AI
AI-powered contract drafting and review — analyze contracts, draft clauses, detect risks, and compare against 2,000+ market precedents via Spellbook.
LanceDB (Serverless Vector DB)
Manage vectorized data via LanceDB — perform similarity searches, create tables, and manage multi-modal embeddings.
You might also like
Chuanglan 253
Ultra-high volume SMS & 1-click login API — send verification codes, notifications, and bulk messages globally via Chuanglan 253.
Campaign Monitor
Manage email marketing via Campaign Monitor — track campaigns, manage subscribers, and monitor performance directly from any AI agent.
AT&T IoT
IoT Control Center -- Manage SIM devices, activation, data pools, shared plans, and connectivity diagnostics via AT&T IoT API.