TrueFoundry MCP for AI. Route models and manage AI deployments from one hub.
Works with every AI agent you already use
…and any MCP-compatible client








Connect to your AI in seconds.
TrueFoundry is an LLM Gateway and ML deployment hub. It manages connections to over 1,000 proxy models (OpenAI, Anthropic, Gemini, Llama, etc.) and lets you run AI services without managing dozens of individual APIs or keys.
You deploy custom MCP containers, route chats securely, and monitor the entire system from one place.
What your AI can do
Truefoundry deploy mcp server
Spawns a new backend container logical process using TrueFoundry's service mesh for custom tools.
Truefoundry generate embeddings
Calculates semantic vectors securely using the unified abstraction layer for text data.
Truefoundry get deployment status
Provides detailed metrics and health status on all running AI deployment containers.
Your agent sends one chat request to the gateway, which then routes it securely to multiple supported models like OpenAI, Anthropic, or Gemini.
You deploy new backend containers (MCP servers) directly onto the infrastructure limits and manage them as isolated services.
The system generates embedding vectors for any string using a secure, unified abstraction layer.
You pull an exact list of all foundation models currently accessible through the TrueFoundry gateway.
The agent retrieves detailed metrics and status reports for every deployed AI service instance.
You pull the exact JSON metadata (schema) for any registered MCP tool, helping you understand what parameters it needs.
Ask an AI about this
Waiting for input…
TrueFoundry MCP Server: 8 Tools for Model & Deployment Ops
This server gives you the tools to manage, route, and deploy every aspect of your LLM infrastructure using a single unified gateway.
Make your AI actually useful.
Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.
Start using TrueFoundry on VinkiusTruefoundry Deploy Mcp Server
Spawns a new backend container logical process using TrueFoundry's service mesh for custom tools.
Truefoundry Generate Embeddings
Calculates semantic vectors securely using the unified abstraction layer for text...
Truefoundry Get Deployment Status
Provides detailed metrics and health status on all running AI deployment containers.
Truefoundry Get Mcp Server Info
Extracts the precise JSON metadata and schema for any registered TrueFoundry tool.
Truefoundry List Deployments
Monitors the current array of running backend topologies mapped to your team's...
Truefoundry List Gateway Models
Lists all foundation models that are currently accessible via the TrueFoundry unified AI gateway.
Truefoundry List Mcp Servers
Retrieves a full registry map of every available logical MCP Tool within TrueFoundry.
Truefoundry Run Gateway Chat
Executes an inference chat stream by pushing model queries through the unified...
Security and governance baked right in.
Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, delivering out-of-the-box support for HTTPS Streamable, SSE, and OAuth2—zero messy routing required.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on every call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with TrueFoundry, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 5,100+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by TrueFoundry. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on every call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This connection provides 8 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.
Managing 10+ separate AI API integrations feels like a full-time job.
Today, if you want an agent to answer questions using both OpenAI's general knowledge and Anthropic's specialized writing style, you write complex code. You handle two sets of keys, two different request bodies, and you build in logic to decide which API call runs when—and what happens if one times out.
With TrueFoundry, the process is simple. Your agent just sends a prompt through the gateway. The system handles the routing between models like Claude and OpenAI internally. You get a single, unified chat stream, no matter how many underlying providers you use.
TrueFoundry MCP Server: Control your entire ML deployment lifecycle.
Without this gateway, deploying a new microservice means managing the networking, authentication, and resource limits for that specific container on top of your existing stack. You have to manually track if it's running or if its dependencies broke.
Using `truefoundry_deploy_mcp_server` makes deployment native. The service registers itself with TrueFoundry, giving you immediate visibility via `truefoundry_list_deployments`, and the system handles resource isolation automatically.
What your AI can actually do with this
TrueFoundry acts as your central hub for AI services, managing connections to over a thousand proxy models like OpenAI, Anthropic, Gemini, and Llama. You connect your agent here; we handle routing and deployment management across every model provider. This server lets you run complex AI processes without needing separate API keys or container setups for each service.
When you're ready to chat, use truefoundry_run_gateway_chat to execute an inference stream. Your agent pushes one query through the unified gateway endpoint, and the system handles routing that request securely across multiple supported LLMs. Before running a chat, check what models are available by calling truefoundry_list_gateway_models; this pulls an exact list of every foundation model currently accessible via the gateway.
Need to know which custom tools your team has set up? You can retrieve a full registry map by using truefoundry_list_mcp_servers. To see what's running right now, run truefoundry_list_deployments, and that monitors the current array of backend topologies mapped to your account. For the deepest level of visibility, you get detailed metrics and health reports for every deployed AI service instance using truefoundry_get_deployment_status.
You can also check what services are running by calling truefoundry_list_deployments.
When you need to build a custom service, you deploy new backend containers—or MCP servers—directly onto the infrastructure limits. Use truefoundry_deploy_mcp_server to spawn these logical processes via TrueFoundry's service mesh for your unique tools. If you wanna know what parameters that tool needs when it runs, call truefoundry_get_mcp_server_info, and that pulls the precise JSON metadata schema for any registered MCP tool.
For data processing, use truefoundry_generate_embeddings. This function calculates semantic vectors for any string you provide using a secure, unified abstraction layer. The system manages this process regardless of which model provider's embedding capabilities it uses internally.
This hub lets you manage everything in one place. You pull the full registry map with truefoundry_list_mcp_servers and get specific info about each container using truefoundry_get_mcp_server_info. It’s your single point of control for running, monitoring, and developing against a vast array of LLMs.
019d7616-4f6e-7239-bc09-2265cc645c48 Here's how it actually works
The bottom line is: it gives your agent a single brain that can talk to dozens of specialized AI APIs without needing separate credentials or complex routing logic in your code.
First, get your TrueFoundry Personal Access Token and identify your dedicated cluster URL.
Next, configure your agent to connect through this single endpoint. When the agent runs a task (like chat or embedding generation), TrueFoundry handles routing and isolation behind the scenes.
Finally, you use specific tools—like truefoundry_list_mcp_servers—to discover all available services and manage their lifecycle.
Who is this actually for?
This hub is essential for Platform Operations teams, AI Engineers, and Software Architects. If you spend more time managing API keys, dealing with vendor rate limits, or stitching together multiple service endpoints than actually building features, this is for you. You need to stop treating every LLM provider like a separate project.
Manages the central connection point for all AI services. They use truefoundry_list_mcp_servers and truefoundry_get_deployment_status to maintain visibility across hundreds of running tools.
Designs multi-model pipelines, deciding which LLM (Gemini, GPT, etc.) should handle a specific part of the workflow. They use truefoundry_list_gateway_models to map out options.
Deploys and governs custom AI tools by invoking truefoundry_deploy_mcp_server, ensuring new services run cleanly within the managed cluster.
What Changes When You Connect
Centralized Model Routing: Don't juggle multiple vendor APIs. When you use truefoundry_run_gateway_chat, your agent sends a query once, and TrueFoundry routes it to the best model (OpenAI, Anthropic, etc.) automatically.
Full Lifecycle Visibility: Need to know if your custom tool is running? Use truefoundry_list_deployments or truefoundry_get_deployment_status. You get a single dashboard view of every deployed MCP container.
Schema Discovery on Demand: Don't guess what tools exist. Call truefoundry_list_mcp_servers to see the entire registry, and use truefoundry_get_mcp_server_info to pull the exact JSON schema for any specific tool you plan to call.
Dedicated Embedding Service: Calculating vectors doesn't require a separate API key. Use truefoundry_generate_embeddings to keep your data processing clean and abstracted from the underlying model providers.
Zero-Friction Model Discovery: Before coding, run truefoundry_list_gateway_models. This instantly tells you exactly which foundation models are available through the gateway without hitting any vendor documentation first.
See it in action
Building a multi-model QA system
A data scientist needs to build an agent that answers questions using both general knowledge (GPT) and highly specialized, internal documents (Llama). Instead of writing two separate endpoints, they use TrueFoundry. They run truefoundry_list_gateway_models to confirm availability, then configure the chat flow: the initial query hits truefoundry_run_gateway_chat, routing through the necessary model mix, solving the multi-API headache instantly.
Monitoring a complex microservice mesh
An Ops team is running five different customer-facing AI tools (e.g., summarizer, classifier, chatbot). They don't want to log into five dashboards. They use truefoundry_list_deployments and truefoundry_get_deployment_status to pull all resource usage and health metrics for every MCP container in one place, solving the 'where did my service die?' problem.
Integrating a new internal tool
A development team builds a niche inventory management AI (an MCP server). They don't know how to expose it safely. Using truefoundry_deploy_mcp_server allows them to deploy the container directly into the TrueFoundry mesh, instantly making it discoverable and callable by other services via its defined schema.
Pre-flight check for new features
An architect needs to know what kind of embeddings are possible before writing any code. They run truefoundry_generate_embeddings with test text, confirming the vector format and security boundaries using the unified abstraction layer, avoiding time spent testing against a single vendor's specific API limits.
The honest tradeoffs
Using chat for infrastructure checks
Trying to ask your agent, 'What are my deployed services?' through truefoundry_run_gateway_chat.
Chat is for conversation. For service status and deployment lists, always use the dedicated tools: truefoundry_list_deployments or truefoundry_get_deployment_status. They give you structured metrics, not conversational answers.
Hardcoding API keys
Writing code that calls OpenAI's endpoint directly and then calling Gemini's endpoint separately.
Don't hardcode. Connect your agent to the TrueFoundry gateway. It handles routing for you, so you only need one unified credential set.
Assuming model availability
Writing code that fails because it calls a model name that was deprecated or rate-limited on a single vendor's platform.
First, check what's available. Run truefoundry_list_gateway_models to get the definitive list of active models in your environment before you write a line of code.
When It Fits, When It Doesn't
Use TrueFoundry if your project involves running multiple distinct AI services or LLMs, and operational visibility is critical. You need this hub when you must balance feature velocity (rapidly connecting new APIs) against operational governance (knowing exactly what's deployed and how it runs). Specifically, if you plan to deploy custom MCP containers (truefoundry_deploy_mcp_server) or if your chat flow needs to switch between model providers—this is required.
Don't use this hub if all you need is a simple single-purpose API call (e.g., 'I just need GPT-4o embeddings, nothing else'). In that case, calling the vendor directly might be simpler. However, even for single tasks like embedding generation, using truefoundry_generate_embeddings keeps your system standardized and isolated, which is usually worth the minor overhead.
Questions you might have
How do I check which LLMs are available using truefoundry_list_gateway_models? +
Running truefoundry_list_gateway_models returns a dynamic list of all foundation models currently supported by the gateway. This tells you exactly what's accessible without needing to check vendor documentation.
Is TrueFoundry better than using direct API calls for embeddings? +
Yes, it is cleaner. Using truefoundry_generate_embeddings abstracts away the provider details. You send the text once and get a vector back, without worrying if you're hitting an OpenAI or Gemini rate limit.
What should I use to monitor my deployed AI containers? +
You use truefoundry_get_deployment_status for detailed metrics on specific instances. If you want a list of all running services, run truefoundry_list_deployments.
Can I discover the schema of a custom tool before deploying it? +
Yes. Use truefoundry_get_mcp_server_info. This retrieves the exact JSON metadata for any registered MCP tool, so you know precisely what inputs and outputs to expect.
How do I authenticate before running a chat query using truefoundry_run_gateway_chat? +
You must first generate your TrueFoundry credentials token from the settings. This token secures all calls, isolating your original vendor keys completely so you don't need to manage multiple API secrets.
What metrics does truefoundry_get_deployment_status provide regarding usage caps? +
It provides detailed metric states on the orchestration matrix bounds. You can track specific resource utilization, including status and isolation limits for active deployments.
Where can I see a registry of all available MCP Tools using truefoundry_list_mcp_servers? +
The command extracts the full registry mapping of every logical MCP Tool. This lets you audit all services that TrueFoundry supports without needing to know their endpoints.
How are my documents kept private when I calculate embeddings using truefoundry_generate_embeddings? +
The calculation happens via a unified abstraction layer, ensuring secure data transmission. Your original source vendor APIs never touch your codebase or the raw input strings you pass in.
Can I route conversational streams directly via the AI agent using the Universal Gateway? +
Yes! You can orchestrate inferences parsing run_gateway_chat providing dedicated string formats mapping natively any enabled model.
Is it possible to monitor crashed services or container states? +
Absolutely. Target the instance ID and emit get_deployment_status explicitly bounding execution limits and fetching live log matrices.
Are the deployment configuration variables isolated upon server launch? +
Yes, using deploy_mcp_server dynamically provisions encapsulated boundaries. You stringify environment tokens seamlessly obscuring values into active runtimes only.
We've already built the connector for TrueFoundry. Just plug in your AI agents and start using Vinkius.
No hosting. No infrastructure. No complex setup.
All 8 tools are live and waiting.
You're up and running in seconds.
Vinkius gives your AI agents access to the full catalog of app connectors, all fully managed, secure, and enterprise-ready. One subscription, every tool you need.
Built, hosted, and secured by Vinkius. You just connect and go.