# TrueFoundry MCP

> TrueFoundry is an LLM Gateway and ML deployment hub. It manages connections to over 1,000 proxy models (OpenAI, Anthropic, Gemini, Llama, etc.) and lets you run AI services without managing dozens of individual APIs or keys. You deploy custom MCP containers, route chats securely, and monitor the entire system from one place.

## Overview
- **Category:** ai-frontier
- **Price:** Free
- **Tags:** llm-gateway, model-orchestration, ml-deployment, ai-infrastructure, api-proxy

## Description

TrueFoundry acts as your central hub for AI services, managing connections to over a thousand proxy models like OpenAI, Anthropic, Gemini, and Llama. You connect your agent here; we handle routing and deployment management across every model provider. This server lets you run complex AI processes without needing separate API keys or container setups for each service.

When you're ready to chat, use `truefoundry_run_gateway_chat` to execute an inference stream. Your agent pushes one query through the unified gateway endpoint, and the system handles routing that request securely across multiple supported LLMs. Before running a chat, check what models are available by calling `truefoundry_list_gateway_models`; this pulls an exact list of every foundation model currently accessible via the gateway.

Need to know which custom tools your team has set up? You can retrieve a full registry map by using `truefoundry_list_mcp_servers`. To see what's running right now, run `truefoundry_list_deployments`, and that monitors the current array of backend topologies mapped to your account. For the deepest level of visibility, you get detailed metrics and health reports for every deployed AI service instance using `truefoundry_get_deployment_status`. You can also check what services are running by calling `truefoundry_list_deployments`.

When you need to build a custom service, you deploy new backend containers—or MCP servers—directly onto the infrastructure limits. Use `truefoundry_deploy_mcp_server` to spawn these logical processes via TrueFoundry's service mesh for your unique tools. If you wanna know what parameters that tool needs when it runs, call `truefoundry_get_mcp_server_info`, and that pulls the precise JSON metadata schema for any registered MCP tool.

For data processing, use `truefoundry_generate_embeddings`. This function calculates semantic vectors for any string you provide using a secure, unified abstraction layer. The system manages this process regardless of which model provider's embedding capabilities it uses internally.

This hub lets you manage everything in one place. You pull the full registry map with `truefoundry_list_mcp_servers` and get specific info about each container using `truefoundry_get_mcp_server_info`. It’s your single point of control for running, monitoring, and developing against a vast array of LLMs.

## Tools

### truefoundry_deploy_mcp_server
Spawns a new backend container logical process using TrueFoundry's service mesh for custom tools.

### truefoundry_generate_embeddings
Calculates semantic vectors securely using the unified abstraction layer for text data.

### truefoundry_get_deployment_status
Provides detailed metrics and health status on all running AI deployment containers.

### truefoundry_get_mcp_server_info
Extracts the precise JSON metadata and schema for any registered TrueFoundry tool.

### truefoundry_list_deployments
Monitors the current array of running backend topologies mapped to your team's account.

### truefoundry_list_gateway_models
Lists all foundation models that are currently accessible via the TrueFoundry unified AI gateway.

### truefoundry_list_mcp_servers
Retrieves a full registry map of every available logical MCP Tool within TrueFoundry.

### truefoundry_run_gateway_chat
Executes an inference chat stream by pushing model queries through the unified gateway endpoint.

## Prompt Examples

**Prompt:** 
```
List all active AI models supported natively inside my TrueFoundry gateway access instance.
```

**Response:** 
```
Executing `list_gateway_models` cleanly... Extracted dynamic boundaries resolving exactly 23 natively available providers including OpenAI and Cohere. Fetching strict context limits matrix sequentially.
```

**Prompt:** 
```
Trigger a chat payload pushing to 'openai-gpt4o' via TrueFoundry querying semantic structures bounding limits.
```

**Response:** 
```
Generating structured call via `run_gateway_chat`. The backend relayed proxy safely isolating original network keys. Response received indicates valid bounding states matching flawlessly.
```

**Prompt:** 
```
Deploy the 'supabase-mcp' node-image natively mapping strict variables onto my cluster runtime boundaries.
```

**Response:** 
```
Processing native cluster allocation bounding resources via `deploy_mcp_server`. Configuration schemas synced into TrueFoundry engine cleanly. Running logs verify successful execution state bounds mapped efficiently.
```

## Capabilities

### Route Queries Across LLMs
Your agent sends one chat request to the gateway, which then routes it securely to multiple supported models like OpenAI, Anthropic, or Gemini.

### Deploy Custom AI Services
You deploy new backend containers (MCP servers) directly onto the infrastructure limits and manage them as isolated services.

### Calculate Semantic Vectors
The system generates embedding vectors for any string using a secure, unified abstraction layer.

### List Available Models
You pull an exact list of all foundation models currently accessible through the TrueFoundry gateway.

### Monitor Service Status
The agent retrieves detailed metrics and status reports for every deployed AI service instance.

### Introspect Tool Schemas
You pull the exact JSON metadata (schema) for any registered MCP tool, helping you understand what parameters it needs.

## Use Cases

### Building a multi-model QA system
A data scientist needs to build an agent that answers questions using both general knowledge (GPT) and highly specialized, internal documents (Llama). Instead of writing two separate endpoints, they use TrueFoundry. They run `truefoundry_list_gateway_models` to confirm availability, then configure the chat flow: the initial query hits `truefoundry_run_gateway_chat`, routing through the necessary model mix, solving the multi-API headache instantly.

### Monitoring a complex microservice mesh
An Ops team is running five different customer-facing AI tools (e.g., summarizer, classifier, chatbot). They don't want to log into five dashboards. They use `truefoundry_list_deployments` and `truefoundry_get_deployment_status` to pull all resource usage and health metrics for every MCP container in one place, solving the 'where did my service die?' problem.

### Integrating a new internal tool
A development team builds a niche inventory management AI (an MCP server). They don't know how to expose it safely. Using `truefoundry_deploy_mcp_server` allows them to deploy the container directly into the TrueFoundry mesh, instantly making it discoverable and callable by other services via its defined schema.

### Pre-flight check for new features
An architect needs to know what kind of embeddings are possible before writing any code. They run `truefoundry_generate_embeddings` with test text, confirming the vector format and security boundaries using the unified abstraction layer, avoiding time spent testing against a single vendor's specific API limits.

## Benefits

- **Centralized Model Routing:** Don't juggle multiple vendor APIs. When you use `truefoundry_run_gateway_chat`, your agent sends a query once, and TrueFoundry routes it to the best model (OpenAI, Anthropic, etc.) automatically.
- **Full Lifecycle Visibility:** Need to know if your custom tool is running? Use `truefoundry_list_deployments` or `truefoundry_get_deployment_status`. You get a single dashboard view of every deployed MCP container.
- **Schema Discovery on Demand:** Don't guess what tools exist. Call `truefoundry_list_mcp_servers` to see the entire registry, and use `truefoundry_get_mcp_server_info` to pull the exact JSON schema for any specific tool you plan to call.
- **Dedicated Embedding Service:** Calculating vectors doesn't require a separate API key. Use `truefoundry_generate_embeddings` to keep your data processing clean and abstracted from the underlying model providers.
- **Zero-Friction Model Discovery:** Before coding, run `truefoundry_list_gateway_models`. This instantly tells you exactly which foundation models are available through the gateway without hitting any vendor documentation first.

## How It Works

The bottom line is: it gives your agent a single brain that can talk to dozens of specialized AI APIs without needing separate credentials or complex routing logic in your code.

1. First, get your TrueFoundry Personal Access Token and identify your dedicated cluster URL.
2. Next, configure your agent to connect through this single endpoint. When the agent runs a task (like chat or embedding generation), TrueFoundry handles routing and isolation behind the scenes.
3. Finally, you use specific tools—like `truefoundry_list_mcp_servers`—to discover all available services and manage their lifecycle.

## Frequently Asked Questions

**How do I check which LLMs are available using truefoundry_list_gateway_models?**
Running `truefoundry_list_gateway_models` returns a dynamic list of all foundation models currently supported by the gateway. This tells you exactly what's accessible without needing to check vendor documentation.

**Is TrueFoundry better than using direct API calls for embeddings?**
Yes, it is cleaner. Using `truefoundry_generate_embeddings` abstracts away the provider details. You send the text once and get a vector back, without worrying if you're hitting an OpenAI or Gemini rate limit.

**What should I use to monitor my deployed AI containers?**
You use `truefoundry_get_deployment_status` for detailed metrics on specific instances. If you want a list of *all* running services, run `truefoundry_list_deployments`.

**Can I discover the schema of a custom tool before deploying it?**
Yes. Use `truefoundry_get_mcp_server_info`. This retrieves the exact JSON metadata for any registered MCP tool, so you know precisely what inputs and outputs to expect.

**How do I authenticate before running a chat query using truefoundry_run_gateway_chat?**
You must first generate your TrueFoundry credentials token from the settings. This token secures all calls, isolating your original vendor keys completely so you don't need to manage multiple API secrets.

**What metrics does truefoundry_get_deployment_status provide regarding usage caps?**
It provides detailed metric states on the orchestration matrix bounds. You can track specific resource utilization, including status and isolation limits for active deployments.

**Where can I see a registry of all available MCP Tools using truefoundry_list_mcp_servers?**
The command extracts the full registry mapping of every logical MCP Tool. This lets you audit all services that TrueFoundry supports without needing to know their endpoints.

**How are my documents kept private when I calculate embeddings using truefoundry_generate_embeddings?**
The calculation happens via a unified abstraction layer, ensuring secure data transmission. Your original source vendor APIs never touch your codebase or the raw input strings you pass in.

**Can I route conversational streams directly via the AI agent using the Universal Gateway?**
Yes! You can orchestrate inferences parsing `run_gateway_chat` providing dedicated string formats mapping natively any enabled model.

**Is it possible to monitor crashed services or container states?**
Absolutely. Target the instance ID and emit `get_deployment_status` explicitly bounding execution limits and fetching live log matrices.

**Are the deployment configuration variables isolated upon server launch?**
Yes, using `deploy_mcp_server` dynamically provisions encapsulated boundaries. You stringify environment tokens seamlessly obscuring values into active runtimes only.