Together AI MCP. Run open-source LLMs and ML tools in one place

Q: How do I know what models are available using the Together AI MCP?

You use the listavailablemodels tool. This instantly provides a list of all supported LLMs, letting you pick the best one for your chat or embedding task.

Q: Can I fine-tune my own model with Together AI MCP?

Yes. You start by calling createfinetunejob, providing a base model and your training data file, and then monitor the progress using listfinetunejobs.

Q: What is the difference between chatcompletion and textcompletion?

Use chatcompletion when you need multi-turn conversations that require a history of messages. Use textcompletion for simple, single-shot prompts.

Q: Does Together AI MCP handle image generation?

Yes, it handles images using the generateimage tool. Just give it a detailed text description and receive an image asset back.

Together AI connects your agent to hundreds of open-source LLMs for real-time inference, image generation, and model training. Use this MCP to generate vectors, run complex chats, or fine-tune models like Llama and Mixtral directly from any compatible client.

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

See Vinkius in Action

Give Claude and any AI agent real-world access

Run Advanced Conversations

Your agent handles multi-turn conversations using powerful open-source models by providing a simple chat history and requesting completion.

Generate Text Content

You can execute basic text generation tasks, giving the MCP a model ID and a prompt to get immediate textual output.

Create Image Assets

The MCP generates original images when you supply a detailed physical description (prompt) for an external diffusion model to use.

Prepare Data Embeddings

You can convert raw input texts into rich vector embeddings, which are ready to index in your analytical databases.

Manage Model Training

The MCP creates custom fine-tuning jobs using a base model and a specific dataset file, and you can track the status of those jobs.

Discover Available Models

You list all models available on the Together network to find the best engine for your NLP or vision task.

Ask an AI about this

Waiting for input…

AI Agent

What AI agents can do with Together AI with 7 Tools

These tools let you run model inference for chatting, text generation, image creation, embedding vectorization, and managing custom model training jobs.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Together AI MCP

Chat Completion

Executes a multi-turn conversation using specified Together AI models and message history.

Text Completion

Performs standard text generation by receiving only a model ID and an initial prompt.

Create Finetune Job

Initiates a new model fine-tuning job using a specified base model and training...

Generate Embeddings

Converts an array of input texts into rich vector embeddings for use in databases.

Generate Image

Creates a visual image by translating a detailed descriptive prompt into a picture...

List Finetune Jobs

Retrieves and shows the current status of all fine-tuning jobs you've created.

List Available Models

Lists every AI model currently supported on the Together AI network for your review.

Security and governance baked right in.

Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, delivering out-of-the-box support for HTTPS Streamable, SSE, and OAuth2—zero messy routing required.

Together AI MCP is compatible with Claude

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The Together AI integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "together-ai": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Together AI tools with full Vinkius guardrails applied.

Together AI MCP is compatible with VS Code

VS Code Copilot

⚡

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"together-ai": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

Use the SSEClient in LangChain to connect to the Vinkius managed endpoint:

from langchain_mcp_adapters.client import SSEClient

# Connect to Vinkius
client = SSEClient(url="https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp")
tools = client.get_tools()

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

from crewai import Agent
from mcp_crewai import MCPTool

# Connect securely to Vinkius
vinkius_tools = MCPTool(url="https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp")

# Assign to Agent
researcher = Agent(
    role='Data Researcher',
    tools=vinkius_tools.get_all()
)

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on each call
Real time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Together AI, then connect any of our 5,200+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,200+ others, all in one place
Add new capabilities to your AI anytime you want
Connections are secured and governed automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog weekly

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Together AI. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS CLOUD

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on each call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction

Your data is protected. See how we built it.

The Headache of Model Tooling

Today, integrating different AI functions means a lot of manual work. You run the chat in one window, then copy text out to a separate vector database UI to generate embeddings, and if you need an image, you have to switch over to an art generator's web portal. It’s a constant cycle of copying, pasting, and switching tabs just to get one feature working.

With this MCP, that manual handoff disappears. You tell your agent what you want—whether it’s generating vectors using `generate_embeddings` or getting text completions via `text_completion`—and the model runs everything internally. The result appears right where you asked for it.

Together AI: Model Operations

The specific manual steps that vanish include setting up separate API keys for different models and manually tracking job states across multiple vendor dashboards. You also stop having to decide if the model you are using is right for the task.

Now, your agent handles all of it. You simply ask the MCP to manage the workflow—for example, running `chat_completion` first, then asking it to summarize the output and generate embeddings with `generate_embeddings`. It’s one continuous flow.

Support 24/7 support@vinkius.com ↗

Security Vinkius Trust Center ↗

SLA Service Level Agreement ↗

Report Listing Send Report ↗

llm

model-inference

fine-tuning

open-source-ai

machine-learning

api-deployment

What Together AI MCP does for your AI

Need to get bleeding-edge AI models into your daily workflow? This Together AI MCP connects your agent to an entire library of open-source LLMs. You can query powerful models—like Llama, Mixtral, and others—to run chats or perform basic text completions without leaving your chat environment. It's built for developers who need world-class inference speed right now.

Beyond just chatting, you can generate rich vector embeddings instantly from raw text logs to populate any analytical database. Need visuals? Instruct the MCP to create images using detailed descriptions. You can also provision and track custom fine-tuning jobs by pointing to a base model and a dataset file. Once connected via Vinkius, your agent gains access to this full suite of capabilities, letting you manage everything from basic text generation to complex model training cycles.

Built · Hosted · Managed by Vinkius Together AI - LLM Inference & Fine-Tuning MCP

Server ID 019d7613-8fef-713a-ac52-03cbd6e1202c

Vinkius Inspector

Compliance Grade A+

Score 100/100

Report View Report ↗

Benefits of connecting Together AI MCP

Stop switching between dashboards. You can generate embeddings or run chat completions using the generate_embeddings tool, all from your agent's prompt.

Manage entire model lifecycles—from initial testing to production fine-tuning. Use create_finetune_job and then check status with list_finetune_jobs without leaving your workflow.

Need a visual asset? Simply call generate_image by providing a detailed prompt; you get an image file back, not just text.

Explore the best model for any task. Use list_available_models to see hundreds of open-source options before running a single inference.

The chat_completion tool handles complex conversational flow, making your agent feel much more natural than simple prompt/response cycles.

Together AI MCP use cases

01 01

Building a Retrieval System

An engineer needs to index thousands of internal documents. Instead of writing a dedicated script, they ask their agent to use generate_embeddings on the raw text chunks and pipe those vectors directly into their vector store.

02 02

Creating Content for Marketing

A marketing specialist needs an illustration for a blog post. They prompt their agent, asking it to use generate_image with a detailed description (e.g., 'a futuristic cityscape at sunset'), and the image appears instantly.

03 03

Updating a Core Model

A machine learning engineer wants to adapt an open-source LLM for internal jargon. They use create_finetune_job with their base model ID and dataset, then monitor progress using list_finetune_jobs.

04 04

Testing Model Alternatives

A developer wants to compare Llama 3 against Mixtral for a chat feature. They use the agent's ability to run completions (chat_completion) multiple times in one session, comparing outputs side-by-side.

Together AI MCP tradeoffs

What to watch out for, and the recommended way to handle each one.

Hardcoding API Calls

Avoid

Writing complex code blocks with explicit model endpoints and separate libraries just to run a single chat query or generate vectors.

Instead

Use the agent's built-in tools. Simply tell your agent to chat_completion after specifying the desired model ID, letting the MCP handle the boilerplate API connection.

Ignoring Model Variety

Avoid

Assuming one powerful LLM is good enough for everything—using a single endpoint for chat, image generation, and embedding creation.

Instead

Use list_available_models to select the best specialized tool. For instance, use generate_embeddings instead of asking your main chat agent to do vector math.

Manual Job Tracking

Avoid

After submitting a fine-tuning job, having to log into a separate web console every few minutes just to see if the process succeeded or failed.

Instead

Use list_finetune_jobs inside your agent. You submit the job with create_finetune_job, and then check its status within the same conversational thread.

When to use Together AI MCP

Use this MCP if your primary need is accessing a wide, current selection of open-source models for diverse tasks—chatting, embedding, image creation, or training. It’s ideal when you need to prototype quickly and test multiple model architectures in one place. Don't use it just because you want an LLM; the value here is in its breadth (the many available tools). If your goal is only simple text generation with a single provider and no other ML needs, another dedicated completion tool might suffice. But if you are building anything that requires data preparation (embeddings), visual assets (generate_image), or model customization (create_finetune_job), this MCP is necessary.

Frequently asked questions about Together AI MCP

How do I know what models are available using the Together AI MCP? +

You use the list_available_models tool. This instantly provides a list of all supported LLMs, letting you pick the best one for your chat or embedding task.

Can I fine-tune my own model with Together AI MCP? +

Yes. You start by calling create_finetune_job, providing a base model and your training data file, and then monitor the progress using list_finetune_jobs.

What is the difference between chat_completion and text_completion? +

Use chat_completion when you need multi-turn conversations that require a history of messages. Use text_completion for simple, single-shot prompts.

Does Together AI MCP handle image generation? +

Yes, it handles images using the generate_image tool. Just give it a detailed text description and receive an image asset back.

Is this only for coding tasks? +

No. While great for developers, you can use this MCP for anything that needs complex AI: data vectorization (generate_embeddings), content creation, or model training.

View all recipes →

Fine-Tune AI Models Using MCP Servers

GPT-4 costs $30 per 1M tokens for your classification task , fine-tune a $0.20/M model on Together AI that scores 96% accuracy, track every experiment in W&B, and save $29.80 per million tokens

Together Ai Weights Biases Google Sheets

View all recipes

Give Claude and any AI agent real-world access

What AI agents can do with Together AI with 7 Tools

Chat Completion

Executes a multi-turn conversation using specified Together AI models and message history.

Text Completion

Performs standard text generation by receiving only a model ID and an initial prompt.

Create Finetune Job

Initiates a new model fine-tuning job using a specified base model and training...

Generate Embeddings

Converts an array of input texts into rich vector embeddings for use in databases.

Generate Image

Creates a visual image by translating a detailed descriptive prompt into a picture...

List Finetune Jobs

Retrieves and shows the current status of all fine-tuning jobs you've created.

List Available Models

Lists every AI model currently supported on the Together AI network for your review.

Security and governance baked right in.

Claude AI

Open Claude Settings

Add Custom Connector

Start a conversation

Claude Code

Open your terminal

Add the MCP Server

Start coding

Cursor

One-Click Install (Recommended)

Open Cursor Settings

Add New Server

Use in Composer

Antigravity

Configure Agent Environment

Bind the Endpoint

Execute

VS Code Copilot

One-Click Install (Recommended)

Open MCP Settings

Add Server Config

Windsurf

One-Click Install (Recommended)

Open Windsurf Settings

Add Server Endpoint

LangChain

Install Dependencies

Connect the Server

CrewAI

Define the Tool

Execute Task

Choose How to Get Started

Build Your Own

Make Your AI Do More

The Headache of Model Tooling

Together AI: Model Operations

llm

model-inference

fine-tuning

open-source-ai

machine-learning

api-deployment

What Together AI MCP does for your AI

How to set up Together AI MCP

Who uses Together AI MCP

Benefits of connecting Together AI MCP

Together AI MCP use cases

Building a Retrieval System

Creating Content for Marketing

Updating a Core Model

Testing Model Alternatives

Together AI MCP tradeoffs

Hardcoding API Calls

Ignoring Model Variety

Manual Job Tracking

When to use Together AI MCP

Frequently asked questions about Together AI MCP

Powerful workflows you can unlock today

Fine-Tune AI Models Using MCP Servers