Together AI MCP. Power Multi-Modal AI with Open Source Models

Q: How do I use the Together AI MCP for document search?

You run this by first calling createembeddings on your documents to turn them into vectors. Then, when a user asks a question, you use creatererank to find the most relevant chunks of text from those stored embeddings.

Q: Can I make my AI model better using this MCP?

Yes. You manage custom training jobs by calling uploadfile and then initiating a job with createfinetune. This allows you to teach the open-source models your company's specific jargon.

Q: What is the difference between createchatcompletion and createtextcompletion?

Use createchatcompletion when you need the model to remember context from a conversation history. Use createtextcompletion for single, self-contained text generation tasks like writing an article summary.

Q: How do I ensure my model stays fast for production?

You use createendpoint. This tool establishes a dedicated, stable connection point that isolates your usage from general traffic fluctuations, guaranteeing reliable performance.

Together AI connects your AI agent to over 100 open-source models, giving you a unified platform for everything from text chat and image creation to audio transcription and model fine-tuning. It powers advanced generative AI applications without requiring you to manage any cloud infrastructure.

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

See Vinkius in Action

Give Claude and any AI agent real-world access

Generate Text and Chat Responses

Your agent can generate high-quality text responses for conversations using various open-source models.

Create Visual Media

The MCP handles generating realistic images or full videos based on simple text prompts.

Process Audio Files

You can convert spoken words into written transcripts, or turn plain text into natural-sounding speech for voiceovers.

Build Knowledge Retrieval Systems

It generates vector embeddings from documents and reranks results so your agent finds the most relevant information quickly.

Manage Model Training

You can run fine-tuning jobs, upload data files, and manage dedicated endpoints for reliable performance.

Ask an AI about this

Waiting for input…

AI Agent

What AI agents can do with Together AI: A Powerful Toolset With 27 Tools

These tools let you manage model lifecycle, generate media, process voice and text data, and run large background jobs all through one connection.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Together AI MCP

Create Audio Speech

This tool generates speech from plain text, creating voiceovers for your content.

Create Audio Transcription

It converts an uploaded audio file into a written transcript using speech-to-text...

Cancel Batch

You can stop any large, running background processing job immediately.

Create Chat Completion

This tool generates model responses by simulating a full back-and-forth chat...

Create Batch

It starts a new, large-scale asynchronous job that runs in the background over time.

Create Endpoint

You can set up a dedicated connection point to ensure your model performance never drops or slows down.

Create Fine Tune

This initiates the process of training an open-source model on your specific, proprietary dataset.

Delete Endpoint

It removes a dedicated connection point you previously set up for performance...

Delete File

This permanently deletes an uploaded file used for training or batch processing.

Delete Fine Tune

You can cancel a fine-tuning job that you started and no longer need.

Create Embeddings

It takes any block of text and converts it into numerical vector embeddings for...

Get Batch

You can check the current status and results of a specific background job.

Get Endpoint

This retrieves all the details about a dedicated model endpoint you created.

Get File

It fetches metadata and information about an uploaded file without needing to...

Get Fine Tune

You get the current status and progress report for a specific fine-tuning job.

Create Image Generation

This tool generates brand new images based on detailed text descriptions or prompts.

List Batches

You see a list of all background jobs that have been created using the system.

List Endpoints

It lists every dedicated model endpoint currently running or configured for your account.

List Files

You get a list of all data files you've uploaded to the system.

List Fine Tune Checkpoints

This lists saved versions, or checkpoints, for a fine-tuning job so you can revert...

List Fine Tunes

It gives you an overview of all the fine-tuning jobs that have been run previously.

List Models

You can see a list of every model available for use through this MCP connection.

Create Rerank

This tool reorders documents based on how relevant they are to the user's specific...

Create Text Completion

It generates extended text content for a simple prompt, ideal for articles or summaries.

Update Endpoint

You can change the status—like scaling up or down—of an existing dedicated model endpoint.

Upload File

It securely uploads a file for use in fine-tuning, evaluation, or batch processing...

Create Video Generation

This tool creates entire videos from text prompts or by animating an existing image.

Security and governance baked right in.

Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, delivering out-of-the-box support for HTTPS Streamable, SSE, and OAuth2—zero messy routing required.

Together AI MCP is compatible with Claude

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The Together AI integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "together-ai-alternative": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Together AI tools with full Vinkius guardrails applied.

Together AI MCP is compatible with VS Code

VS Code Copilot

⚡

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"together-ai-alternative": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

Use the SSEClient in LangChain to connect to the Vinkius managed endpoint:

from langchain_mcp_adapters.client import SSEClient

# Connect to Vinkius
client = SSEClient(url="https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp")
tools = client.get_tools()

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

from crewai import Agent
from mcp_crewai import MCPTool

# Connect securely to Vinkius
vinkius_tools = MCPTool(url="https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp")

# Assign to Agent
researcher = Agent(
    role='Data Researcher',
    tools=vinkius_tools.get_all()
)

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on each call
Real time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Together AI, then connect any of our 5,200+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,200+ others, all in one place
Add new capabilities to your AI anytime you want
Connections are secured and governed automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog weekly

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Together AI. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS CLOUD

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on each call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction

Your data is protected. See how we built it.

The headache of piecing together AI features manually.

Today, building a single feature that needs to do three things—like reading an audio file, summarizing it, and then generating promotional art—is a nightmare. You're jumping between the transcription tool, the chat API, and the image generation platform. You copy text from one dashboard into another service, manage keys for multiple providers, and spend hours just stitching the workflow together.

With this MCP, your agent handles the whole sequence inside one connection point. It takes the audio input, runs `create_audio_transcription`, passes that output to generate a summary via chat completion, and finally feeds keywords into `create_image_generation`. You get a fully functional feature without ever leaving your client.

Generating Media with Dedicated Model Operations

The biggest manual step that disappears is the juggling act between different model APIs. You used to have separate documentation and setup steps just for generating an image versus generating a video, forcing you into complex multi-step code blocks.

Now, if your workflow needs visual content, whether it's basic text prompts or full motion video, you call `create_image_generation` or `create_video_generation`. The whole process is contained and controllable from one place.

Support 24/7 support@vinkius.com ↗

Security Vinkius Trust Center ↗

SLA Service Level Agreement ↗

Report Listing Send Report ↗

llm

generative-ai

llama-3

image-generation

fine-tuning

What Together AI MCP does for your AI

You can connect this MCP to your agent to access the world's fastest inference cloud for open-source models. This connector gives you a complete toolkit for generative AI, handling everything from basic text chat and creating stunning images to processing audio files or training custom model checkpoints. Need to build complex search features? You generate vector embeddings and rerank documents using specialized tools.

Plus, if your application needs constant performance, you can create dedicated endpoints with predictable scaling. Whether you're building an app that talks, draws pictures, or analyzes voice recordings, this MCP keeps all the power running through a single connection point via Vinkius.

Built · Hosted · Managed by Vinkius Together AI MCP - Open Source Model Access

Server ID 019e38fc-f902-730c-94c9-64868c3fd057

Vinkius Inspector

Compliance Grade A+

Score 100/100

Report View Report ↗

Benefits of connecting Together AI MCP

You don't worry about infrastructure. By connecting this MCP, you get immediate access to over 100 open-source models for text, image, and audio tasks.

When performance matters, you create dedicated endpoints using create_endpoint, ensuring your app never slows down due to model throttling.

Need custom intelligence? You upload data and use create_fine_tune to train a specialized model on your unique business vocabulary.

Build advanced search. Instead of simple keyword matching, you generate embeddings with create_embeddings and then refine results using create_rerank for better accuracy.

Handle large-scale workloads easily. Use the batch tools (create_batch, list_batches) to process thousands of items asynchronously without timing out your agent.

Together AI MCP use cases

01 01

Building a Podcast Summary Generator

A user uploads an hour-long interview audio file. The agent first runs create_audio_transcription to get the text transcript, then uses create_chat_completion on that text to draft five key bullet points, and finally sends those points via a messaging tool.

02 02

Creating Marketing Assets for a Product Launch

The product team inputs a core feature description. The agent uses create_image_generation to generate several visual concepts and then runs create_video_generation on the best image, all within one workflow.

03 03

Implementing Advanced Internal Knowledge Search

Instead of just searching a database, the agent takes user questions, uses create_embeddings to convert them and the documents into vectors, and then runs create_rerank to pull back the absolute most relevant internal policy document.

04 04

Automating Customer Service Voice Guides

The system takes a support article written by an expert. It uses create_audio_speech to convert that text into a professional voice guide, ready for immediate deployment.

Together AI MCP tradeoffs

What to watch out for, and the recommended way to handle each one.

Using generic LLM APIs

Avoid

The developer just calls the basic chat completion tool and assumes it has enough context or specialized knowledge for complex tasks like audio analysis.

Instead

For specific, multi-modal needs, don't rely on a single endpoint. You must use create_audio_transcription first to extract text, then feed that structured data into create_chat_completion.

Running tasks synchronously

Avoid

The developer tries to process 500 documents in a single API call because it's faster to code.

Instead

For anything over a few dozen items, you have to use the batch system. Start by calling create_batch and then monitor progress using get_batch.

Ignoring performance needs

Avoid

The application fails or slows down during peak usage hours because it's relying on shared, general-purpose resources.

Instead

Set up stability first. Use create_endpoint to secure a dedicated resource for your critical model calls, guaranteeing predictable speed.

When to use Together AI MCP

Use this MCP if your application needs to handle multiple types of AI output—for instance, generating an image and writing the accompanying alt text, or transcribing audio and summarizing it. If you're building a system that requires specialized data handling like embedding generation or fine-tuning on private documents, you need its advanced model operations. Don't use this if your only goal is simple API calls to a single general chat model; in those cases, a simpler text completion tool might suffice. But if the complexity involves media (audio/video), structured knowledge retrieval (create_embeddings), or reliable scaling (create_endpoint), then you need the power of this entire catalog.

Frequently asked questions about Together AI MCP

How do I use the Together AI MCP for document search? +

You run this by first calling create_embeddings on your documents to turn them into vectors. Then, when a user asks a question, you use create_rerank to find the most relevant chunks of text from those stored embeddings.

Can I make my AI model better using this MCP? +

Yes. You manage custom training jobs by calling upload_file and then initiating a job with create_fine_tune. This allows you to teach the open-source models your company's specific jargon.

What is the difference between `create_chat_completion` and `create_text_completion`? +

Use create_chat_completion when you need the model to remember context from a conversation history. Use create_text_completion for single, self-contained text generation tasks like writing an article summary.

Does this MCP help with large data uploads? +

It handles massive jobs using the batch tools. You start a job via create_batch, and then you monitor its progress and retrieve results later using get_batch.

How do I ensure my model stays fast for production? +

You use create_endpoint. This tool establishes a dedicated, stable connection point that isolates your usage from general traffic fluctuations, guaranteeing reliable performance.

Give Claude and any AI agent real-world access

What AI agents can do with Together AI: A Powerful Toolset With 27 Tools

Create Audio Speech

This tool generates speech from plain text, creating voiceovers for your content.

Create Audio Transcription

It converts an uploaded audio file into a written transcript using speech-to-text...

Cancel Batch

You can stop any large, running background processing job immediately.

Create Chat Completion

This tool generates model responses by simulating a full back-and-forth chat...

Create Batch

It starts a new, large-scale asynchronous job that runs in the background over time.

Create Endpoint

You can set up a dedicated connection point to ensure your model performance never drops or slows down.

Create Fine Tune

This initiates the process of training an open-source model on your specific, proprietary dataset.

Delete Endpoint

It removes a dedicated connection point you previously set up for performance...

Delete File

This permanently deletes an uploaded file used for training or batch processing.

Delete Fine Tune

You can cancel a fine-tuning job that you started and no longer need.

Create Embeddings

It takes any block of text and converts it into numerical vector embeddings for...

Get Batch

You can check the current status and results of a specific background job.

Get Endpoint

This retrieves all the details about a dedicated model endpoint you created.

Get File

It fetches metadata and information about an uploaded file without needing to...

Get Fine Tune

You get the current status and progress report for a specific fine-tuning job.

Create Image Generation

This tool generates brand new images based on detailed text descriptions or prompts.

List Batches

You see a list of all background jobs that have been created using the system.

List Endpoints

It lists every dedicated model endpoint currently running or configured for your account.

List Files

You get a list of all data files you've uploaded to the system.

List Fine Tune Checkpoints

This lists saved versions, or checkpoints, for a fine-tuning job so you can revert...

List Fine Tunes

It gives you an overview of all the fine-tuning jobs that have been run previously.

List Models

You can see a list of every model available for use through this MCP connection.

Create Rerank

This tool reorders documents based on how relevant they are to the user's specific...

Create Text Completion

It generates extended text content for a simple prompt, ideal for articles or summaries.

Update Endpoint

You can change the status—like scaling up or down—of an existing dedicated model endpoint.

Upload File

It securely uploads a file for use in fine-tuning, evaluation, or batch processing...

Create Video Generation

This tool creates entire videos from text prompts or by animating an existing image.

Security and governance baked right in.

Claude AI

Open Claude Settings

Add Custom Connector

Start a conversation

Claude Code

Open your terminal

Add the MCP Server

Start coding

Cursor

One-Click Install (Recommended)

Open Cursor Settings

Add New Server

Use in Composer

Antigravity

Configure Agent Environment

Bind the Endpoint

Execute

VS Code Copilot

One-Click Install (Recommended)

Open MCP Settings

Add Server Config

Windsurf

One-Click Install (Recommended)