# Together AI MCP

> Together AI connects your AI agent to over 100 open-source models, giving you a unified platform for everything from text chat and image creation to audio transcription and model fine-tuning. It powers advanced generative AI applications without requiring you to manage any cloud infrastructure.

## Overview
- **Category:** ai-frontier
- **Price:** Free
- **Tags:** llm, generative-ai, llama-3, image-generation, fine-tuning

## Description

You can connect this MCP to your agent to access the world's fastest inference cloud for open-source models. This connector gives you a complete toolkit for generative AI, handling everything from basic text chat and creating stunning images to processing audio files or training custom model checkpoints. Need to build complex search features? You generate vector embeddings and rerank documents using specialized tools. Plus, if your application needs constant performance, you can create dedicated endpoints with predictable scaling. Whether you're building an app that talks, draws pictures, or analyzes voice recordings, this MCP keeps all the power running through a single connection point via Vinkius.



## Tools

### create_audio_speech
This tool generates speech from plain text, creating voiceovers for your content.

### create_audio_transcription
It converts an uploaded audio file into a written transcript using speech-to-text technology.

### cancel_batch
You can stop any large, running background processing job immediately.

### create_chat_completion
This tool generates model responses by simulating a full back-and-forth chat conversation.

### create_batch
It starts a new, large-scale asynchronous job that runs in the background over time.

### create_endpoint
You can set up a dedicated connection point to ensure your model performance never drops or slows down.

### create_fine_tune
This initiates the process of training an open-source model on your specific, proprietary dataset.

### delete_endpoint
It removes a dedicated connection point you previously set up for performance stability.

### delete_file
This permanently deletes an uploaded file used for training or batch processing.

### delete_fine_tune
You can cancel a fine-tuning job that you started and no longer need.

### create_embeddings
It takes any block of text and converts it into numerical vector embeddings for search indexing.

### get_batch
You can check the current status and results of a specific background job.

### get_endpoint
This retrieves all the details about a dedicated model endpoint you created.

### get_file
It fetches metadata and information about an uploaded file without needing to download it.

### get_fine_tune
You get the current status and progress report for a specific fine-tuning job.

### create_image_generation
This tool generates brand new images based on detailed text descriptions or prompts.

### list_batches
You see a list of all background jobs that have been created using the system.

### list_endpoints
It lists every dedicated model endpoint currently running or configured for your account.

### list_files
You get a list of all data files you've uploaded to the system.

### list_fine_tune_checkpoints
This lists saved versions, or checkpoints, for a fine-tuning job so you can revert if needed.

### list_fine_tunes
It gives you an overview of all the fine-tuning jobs that have been run previously.

### list_models
You can see a list of every model available for use through this MCP connection.

### create_rerank
This tool reorders documents based on how relevant they are to the user's specific query.

### create_text_completion
It generates extended text content for a simple prompt, ideal for articles or summaries.

### update_endpoint
You can change the status—like scaling up or down—of an existing dedicated model endpoint.

### upload_file
It securely uploads a file for use in fine-tuning, evaluation, or batch processing tasks.

### create_video_generation
This tool creates entire videos from text prompts or by animating an existing image.

## Prompt Examples

**Prompt:** 
```
Generate a chat completion using meta-llama/Llama-3.3-70B-Instruct-Turbo explaining quantum computing.
```

**Response:** 
```
I've initiated the request to Llama-3.3-70B. Quantum computing uses qubits to perform calculations that are impossible for classical computers by leveraging superposition and entanglement...
```

**Prompt:** 
```
Create an image of a futuristic laboratory using the black-forest-labs/FLUX.1-schnell model.
```

**Response:** 
```
Generating image... I've successfully created the image of a futuristic laboratory. You can access the result at the provided URL or as a base64 string.
```

**Prompt:** 
```
List all available models on Together AI.
```

**Response:** 
```
Fetching model list... You have access to over 100 models, including Llama-3.3-70B, Mixtral-8x7B, Flux.1, and various embedding models. Would you like to filter by type?
```

## Capabilities

### Generate Text and Chat Responses
Your agent can generate high-quality text responses for conversations using various open-source models.

### Create Visual Media
The MCP handles generating realistic images or full videos based on simple text prompts.

### Process Audio Files
You can convert spoken words into written transcripts, or turn plain text into natural-sounding speech for voiceovers.

### Build Knowledge Retrieval Systems
It generates vector embeddings from documents and reranks results so your agent finds the most relevant information quickly.

### Manage Model Training
You can run fine-tuning jobs, upload data files, and manage dedicated endpoints for reliable performance.

## Use Cases

### Building a Podcast Summary Generator
A user uploads an hour-long interview audio file. The agent first runs `create_audio_transcription` to get the text transcript, then uses `create_chat_completion` on that text to draft five key bullet points, and finally sends those points via a messaging tool.

### Creating Marketing Assets for a Product Launch
The product team inputs a core feature description. The agent uses `create_image_generation` to generate several visual concepts and then runs `create_video_generation` on the best image, all within one workflow.

### Implementing Advanced Internal Knowledge Search
Instead of just searching a database, the agent takes user questions, uses `create_embeddings` to convert them and the documents into vectors, and then runs `create_rerank` to pull back the absolute most relevant internal policy document.

### Automating Customer Service Voice Guides
The system takes a support article written by an expert. It uses `create_audio_speech` to convert that text into a professional voice guide, ready for immediate deployment.

## Benefits

- You don't worry about infrastructure. By connecting this MCP, you get immediate access to over 100 open-source models for text, image, and audio tasks.
- When performance matters, you create dedicated endpoints using `create_endpoint`, ensuring your app never slows down due to model throttling.
- Need custom intelligence? You upload data and use `create_fine_tune` to train a specialized model on your unique business vocabulary.
- Build advanced search. Instead of simple keyword matching, you generate embeddings with `create_embeddings` and then refine results using `create_rerank` for better accuracy.
- Handle large-scale workloads easily. Use the batch tools (`create_batch`, `list_batches`) to process thousands of items asynchronously without timing out your agent.

## How It Works

The bottom line is that you use your AI client to trigger advanced generative tasks without having to worry about managing underlying model servers.

1. First, subscribe to this MCP and provide your Together AI API Key.
2. Next, your agent uses the connection to call specific tools—for example, telling it to create a video or generate embeddings.
3. Finally, you get back the resulting data payload, like an image URL or a transcript file path.

## Frequently Asked Questions

**How do I use the Together AI MCP for document search?**
You run this by first calling `create_embeddings` on your documents to turn them into vectors. Then, when a user asks a question, you use `create_rerank` to find the most relevant chunks of text from those stored embeddings.

**Can I make my AI model better using this MCP?**
Yes. You manage custom training jobs by calling `upload_file` and then initiating a job with `create_fine_tune`. This allows you to teach the open-source models your company's specific jargon.

**What is the difference between `create_chat_completion` and `create_text_completion`?**
Use `create_chat_completion` when you need the model to remember context from a conversation history. Use `create_text_completion` for single, self-contained text generation tasks like writing an article summary.

**Does this MCP help with large data uploads?**
It handles massive jobs using the batch tools. You start a job via `create_batch`, and then you monitor its progress and retrieve results later using `get_batch`.

**How do I ensure my model stays fast for production?**
You use `create_endpoint`. This tool establishes a dedicated, stable connection point that isolates your usage from general traffic fluctuations, guaranteeing reliable performance.