Vinkius
Together AI

Together AI MCP. Run open-source LLMs and ML pipelines directly in your agent.

Claude Claude
ChatGPT ChatGPT
Cursor Cursor
Gemini Gemini
Windsurf Windsurf
VS Code VS Code
JetBrains JetBrains
Vercel Vercel
See Vinkius in Action

Works with every AI agent you already use

…and any MCP-compatible client

Together AI MCP on Cursor AI Code Editor MCP Client Together AI MCP on Claude Desktop App MCP Integration Together AI MCP on OpenAI Agents SDK MCP Compatible Together AI MCP on Visual Studio Code MCP Extension Client Together AI MCP on GitHub Copilot AI Agent MCP Integration Together AI MCP on Google Gemini AI MCP Integration Together AI MCP on Lovable AI Development MCP Client Together AI MCP on Mistral AI Agents MCP Compatible Together AI MCP on Amazon AWS Bedrock MCP Support

Just plug in your AI agents and start using Vinkius.

Together AI connects your local agent to dozens of open-source models and ML services. You can instantly generate chat completions, create vector embeddings for RAG pipelines, or fine-tune custom LLMs—all through one API endpoint.

It lets you query Llama, Mixtral, and more from a single place without leaving your IDE.

What your AI agents can do

Chat completion

Runs a multi-turn conversation using an open-source model, accepting a model ID and message history array.

Create finetune job

Starts the training process for a custom LLM by specifying a base model and the dataset to train on.

Generate embeddings

Converts a list of input strings into numerical vector embeddings using a specified embedding model ID.

+ 4 more capabilities included
List available models

Checks the Together AI network to find all currently supported open-source LLMs and diffusion models.

Run chat completions

Executes multi-turn conversational cycles using advanced, specified open-source models (e.g., Llama 3).

Generate text embeddings

Converts input texts into numerical vectors that capture semantic meaning for database indexing.

Create images from prompts

Uses external diffusion models to generate visual media based on a detailed text description.

Start fine-tuning jobs

Initiates a custom training run by pointing the system to a base model and your specific dataset file.

Check job statuses

Retrieves the current status of any existing or previously submitted model fine-tuning jobs.

Supported MCP Clients

OAuth 2.0 Compatible
Vinkius runs on Claude Claude
Vinkius runs on ChatGPT ChatGPT
Vinkius runs on Cursor Cursor
Vinkius runs on Gemini Gemini
Vinkius runs on VS Code VS Code
Vinkius runs on JetBrains JetBrains
Vinkius runs on Vercel Vercel
Vinkius runs on Zendesk Zendesk
+ other MCP clients
Included with Plan

Waiting for input…

AI Agent

Together AI MCP Server: 7 Tools for Model Operations

Master model execution, embedding generation, and custom training by accessing seven specialized tools within your agent.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Together AI on Vinkius
chat019d7613

chat completion

Runs a multi-turn conversation using an open-source model, accepting a model ID and message history array.

create019d7613

create finetune job

Starts the training process for a custom LLM by specifying a base model and the dataset to train on.

generate019d7613

generate embeddings

Converts a list of input strings into numerical vector embeddings using a specified embedding model ID.

generate019d7613

generate image

Creates an image file by sending a detailed descriptive text prompt to the external diffusion model.

list019d7613

list available models

Returns a list of all LLMs and open-source models currently supported on the Together AI platform.

list019d7613

list finetune jobs

Retrieves a list of all fine-tuning jobs, allowing you to check their current status.

text019d7613

text completion

Executes a single text generation request using an open-source model based on a provided prompt and model ID.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

  • Import from OpenAPI, Swagger, or YAML specs
  • Create Agent Skills with progressive disclosure
  • Deploy to edge with MCPFusion framework
  • Built in DLP, auth, and compliance on every call
  • Real time usage dashboard and cost metering
  • Publish to catalog or keep private
Start building

Make Your AI Do More

Start with Together AI, then connect any of our 4,800+ other servers whenever your AI needs more. One click, no limits.

  • Use this MCP plus 4,800+ others, all in one place
  • Add new capabilities to your AI anytime you want
  • Every connection is secured and compliant automatically
  • Track usage and costs across all your servers
  • Works with Claude, ChatGPT, Cursor, and more
  • New servers added to the catalog every week
Together AI MCP server cover

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Together AI. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on every call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This server provides 7 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.

Manually setting up complex AI pipelines takes too many steps.

Right now, if you want to build an advanced system—say, something that needs to read a document and then chat about it—you're dealing with chaos. You have to set up a data pipeline in one service, get the API key for embeddings from another, and then call the LLM model using yet a third provider's credentials. It’s copy-pasting keys everywhere just to make two things talk.

With this MCP server, you keep it local. Your agent handles the whole sequence. You send the text in, the tool generates embeddings with `generate_embeddings`, and then your chat completion runs using those vectors—all within one conversation flow. It's clean.

Together AI lets you run specialized model jobs instantly.

Before this, if you wanted to train a custom LLM on your company's data, the process was huge. You had to provision compute clusters, upload massive datasets manually, and wait hours for status updates in a separate web panel. It was slow and siloed.

Now, you just point to the base model ID and the dataset file using `create_finetune_job`. The job starts, and you track it right there with `list_finetune_jobs`. It's that simple.

What you can do with this MCP connector

Look, you've got an agent running locally, and it needs muscle that doesn't cost a fortune or tie you down to some closed system. This MCP server connects your setup directly to dozens of open-source models and ML services from the Together AI network. It gives you high-speed inference for big language models like Llama 3 and Mixtral.

You can run everything—from simple prompts to full custom model training runs—all through one API endpoint, right inside your IDE.

When you need to figure out what's available, start with the list_available_models tool. It checks the entire Together AI network and spits back a comprehensive list of every open-source LLM and diffusion model they support. This lets you know exactly which engine—whether it's for natural language processing or image generation—you need to tackle your current task.

For basic text tasks, you've got two ways to go. If you just need a quick answer based on a single prompt, use text_completion. You just send over the specific model ID and the prompt, and it spits out the requested text. But if you’re building a chat interface or running a complex dialogue that requires remembering context, you'll want to run a multi-turn conversation using chat_completion.

This tool handles the entire message history—you pass in the model ID along with an array of previous messages—so your agent doesn't forget what was said two turns ago.

If your goal is building a Retrieval Augmented Generation (RAG) pipeline, you gotta deal with embeddings. Use generate_embeddings to convert any list of raw input strings into numerical vector embeddings. You just specify the embedding model ID, and it handles turning that plain text into vectors ready for database indexing.

This is how you make your documents searchable.

Need some visual flair? If you're working on anything graphical, generate_image uses external diffusion models to create image files. All you gotta do is send over a detailed descriptive text prompt—the more specific you are about what you want the picture to look like, the better it turns out.

For custom AI development, you have two tools managing the entire lifecycle of fine-tuning. First, when your open-source model isn't quite hitting the mark for your niche use case, you kick off a new training run using create_finetune_job. This tool takes two key inputs: the base model ID and the specific dataset you want it to train on.

That starts the whole process.

Once that job is running in the background—and it will take time—you need to know if it's stuck or done. Use list_finetune_jobs to retrieve a list of all your submitted fine-tuning jobs. This lets you check the current status of every single job, giving you visibility into whether they're queued, running, or finished.

It covers everything from checking existing runs to listing them for an audit.

Built · Hosted · Managed by Vinkius Together AI MCP Server - Run Open-Source Model Inference Server ID 019d7613-8fef-713a-ac52-03cbd6e1202c
Vinkius Inspector
Compliance Grade A+
Score 100/100
Vinkius Inspector Badge — Score 100/100

Common Questions About Together AI MCP

How do I check which open-source LLMs are available? +

You run the list_available_models tool. This gives you a list of every model ID and its capabilities right now, letting you pick the best engine for your job.

Is `chat_completion` better than `text_completion`? +

chat_completion is almost always what you want. It's built to handle message history (the whole conversation), while text_completion is only for single, stateless prompts.

What models can I use for image generation? +

The server uses external diffusion models for this. You just need a detailed text description in the prompt provided to the generate_image tool; you don't specify the model ID.

How do I start training my own LLM? +

Use the create_finetune_job tool. You must provide a base model ID and point to your specific dataset file for it to begin.

If I have a massive dataset, how do I efficiently run `generate_embeddings`? +

You process them in batches. While the tool handles large arrays of strings, we recommend grouping texts into manageable chunks (e.g., 100-500 items) to prevent timeouts and optimize throughput. This method helps you monitor progress and ensures reliable data transfer for your vector database.

How do I check the status of a fine-tuning job after running `create_finetune_job`? +

You use the list_finetune_jobs tool to query all jobs. This returns a list that includes both active and completed runs, showing you the current state (e.g., 'PENDING', 'RUNNING', or 'FAILED') for easy monitoring.

Can `chat_completion` force the output into JSON format? +

Yes, you can guide the model to output structured data. When providing the prompt and message history, include specific instructions requesting a JSON schema. This ensures your AI client receives predictable, machine-readable results for reliable parsing.

What parameters should I control when using `generate_image`? +

Beyond the descriptive prompt, you can often specify dimensions or aspect ratios in the tool call. Checking the model's documentation will show supported size constraints (e.g., 1:1 square, 16:9 landscape) to get exactly the format your application requires.

Where do I obtain my Together AI API Key? +

Log in to the developer portal via api.together.xyz/settings/api-keys. If you do not have an existing key, click Create API Key. This token enables the execution of remote inferences spanning their hosted clusters securely.

Do I have to pay to use Together models through the agent? +

Yes. This connector simply routes your instructions to Together AI. Any tokens consumed during chat completion, embeddings, images generation, or fine-tuning workloads are billed directly to your registered Together AI account balance according to their official compute pricing models.

Can I access free models on Together AI? +

Yes! Together AI frequently offers free tiers for certain open-source models intended for experimentation and research. You can query these directly from your agent without depleting your account balance, though specific free-tier rate limits will apply.

Built & Managed by Vinkius 30s setup 7 tools

We've already built the connector for Together AI. Just plug in your AI agents and start using Vinkius.

No hosting. No infrastructure. No complex setup.
All 7 tools are live and waiting. You're up and running in seconds.

Vinkius runs on Claude Claude
Vinkius runs on ChatGPT ChatGPT
Vinkius runs on Cursor Cursor
Vinkius runs on Gemini Gemini
Vinkius runs on Windsurf Windsurf
Vinkius runs on VS Code VS Code
Vinkius runs on JetBrains JetBrains
Vinkius runs on Vercel Vercel
+ other MCP clients

Vinkius gives your AI agents access to the full catalog of app connectors, all fully managed, secure, and enterprise-ready. One subscription, every tool you need.

Zero hosting required Full MCP catalog included Enterprise-grade security Auto-updated by Vinkius

Built, hosted, and secured by Vinkius. You just connect and go.