MCP Server

Gradient AI MCP for AI. Train custom models and process complex documents.

Works with every AI agent you already use

Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client

How this MCP server connects to your AI agent

Gradient AI MCP lets you build production-grade LLM applications. It gives your agent access to foundational models, specialized NLP tools like sentiment analysis and entity extraction, and powerful methods for fine-tuning on your private datasets.

You can generate high-dimensional embeddings, manage model versions, and establish Retrieval Augmented Generation (RAG) pipelines directly through your AI client.

What AI agents can do with Gradient AI (LLM API & Finetuning) Automation

Analyze sentiment

Determines the emotional tone (positive, negative, neutral) of a given document.

Answer question

Retrieves and formats an answer to a specific question using content from a source document.

Complete model

Generates natural language text based on a provided prompt, simulating model completion.

+ 16 more capabilities included

Analyze Document Content

Extracts key information from PDFs and documents, runs sentiment checks, or answers specific questions based on the provided text.

Build Custom Models

Trains foundational LLMs using your company's unique data so the model speaks in your brand's voice or follows internal protocols.

Index Knowledge Bases (RAG)

Creates structured collections and embeddings from documents, allowing the agent to ground answers in a specific knowledge source rather than just general training data.

Convert Text to Search Vectors

Generates high-dimensional vector representations (embeddings) of any text, enabling advanced search and similarity matching across huge datasets.

What AI agents can do with Gradient AI (LLM API & Finetuning) - 19 Tools

This set of specialized tools lets you handle the entire data lifecycle: from ingesting raw files to generating highly accurate, structured model outputs.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Gradient AI (LLM API & Finetuning) on Vinkius

Analyze Sentiment

Determines the emotional tone (positive, negative, neutral) of a given document.

Answer Question

Retrieves and formats an answer to a specific question using content from a source document.

Complete Model

Generates natural language text based on a provided prompt, simulating model completion.

Generate Embeddings

Converts text inputs into numerical vectors used for advanced search and measuring...

Upload File

Uploads source files, like PDFs or images, to be used by other analysis tools.

Create Model

Initializes and manages a new, custom fine-tuned AI model instance.

Create RAG Collection

Sets up a dedicated collection specifically for Retrieval Augmented Generation (RAG) operations.

Create Transcription

Starts the process of converting audio files into editable text transcriptions.

Delete Model

Removes a previously created fine-tuned model from your workspace.

Extract Entity

Pulls specific, structured data points (like names or dates) out of a document based...

Extract PDF

Reads and pulls both text and key data from PDF files for further use.

Fine Tune Model

Trains an existing model using a set of provided samples to improve its performance on niche tasks.

Get Model

Retrieves detailed information about a specific, existing model instance.

Get Transcription

Fetches the finalized text result from an audio transcription job that was...

List Embeddings

Shows which models are available for generating vector embeddings.

List Models

Displays a list of all foundational and custom fine-tuned models in your account.

List RAG Collections

Lists all the dedicated RAG collections you have set up within the workspace.

Personalize Document

Modifies a document's tone and content to target a specific audience or persona.

Summarize Document

Creates a concise summary of long-form text documents while retaining key information.

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so there is no routing for you to configure.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The Gradient AI integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "gradient-ai-llm-api-finetuning": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.
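If you'd rather not paste the token into a file by hand, one option is to generate the fragment from an environment variable. This is a minimal sketch, not an official Vinkius tool: the variable name VINKIUS_TOKEN and the helper build_mcp_config are hypothetical, and the keys simply mirror the snippet above.

```python
import json
import os

ENDPOINT_TEMPLATE = "https://edge.vinkius.com/{token}/mcp"


def build_mcp_config(token: str, server_name: str = "gradient-ai-llm-api-finetuning") -> dict:
    """Return the mcp_servers fragment with the token substituted into the URL."""
    if not token or "[" in token:
        # Catches an empty value or the literal [YOUR_TOKEN_HERE] placeholder.
        raise ValueError("a real token is required, not the placeholder")
    url = ENDPOINT_TEMPLATE.format(token=token)
    return {"mcp_servers": {server_name: {"serverUrl": url}}}


if __name__ == "__main__":
    # VINKIUS_TOKEN is a hypothetical variable name chosen for this example.
    config = build_mcp_config(os.environ.get("VINKIUS_TOKEN", "demo-token"))
    print(json.dumps(config, indent=2))
```

Merge the printed fragment into your existing configuration file rather than overwriting it.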

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Gradient AI tools with full Vinkius guardrails applied.

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"gradient-ai-llm-api-finetuning": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

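Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius over streamable HTTP (get_tools is a coroutine)
server = {"url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp", "transport": "streamable_http"}
client = MultiServerMCPClient({"gradient-ai": server})
tools = asyncio.run(client.get_tools())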

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

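from crewai import Agent
from crewai_tools import MCPServerAdapter  # pip install "crewai-tools[mcp]"

# Connect securely to Vinkius over streamable HTTP
server_params = {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable-http",
}

# The adapter is a context manager; the tools are usable while the block is open
with MCPServerAdapter(server_params) as vinkius_tools:
    # Assign to Agent (goal and backstory are required fields)
    researcher = Agent(
        role='Data Researcher',
        goal='Analyze documents with the Gradient AI tools',
        backstory='An analyst who works from source documents.',
        tools=vinkius_tools,
    )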

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Gradient AI (LLM API & Finetuning), then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Your data is protected. See how we built it.

Built on the Model Context Protocol (MCP) for Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 19 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

Manual data preparation kills momentum.

Think about it: You get a client PDF. First, you have to download it and open it in Acrobat. Then, you copy the text into Notion, paste it into an analysis tool, and manually highlight sections you want analyzed. If it's audio, you record it, then use another service just to transcribe the speech before you can even start summarizing.

With this MCP, your agent handles all that friction. You upload the file once, and the system automatically prepares everything—it extracts the text, finds key data points with `extract_entity`, and gives you a clean summary without any copy-pasting or switching tabs. The result is immediate, structured output.

Structured knowledge retrieval via embeddings

Before this MCP, finding related information meant keyword matching: a simple search that only worked if the user remembered the exact word. If the document used synonyms or was poorly indexed, the search came up empty.

Now, by running `generate_embeddings`, your system converts text into mathematical vectors. This means it finds documents based on *meaning* and *concept*, not just keywords. It's a massive difference for search accuracy.
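To make "meaning, not keywords" concrete, here is a small sketch of how two vectors are usually compared: cosine similarity. The three-dimensional vectors are made up for illustration (real embeddings have hundreds of dimensions), and the page does not say which similarity measure Gradient or Vinkius uses internally, so treat cosine as a common default rather than a documented detail.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Toy vectors, invented for this example.
query = [0.9, 0.1, 0.0]          # "how do I get my money back?"
refund_policy = [0.8, 0.2, 0.1]  # shares no keywords, but is close in meaning
office_map = [0.0, 0.1, 0.9]     # unrelated document

# The refund policy scores higher than the office map.
print(cosine_similarity(query, refund_policy) > cosine_similarity(query, office_map))  # prints True
```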

Support: 24/7 at support@vinkius.com

Tags: llm, fine-tuning, embeddings, nlp, ai-infrastructure

What your AI can actually do with this

Think of this MCP as an entire MLOps stack that talks to your agent. Instead of just asking a large language model a question, you run a whole workflow. You feed it raw documents or audio, and the system handles all the prep work: transcribing files, extracting structured data points, and figuring out what's important enough to index for advanced search.

If you're building anything that needs to be accurate, grounded in specific corporate knowledge, or highly specialized (like interpreting niche medical texts), this is your kit. It lets you manage model versions and train models on proprietary datasets, so the AI works from your business rules instead of guessing. Connecting through Vinkius makes all of these data operations available to any MCP-compatible client, so you can build complex logic without writing boilerplate API calls.

Built, hosted, and managed by Vinkius

Gradient AI MCP - Fine-tune models & generate embeddings

Server ID 019e5d21-f4bb-72b3-a1a2-62ab4fb1276d

Vinkius Inspector: Compliance Grade A+, Score 100/100

Here's how it actually works

The bottom line is that your agent gains a dedicated MLOps pipeline, letting you move from raw documents to deployable AI features without leaving your chat interface.

Subscribe to this MCP and input your specific Gradient API Key and Workspace ID into your AI client.

Your agent gains access to the full suite of model management tools, allowing you to list foundational models or initiate a fine-tuning job.

You use the specialized functions—for example, running extract_entity on an uploaded file—and receive structured data outputs ready for application integration.
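Your AI client normally handles the wire format for you, but it can help to see what a tool invocation looks like underneath. The sketch below builds the JSON-RPC 2.0 tools/call request that the Model Context Protocol defines. The tool name comes from this page; the argument names (file_id, entity_types) are hypothetical, since the actual input schema for extract_entity is not shown here.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per request


def tool_call_request(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' request in the shape MCP specifies."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical arguments: check the tool's published schema for the real names.
request = tool_call_request(
    "extract_entity",
    {"file_id": "file_123", "entity_types": ["date", "name"]},
)
print(json.dumps(request, indent=2))
```

An MCP client can list each tool's real input schema with a tools/list request, which is the safest way to confirm parameter names before calling.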

Who is this actually for?

This stack is built for technical teams who deal with proprietary data and require highly accurate, specialized AI. It's for the Data Scientists running complex research pipelines or the ML Engineers building production-grade LLM apps.

AI Engineer

They use this MCP to quickly iterate on fine-tuning experiments and test model completions against new data sources.

Data Scientist

They generate embeddings and perform NLP analysis—like running analyze_sentiment—without needing complex local server setups or multiple APIs.

Software Developer

They integrate advanced LLM capabilities into applications, using tools like create_rag_collection to ground answers in user-uploaded documents.

What Changes When You Connect

When you need to understand user feelings, use analyze_sentiment to instantly gauge the tone of feedback or communications.

Don't just ask for a summary; upload a PDF using upload_file, then use extract_pdf and summarize_document to get both text and key data points in one go.

Stop losing context. By generating embeddings with generate_embeddings, your agent can find relevant information across millions of documents, even if the keywords don't match.

Build specialized bots that speak your language. Use fine_tune_model to train a model on your specific documentation, making it an expert in your domain.

Manage knowledge with create_rag_collection. This process keeps your answers grounded in verifiable sources, minimizing hallucinations and improving trust.

See it in action

01

Customer Support Chatbot Build

A developer needs a chatbot that only answers questions based on the company's internal policy manuals. They use upload_file to ingest all PDFs, then call create_rag_collection. Finally, they let their agent ask questions using the collection, ensuring accurate, source-backed responses.

02

Legal Document Review

A paralegal needs to review dozens of contracts. They use extract_entity repeatedly on each document to pull out all dates, client names, and contract values into a single structured spreadsheet for quick comparison.
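The "single structured spreadsheet" step could look something like the sketch below. It assumes each extract_entity result has already been reduced to a flat dictionary per document; the real response shape is not documented on this page, so the field names (client, date, value) and the helper entities_to_csv are illustrative only.

```python
import csv
import io


def entities_to_csv(results: dict[str, dict[str, str]], fields: list[str]) -> str:
    """Merge per-document entity dictionaries into one CSV, one row per document."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["document", *fields], lineterminator="\n")
    writer.writeheader()
    for document, entities in results.items():
        # Missing entities become empty cells so every row has the same columns.
        row = {field: entities.get(field, "") for field in fields}
        writer.writerow({"document": document, **row})
    return buffer.getvalue()


# Made-up extraction results for two contracts.
extracted = {
    "contract_a.pdf": {"client": "Acme", "date": "2024-01-01", "value": "10000"},
    "contract_b.pdf": {"client": "Globex", "date": "2024-02-15"},
}
print(entities_to_csv(extracted, ["client", "date", "value"]))
```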

03

Market Research Analysis

A marketing team analyzes social media comments. They run analyze_sentiment on thousands of posts and then use summarize_document to quickly group the findings by topic, identifying both positive buzz and critical pain points.
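Once each post has a label, the aggregation itself is simple. Here is a minimal sketch that turns a list of labels into proportions. It assumes you have already collected one label per post from analyze_sentiment; the three label strings match the ones this page lists, while the helper name sentiment_breakdown is made up.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def sentiment_breakdown(labels: list[str]) -> dict[str, float]:
    """Return the share of posts per sentiment label (empty dict if there are no posts)."""
    counts = Counter(labels)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        return {}
    return {label: counts[label] / total for label in LABELS}


# Labels as they might come back for four posts.
print(sentiment_breakdown(["positive", "positive", "negative", "neutral"]))
# prints {'positive': 0.5, 'negative': 0.25, 'neutral': 0.25}
```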

04

Historical Data Indexing

A researcher has old archives. They first transcribe audio records using create_transcription, then use extract_entity on the resulting text to pull out names, dates, and locations for a searchable database.

The honest tradeoffs

Asking the model directly about proprietary data

Anti-pattern

The user simply pastes a chunk of internal code into the chat and asks, 'What does this mean?' The LLM answers using general knowledge, ignoring company-specific context.

The Fix

Instead, first upload the file using upload_file, then use create_rag_collection to index it. Finally, ask your agent questions that reference the collection; the answer will be grounded in the document.

Treating embeddings like answers

Anti-pattern

The developer assumes running generate_embeddings is enough. They see a list of numbers and think they have actionable insights, but the output is meaningless to an end-user.

The Fix

Use generate_embeddings to find similar documents, then pass those retrieved documents into a tool like answer_question. The embeddings help find the data; the tools use it.
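The two-step fix can be sketched as a small pipeline: rank documents by vector similarity, then hand only the best matches to an answering step. Everything here is illustrative. The vectors are toy values, cosine similarity is an assumed ranking measure, and the answer callable is a stand-in for however your agent invokes answer_question.

```python
import math
from typing import Callable


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k documents whose vectors are closest to the query (cosine)."""
    def cos(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norms = math.hypot(*a) * math.hypot(*b)
        return dot / norms if norms else 0.0

    ranked = sorted(doc_vecs, key=lambda doc_id: cos(query_vec, doc_vecs[doc_id]), reverse=True)
    return ranked[:k]


def retrieve_then_answer(
    question: str,
    query_vec: list[float],
    doc_vecs: dict[str, list[float]],
    doc_texts: dict[str, str],
    answer: Callable[[str, str], str],
    k: int = 2,
) -> str:
    """Step 1: embeddings pick the context. Step 2: an answering tool uses that context."""
    context = "\n\n".join(doc_texts[doc_id] for doc_id in top_k(query_vec, doc_vecs, k))
    return answer(question, context)


# Stand-in for a call to answer_question; it only reports how much context it received.
def fake_answer(question: str, context: str) -> str:
    return f"Answered {question!r} using {len(context)} characters of context"
```

The embeddings never reach the end user. They only decide which text the answering step gets to see.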

Forgetting to structure input files

Anti-pattern

The user tries to run complex analysis on raw text that was never uploaded and that may mix formats (images, tables, text). The tool will fail or process only the plain strings.

The Fix

Always start by using upload_file to ingest the source material. If it's a PDF, use extract_pdf; if it's audio, use create_transcription before any other analysis.
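As a rough illustration of that ordering, the sketch below maps a filename to the sequence of tools to run before any analysis. The tool names come from this page; the set of audio extensions and the helper name ingestion_steps are assumptions, and including get_transcription reflects the tool list's description of it as the step that fetches the finished text.

```python
from pathlib import Path

# Assumed audio extensions; the page does not list supported audio formats.
AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a"}


def ingestion_steps(filename: str) -> list[str]:
    """Return the tools to run, in order, before analysis tools touch the file."""
    suffix = Path(filename).suffix.lower()
    steps = ["upload_file"]  # every source file is uploaded first
    if suffix == ".pdf":
        steps.append("extract_pdf")
    elif suffix in AUDIO_SUFFIXES:
        steps += ["create_transcription", "get_transcription"]
    return steps


print(ingestion_steps("policy.PDF"))     # prints ['upload_file', 'extract_pdf']
print(ingestion_steps("interview.wav"))  # prints ['upload_file', 'create_transcription', 'get_transcription']
```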

When It Fits, When It Doesn't

Use this MCP if your AI project requires more than just talking to an LLM and needs data plumbing. If you are building a system that must interpret PDFs, analyze sentiment at scale, or answer questions based on documents the model has never seen before, this is for you. In those cases, generate_embeddings and the RAG capabilities (create_rag_collection) are the tools you will rely on most.

Don't use this if your goal is simple text generation; running complete_model on its own might be enough. If all you need is to chat with a large language model about general topics, you don't need the complexity of fine-tuning or document extraction.

Questions you might have

How do I use the `analyze_sentiment` tool?

You run it by providing the text or document you want checked. The tool returns a specific sentiment classification (positive, negative, neutral) and a confidence score for that rating.

What is the difference between `summarize_document` and `answer_question`?

The summarize_document tool creates an overview of everything in a file. The answer_question tool narrows the focus, giving you a direct answer to one specific query based on that same source document.

How do I begin building with RAG using `create_rag_collection`?

Start by uploading all your foundational documents. Then call create_rag_collection, which indexes those files, making them available for retrieval-augmented questioning.

Can I use `extract_entity` on PDFs?

Yes. You first need to run extract_pdf on the file to get the raw text and data out of the document format, which then feeds into extract_entity for structured parsing.

Do I need to use `list_models` before running `fine_tune_model`?

It's good practice. Use list_models first to confirm the foundational model ID you want to base your training on, ensuring you select the correct starting point.

When I use `upload_file`, what file formats does it support for processing?

It handles a wide variety of files, including PDFs, images, and raw documents. After the upload completes, you must pass the resulting unique file ID to another tool like extract_entity or answer_question so it knows which data source to reference.

How does using `create_model` affect my API usage quotas?

Creating a model reserves the infrastructure and associated weights for your custom instance. The act of creation itself doesn't consume run-time quota, but subsequent calls to that model will count toward your usage limits.

If I no longer need an instance, how does the `delete_model` tool work?

The delete_model tool permanently removes the fine-tuned model and its weights from your workspace. Use this when you are sure the model is obsolete; running it is irreversible.

How can I start training a custom model with my own data?

You can use the fine_tune_model tool. Simply provide the model ID and an array of training samples. The agent will handle the submission to Gradient's training infrastructure.

Can I use RAG (Retrieval Augmented Generation) with this server?

Yes! The complete_model tool includes an optional rag parameter, allowing you to provide context or collection IDs to ground the model's responses in specific data.

How do I generate vector embeddings for my documents?

Use the generate_embeddings tool by specifying a model slug (like 'bge-large') and a list of text inputs. It will return the high-dimensional vectors for your text.
