# Gradient AI MCP

> Gradient AI MCP lets you build production-grade LLM applications. It gives your agent access to foundational models, specialized NLP tools like sentiment analysis and entity extraction, and powerful methods for fine-tuning on your private datasets. You can generate high-dimensional embeddings, manage model versions, and establish Retrieval Augmented Generation (RAG) pipelines directly through your AI client.

## Overview
- **Category:** developer-tools
- **Price:** Free
- **Tags:** llm, fine-tuning, embeddings, nlp, ai-infrastructure

## Description

Think of this MCP as an entire MLOps stack that talks to your agent. Instead of just asking a large language model a question, you run a whole workflow. You feed it raw documents or audio, and the system handles the prep work: transcribing files, extracting structured data points, and indexing content for search. If you're building anything that needs to be accurate, grounded in specific corporate knowledge, or highly specialized (like interpreting niche medical texts), this is your kit. It lets you manage model versions and train models on proprietary datasets, so the AI doesn't just guess; it follows your business rules. Connected through Vinkius, all of these operations are available to any MCP-compatible client, so you can build complex logic without writing boilerplate API calls.

## Tools

### analyze_sentiment
Determines the emotional tone (positive, negative, neutral) of a given document.

### answer_question
Retrieves and formats an answer to a specific question using content from a source document.

### complete_model
Generates natural language text from a provided prompt using the model you specify.

### generate_embeddings
Converts text inputs into numerical vectors used for advanced search and measuring content similarity.

### upload_file
Uploads source files, like PDFs or images, to be used by other analysis tools.

### create_model
Initializes a new custom fine-tuned model instance.

### create_rag_collection
Sets up a dedicated collection specifically for Retrieval Augmented Generation (RAG) operations.

### create_transcription
Starts the process of converting audio files into editable text transcriptions.

### delete_model
Removes a previously created fine-tuned model from your workspace.

### extract_entity
Pulls specific, structured data points (like names or dates) out of a document based on a defined schema.

### extract_pdf
Reads and pulls both text and key data from PDF files for further use.

### fine_tune_model
Trains an existing model using a set of provided samples to improve its performance on niche tasks.

### get_model
Retrieves detailed information about a specific, existing model instance.

### get_transcription
Fetches the finalized text result from an audio transcription job that was previously started.

### list_embeddings
Shows which models are available for generating vector embeddings.

### list_models
Displays a list of all foundational and custom fine-tuned models in your account.

### list_rag_collections
Lists all the dedicated RAG collections you have set up within the workspace.

### personalize_document
Modifies a document's tone and content to target a specific audience or persona.

### summarize_document
Creates a concise summary of long-form text documents while retaining key information.

## Prompt Examples

**Prompt:** 
```
List all the models available in my Gradient workspace.
```

**Response:** 
```
I've retrieved the models from your workspace. You have access to foundational models like 'llama3-8b' and your custom fine-tuned models such as 'customer-support-v1'.
```

**Prompt:** 
```
Analyze the sentiment of this text: 'The new API performance is incredible!'
```

**Response:** 
```
The sentiment analysis for that text is 'Positive' with a high confidence score. The language used indicates strong satisfaction.
```

**Prompt:** 
```
Generate a completion for 'Explain quantum computing' using model id 'base-llama3'.
```

**Response:** 
```
Using the 'base-llama3' model: 'Quantum computing is a type of computing that uses quantum-mechanical phenomena, such as superposition and entanglement...' [Full response follows]
```

## Capabilities

### Analyze Document Content
Extracts key information from PDFs and documents, runs sentiment checks, or answers specific questions based on the provided text.

### Build Custom Models
Trains foundational LLMs using your company's unique data so the model speaks in your brand's voice or follows internal protocols.

### Index Knowledge Bases (RAG)
Creates structured collections and embeddings from documents, allowing the agent to ground answers in a specific knowledge source rather than just general training data.

### Convert Text to Search Vectors
Generates high-dimensional vector representations (embeddings) of any text, enabling advanced search and similarity matching across huge datasets.
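
To illustrate how embeddings support similarity matching, here is a minimal sketch that ranks documents against a query by cosine similarity. The three-dimensional vectors and document names are made-up stand-ins: real vectors returned by `generate_embeddings` would have many more dimensions, and the helper names are illustrative rather than part of this MCP.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (hypothetical data).
docs = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.2],
}
query = [0.8, 0.2, 0.1]
ranking = rank_by_similarity(query, docs)
```

Because the score depends on the direction of the vectors rather than on shared keywords, a query can match a document that uses entirely different wording.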

## Use Cases

### Customer Support Chatbot Build
A developer needs a chatbot that only answers questions based on the company's internal policy manuals. They use `upload_file` to ingest all PDFs, then call `create_rag_collection`. Finally, they let their agent ask questions using the collection, ensuring accurate, source-backed responses.
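
The workflow above could be sketched roughly as follows. This is an illustration under stated assumptions, not the server's documented interface: `call_tool` is a hypothetical callable wrapping whatever MCP client you use, and the argument names (`path`, `name`, `file_ids`), the response fields (`file_id`, `collection_id`), and the shape of the `rag` value are all guesses. A fake client stands in for the real server so the sketch runs on its own.

```python
def build_collection(call_tool, pdf_paths, collection_name):
    """Upload each PDF, then group the uploads into one RAG collection."""
    file_ids = []
    for path in pdf_paths:
        uploaded = call_tool("upload_file", {"path": path})  # assumed argument name
        file_ids.append(uploaded["file_id"])                 # assumed response field
    collection = call_tool(
        "create_rag_collection",
        {"name": collection_name, "file_ids": file_ids},     # assumed argument names
    )
    return collection["collection_id"]                       # assumed response field


def ask_with_collection(call_tool, model_id, collection_id, question):
    """Ask a question grounded in the collection via the `rag` parameter."""
    return call_tool(
        "complete_model",
        {
            "model_id": model_id,
            "prompt": question,
            "rag": {"collection_id": collection_id},         # assumed shape
        },
    )


# Stand-in for a real MCP client: records calls and returns canned results.
calls = []


def fake_call_tool(name, arguments):
    calls.append(name)
    if name == "upload_file":
        return {"file_id": "file_" + str(len(calls))}
    if name == "create_rag_collection":
        return {"collection_id": "col_1"}
    return {"text": "Refunds are accepted within 30 days."}


collection_id = build_collection(fake_call_tool, ["refunds.pdf", "returns.pdf"], "policies")
answer = ask_with_collection(fake_call_tool, "llama3-8b", collection_id, "What is the refund window?")
```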

### Legal Document Review
A paralegal needs to review dozens of contracts. They run `extract_entity` on each document to pull the dates, client names, and contract values into a single structured spreadsheet for quick comparison.
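
As a rough sketch of the last step, the snippet below flattens per-contract entity results into CSV text that a spreadsheet can open. The field names and the result shape (one dictionary of entities per contract) are assumptions made for illustration; the actual output of `extract_entity` depends on the schema you define.

```python
import csv
import io

# Hypothetical column set; adjust to match your own extraction schema.
FIELDS = ["contract", "client_name", "effective_date", "contract_value"]


def entities_to_csv(results):
    """Turn {contract_name: {field: value}} into CSV text with a header row."""
    buffer = io.StringIO()
    # restval fills missing fields with ""; extrasaction drops unknown fields.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="", extrasaction="ignore")
    writer.writeheader()
    for contract, entities in results.items():
        writer.writerow({"contract": contract, **entities})
    return buffer.getvalue()


# Made-up results standing in for real `extract_entity` output.
extracted = {
    "msa-acme.pdf": {"client_name": "Acme Corp", "effective_date": "2024-01-15", "contract_value": 50000},
    "nda-globex.pdf": {"client_name": "Globex", "effective_date": "2023-11-02"},
}
csv_text = entities_to_csv(extracted)
```

Contracts with a missing field still produce a full row, with that cell left empty, so gaps are easy to spot during review.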

### Market Research Analysis
A marketing team analyzes social media comments. They run `analyze_sentiment` on thousands of posts and then use `summarize_document` to quickly group the findings by topic, identifying both positive buzz and critical pain points.
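
Once the per-post results are collected, tallying them is straightforward. The sketch below assumes each result carries a `label` and a `confidence` value; those field names and the 0.6 threshold are illustrative choices, not documented behavior of `analyze_sentiment`.

```python
from collections import Counter


def tally_sentiment(results, min_confidence=0.6):
    """Count labels, bucketing low-confidence results as 'uncertain'."""
    counts = Counter()
    for result in results:
        if result["confidence"] >= min_confidence:
            counts[result["label"]] += 1
        else:
            counts["uncertain"] += 1
    return counts


# Made-up results standing in for real tool output.
sample_results = [
    {"label": "positive", "confidence": 0.97},
    {"label": "negative", "confidence": 0.88},
    {"label": "positive", "confidence": 0.91},
    {"label": "neutral", "confidence": 0.41},
]
counts = tally_sentiment(sample_results)
```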

### Historical Data Indexing
A researcher has old archives. They first transcribe audio records using `create_transcription`, then use `extract_entity` on the resulting text to pull out names, dates, and locations for a searchable database.
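
Because transcription runs as a job that is started with `create_transcription` and read back later with `get_transcription`, a caller typically has to poll for the result. The sketch below shows one way to do that. The `job_id` argument, the `status` and `text` response fields, and the status values are assumptions, and `call_tool` is a hypothetical wrapper around your MCP client; a fake client stands in for the server here.

```python
import time


def wait_for_transcription(call_tool, job_id, poll_seconds=5.0, max_attempts=60, sleep=time.sleep):
    """Poll `get_transcription` until the job finishes, then return its text."""
    for _ in range(max_attempts):
        result = call_tool("get_transcription", {"job_id": job_id})  # assumed argument name
        status = result.get("status")                                # assumed response field
        if status == "completed":
            return result["text"]
        if status == "failed":
            raise RuntimeError(f"transcription {job_id} failed")
        sleep(poll_seconds)
    raise TimeoutError(f"transcription {job_id} not ready after {max_attempts} attempts")


# Stand-in for a real MCP client: the job is ready on the second poll.
_canned = iter([
    {"status": "processing"},
    {"status": "completed", "text": "Interview recorded in Lisbon, 1974."},
])


def fake_call_tool(name, arguments):
    return next(_canned)


# Pass a no-op sleep so the demo does not actually wait.
transcript = wait_for_transcription(fake_call_tool, "job_1", sleep=lambda seconds: None)
```

The returned text can then be passed to `extract_entity` as described above.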

## Benefits

- When you need to understand user feelings, use `analyze_sentiment` to instantly gauge the tone of feedback or communications.
- Don't settle for a bare summary: upload a PDF with `upload_file`, then run `extract_pdf` and `summarize_document` to get the text, the key data points, and a summary from the same file.
- Stop losing context. By generating embeddings with `generate_embeddings`, your agent can find relevant information across millions of documents, even if the keywords don't match.
- Build specialized bots that speak your language. Use `fine_tune_model` to train a model on your specific documentation, making it an expert in your domain.
- Manage knowledge with `create_rag_collection`. This process keeps your answers grounded in verifiable sources, minimizing hallucinations and improving trust.

## How It Works

Your agent gains a dedicated MLOps pipeline, letting you move from raw documents to deployable AI features without leaving your chat interface.

1. Subscribe to this MCP and enter your Gradient API key and workspace ID in your AI client.
2. Your agent gains access to the full suite of model management tools, so you can list foundational models or start a fine-tuning job.
3. You call the specialized tools, for example running `extract_entity` on an uploaded file, and receive structured data ready for application integration.
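
Your AI client normally issues tool calls for you, but it can help to see what one looks like. In the MCP protocol, a tool is invoked with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for the `extract_entity` step. The envelope follows the protocol, while the argument names (`file_id`, `schema`) and the schema shape are assumptions; check the tool's published input schema for the real ones.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def build_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = build_tool_call(
    "extract_entity",
    {
        "file_id": "file_123",                                      # hypothetical value and name
        "schema": {"client_name": "string", "signed_on": "date"},   # hypothetical schema shape
    },
)
payload = json.dumps(request)  # the text that would go over the transport
```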

## Frequently Asked Questions

**How do I use the `analyze_sentiment` tool?**
You run it by providing the text or document you want checked. The tool returns a specific sentiment classification (positive, negative, neutral) and a confidence score for that rating.

**What is the difference between `summarize_document` and `answer_question`?**
`summarize_document` creates an overview of everything in a file. `answer_question` narrows the focus, giving you a direct answer to one specific query based on that same source document.

**How do I begin building with RAG using `create_rag_collection`?**
Start by uploading your source documents with `upload_file`. Then call `create_rag_collection`, which indexes those files, making them available for retrieval-augmented questioning.

**Can I use `extract_entity` on PDFs?**
Yes. You first need to run `extract_pdf` on the file to get the raw text and data out of the document format, which then feeds into `extract_entity` for structured parsing.

**Do I need to use `list_models` before running `fine_tune_model`?**
It's good practice. Use `list_models` first to confirm the foundational model ID you want to base your training on, ensuring you select the correct starting point.

**When I use `upload_file`, what file formats does it support for processing?**
It handles a wide variety of files, including PDFs, images, and raw documents. After the upload completes, you must pass the resulting unique file ID to another tool like `extract_entity` or `answer_question` so it knows which data source to reference.

**How does using `create_model` affect my API usage quotas?**
Creating a model reserves the infrastructure and associated weights for your custom instance. The act of creation itself doesn't consume run-time quota, but subsequent calls to that model will count toward your usage limits.

**If I no longer need an instance, how does the `delete_model` tool work?**
The `delete_model` tool permanently removes the fine-tuned model and its weights from your workspace. Use it only when you are sure the model is obsolete, because the deletion is irreversible.

**How can I start training a custom model with my own data?**
You can use the `fine_tune_model` tool. Simply provide the model ID and an array of training samples. The agent will handle the submission to Gradient's training infrastructure.
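
As a hedged illustration of what "a model ID and an array of training samples" might look like, the sketch below turns question and answer pairs into tool arguments. The `model_id` and `samples` argument names, the `inputs` key, and the prompt template are all assumptions; consult the tool's input schema for the format it actually expects.

```python
def build_fine_tune_arguments(model_id, qa_pairs):
    """Turn (question, answer) pairs into arguments for `fine_tune_model`."""
    samples = []
    for question, answer in qa_pairs:
        question, answer = question.strip(), answer.strip()
        if not question or not answer:
            continue  # skip incomplete pairs instead of training on them
        # Assumed sample format: one text field holding prompt and response.
        samples.append({"inputs": f"### Question: {question}\n\n### Answer: {answer}"})
    if not samples:
        raise ValueError("no usable training samples")
    return {"model_id": model_id, "samples": samples}


# Made-up training data and model ID for illustration.
pairs = [
    ("What is the refund window?", "30 days from delivery."),
    ("Do you ship overseas?", "Yes, to most countries."),
    ("Unanswered question", "   "),
]
arguments = build_fine_tune_arguments("customer-support-v1", pairs)
```

Filtering out empty pairs before submission avoids spending a training run on samples that teach the model nothing.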

**Can I use RAG (Retrieval Augmented Generation) with this server?**
Yes! The `complete_model` tool includes an optional `rag` parameter, allowing you to provide context or collection IDs to ground the model's responses in specific data.

**How do I generate vector embeddings for my documents?**
Use the `generate_embeddings` tool by specifying a model slug (like 'bge-large') and a list of text inputs. It will return the high-dimensional vectors for your text.
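
For large document sets, you may want to send the inputs in batches rather than in one call. The sketch below splits a list of texts into batches of arguments for `generate_embeddings`. The `model` and `inputs` argument names and the batch size are assumptions; the server may impose its own limits.

```python
def batched(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embedding_requests(texts, model_slug="bge-large", batch_size=2):
    """Build one argument dictionary per batch of texts."""
    return [
        {"model": model_slug, "inputs": batch}  # assumed argument names
        for batch in batched(texts, batch_size)
    ]


texts = ["Refund policy", "Shipping times", "Warranty terms"]
requests = embedding_requests(texts)
```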