# Cohere (Embed & Rerank) MCP

> Cohere provides advanced NLP tools for building enterprise AI systems. Generate dense vector embeddings to power semantic search, rerank documents against specific queries for better knowledge retrieval (RAG), and perform precise text classification directly from your agent.

## Overview
- **Category:** ai-frontier
- **Price:** Free
- **Tags:** embeddings, semantic-search, vector-representation, natural-language-processing, rag, text-analysis

## Description

Need an AI that actually understands context? This MCP lets you move beyond basic keyword search. It generates vector embeddings, numerical representations of a piece of writing, so your agent can work with *what* a document means, not just the words it contains. You can then pass retrieved chunks through reranking, which orders them by relevance to the query so the most useful information is presented first. This makes building reliable knowledge systems much easier. When you connect Cohere via Vinkius, your agent can also categorize inputs and run multi-turn conversations without custom backend code, so you keep control over each stage of the AI pipeline.

## Tools

### chat_completion
Execute specific conversational sequences defined by your workflow.

### classify_texts
Assign predefined labels to text inputs and evaluate their confidence scores.

### embed_texts
Generate dense vector representations for plain strings, mapping semantic meaning.

### list_models
List available Cohere models and their hashes to verify API availability based on your current plan.

### rerank_documents
Structure document chunks by prioritizing them against a specific query for better context retrieval.

### tokenize_text
Break down text into its exact structural segments, useful for auditing token counts and model limits.

## Prompt Examples

**Prompt:** 
```
Generate embeddings for these texts: ['Hello world', 'Artificial Intelligence']
```

**Response:** 
```
Embeddings generated! I've retrieved the dense vector representations for both strings. You can now use these floats for semantic search or similarity calculations.
```

**Prompt:** 
```
Rerank these documents for query 'Best pizza in NY': ['Pizza hut review', 'Joe's Pizza is the local favorite']
```

**Response:** 
```
Reranking complete! 'Joe's Pizza is the local favorite' has been moved to rank 0 with a high relevance score. 'Pizza hut review' is now at rank 1.
```

**Prompt:** 
```
How many tokens are in the text: 'The quick brown fox jumps over the lazy dog'?
```

**Response:** 
```
That sentence contains 9 tokens according to the Cohere tokenizer. I can provide the exact integer array mapping these tokens if you'd like.
```

## Capabilities

### Generate vector representations
It converts plain strings into dense vectors that capture the meaning of the text for semantic search.

### Improve document relevance
You can structure and reorder retrieved documents based on how closely they match a specific question, improving RAG accuracy.

### Categorize inputs automatically
The agent reads text and assigns it to predefined labels while giving you a confidence score for the prediction.

### Manage conversations
It handles formatted conversational turns, allowing your agent to maintain state and follow multi-step instructions.

## Use Cases

### A support bot can't tell if the user is asking about billing or technical issues.
The agent receives a vague message. Instead of failing, it calls `classify_texts` first, which immediately categorizes the input as 'Billing Inquiry'. The system then routes the chat to the correct department.
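
A minimal sketch of the routing step, assuming the classification result is available as a dictionary with `prediction` and `confidence` fields. That result shape, the `ROUTES` table, the queue names, and the 0.6 threshold are illustrative assumptions, not the documented output of `classify_texts`.

```python
from typing import Dict

# Hypothetical mapping from classifier labels to internal queues.
ROUTES: Dict[str, str] = {
    "Billing Inquiry": "billing-team",
    "Technical Issue": "tech-support",
}


def route_message(classification: dict, min_confidence: float = 0.6) -> str:
    """Pick a queue from one classification result (assumed shape)."""
    label = classification.get("prediction")
    confidence = classification.get("confidence", 0.0)
    # Fall back to a human when the label is unknown or the score is low.
    if label not in ROUTES or confidence < min_confidence:
        return "human-triage"
    return ROUTES[label]


# Illustrative result only; the confidence value is made up.
result = {
    "input": "I was charged twice this month",
    "prediction": "Billing Inquiry",
    "confidence": 0.93,
}
queue = route_message(result)
```

Keeping a fallback queue means a low-confidence prediction is handed to a person instead of being routed to the wrong department.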

### A document search engine returns 20 results, but only 3 are useful.
The agent runs all 20 documents through `rerank_documents` using the user’s query. The system then presents the top 3 ranked chunks, cutting down noise and delivering instant value.
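
The selection step could look roughly like this, assuming each rerank result carries the `index` of the original document and a `relevance_score`. The helper name `top_k_chunks` and the scores shown are hypothetical; check the actual tool output before relying on these field names.

```python
from typing import Dict, List


def top_k_chunks(documents: List[str], rerank_results: List[Dict], k: int = 3) -> List[str]:
    """Return the k documents with the highest relevance scores."""
    # Sort defensively in case the results are not already ordered.
    ranked = sorted(rerank_results, key=lambda r: r["relevance_score"], reverse=True)
    return [documents[r["index"]] for r in ranked[:k]]


documents = [
    "Pizza hut review",
    "Joe's Pizza is the local favorite",
    "Subway map of NY",
]
# Made-up scores for illustration, not real API output.
rerank_results = [
    {"index": 1, "relevance_score": 0.91},
    {"index": 0, "relevance_score": 0.42},
    {"index": 2, "relevance_score": 0.03},
]
best = top_k_chunks(documents, rerank_results, k=2)
```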

### A developer needs to ensure their prompt won't exceed token limits.
Before sending a complex request, they call `tokenize_text` on the entire input string. This confirms the exact token count, preventing unexpected API failures and saving costs.
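
As a sketch of that pre-flight check, assume `tokenize_text` has already returned a list of integer token IDs. The function name, the placeholder IDs, and the limits below are hypothetical; real limits depend on the model you call.

```python
from typing import List


def fits_token_budget(token_ids: List[int], max_input_tokens: int,
                      reserved_for_output: int = 0) -> bool:
    """True when the prompt plus the reserved output space fits the limit."""
    return len(token_ids) + reserved_for_output <= max_input_tokens


# Placeholder IDs, not real tokenizer output.
token_ids = [101, 202, 303, 404, 505]

ok = fits_token_budget(token_ids, max_input_tokens=8, reserved_for_output=2)
too_long = fits_token_budget(token_ids, max_input_tokens=8, reserved_for_output=4)
```

Reserving room for the response avoids the case where the prompt fits but the model has no space left to answer.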

### An internal tool needs to process user-uploaded documents for compliance.
The agent uses `embed_texts` to create a vector fingerprint of every document. It can then compare these fingerprints against known sensitive data vectors, flagging non-compliant files.
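
One common way to compare such fingerprints is cosine similarity. The sketch below assumes the vectors have already been fetched with `embed_texts`; the tiny three-dimensional vectors, the file names, and the 0.85 threshold are illustrative assumptions, and real embeddings are much longer.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_documents(doc_vectors: Dict[str, List[float]],
                   sensitive_vectors: List[List[float]],
                   threshold: float = 0.85) -> List[str]:
    """Names of documents that are close to any known sensitive vector."""
    flagged = []
    for name, vec in doc_vectors.items():
        if any(cosine_similarity(vec, s) >= threshold for s in sensitive_vectors):
            flagged.append(name)
    return flagged


# Toy vectors for illustration only.
doc_vectors = {"contract.txt": [0.9, 0.1, 0.0], "menu.txt": [0.0, 0.2, 0.9]}
sensitive_vectors = [[1.0, 0.0, 0.0]]
flagged = flag_documents(doc_vectors, sensitive_vectors)
```

The threshold trades false positives against missed files, so it is worth tuning on a labeled sample before using it for compliance decisions.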

## Benefits

- Build smarter search: Instead of relying on keyword matches, use `embed_texts` to find documents that are conceptually related to the query. This is a massive step up from traditional databases.
- Improve knowledge accuracy: When retrieving data for an answer, run it through `rerank_documents`. This ensures your agent reads the most relevant context first, making its responses more trustworthy.
- Automate categorization: Use `classify_texts` to automatically tag incoming user requests or documents. Your agent can route a request instantly based on whether it's billing-related, support-related, etc.
- Audit token usage: Need to know if your prompt is too long? Run text through `tokenize_text`. This gives you the exact structural breakdown of tokens before hitting API limits.
- Build conversational memory: The `chat_completion` tool lets your agent handle complex, multi-turn conversations by maintaining state and following detailed instructions.

## How It Works

You send a natural language instruction and get back structured data ready for your application logic.

1. Subscribe to the MCP and enter your Cohere API key (either a trial or production key from your account dashboard).
2. Your AI client sends the request, for example asking it to generate embeddings for several documents.
3. The service returns the requested data, whether that’s a list of model hashes, categorized text labels, or dense vector arrays.
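
Under the hood, an MCP client invokes a tool with a JSON-RPC 2.0 `tools/call` request. The sketch below shows what step 2 might look like on the wire for `embed_texts`. The argument key `texts` is an assumption; the real input schema comes from the server's tool listing.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "texts" is a hypothetical argument name used for illustration.
request = build_tool_call(
    1, "embed_texts", {"texts": ["Hello world", "Artificial Intelligence"]}
)
payload = json.dumps(request)
```

In practice your AI client builds and sends this message for you; the sketch only makes the request shape visible.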

## Frequently Asked Questions

**Can my agent improve my RAG system's accuracy using Cohere?**
Yes. The `rerank_documents` tool is specifically designed for this. Provide a query and a list of documents, and Cohere will reorder them based on semantic relevance, ensuring the most accurate context is fed to your LLM.

**How do I test text classification via the agent?**
Use the `classify_texts` tool. Provide your input strings and a few-shot JSON array of examples (text and label). The agent will return the predicted categories along with confidence scores from the Cohere engine.
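
A few-shot array might be prepared like this. The `text` and `label` keys follow the description above. The label names, the helper `underrepresented_labels`, and the minimum of two examples per label are assumptions made for this sketch.

```python
import json
from collections import Counter
from typing import Dict, List


def underrepresented_labels(examples: List[Dict[str, str]], minimum: int = 2) -> List[str]:
    """Labels that have fewer than `minimum` examples, sorted by name."""
    counts = Counter(example["label"] for example in examples)
    return sorted(label for label, n in counts.items() if n < minimum)


# Hypothetical training examples for two labels.
examples = [
    {"text": "Why was I charged twice?", "label": "billing"},
    {"text": "I need a copy of my invoice", "label": "billing"},
    {"text": "The app crashes on login", "label": "technical"},
    {"text": "I get a 500 error when uploading", "label": "technical"},
]

missing = underrepresented_labels(examples)
examples_json = json.dumps(examples)  # the JSON array to pass along with your inputs
```

Checking label coverage before the call catches lopsided example sets that tend to produce weak predictions.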

**What is the difference between Trial and Production keys?**
Trial keys are free for development but have strict rate limits (approx. 1,000 calls per month). Production keys remove these limits but require a paid plan. Both types work seamlessly with this server.

**How do I process a large batch of texts using the `embed_texts` tool?**
You pass an array of strings to the MCP. It handles efficient batching so you don't hit rate limits. You just send all your source documents in one call for dense vector generation.

**What detailed information does the `tokenize_text` tool provide besides a simple token count?**
It provides the exact segmentation of the text. You get an integer array that maps every single token, which is critical for debugging model inputs and controlling context limits.

**How can I verify which Cohere models are available using `list_models`?**
Call `list_models` from your agent. It returns the Cohere models and hashes your API key can access under your current plan.

**If my initial documents are disorganized, can I use `rerank_documents` to fix the context?**
Yes, that's its main function. You feed it a set of documents and a specific query; the MCP structures them by priority, giving you an optimized order for your RAG pipeline.

**Is my API key stored securely when I connect this MCP to my agent?**
Yes. The Vinkius platform manages the connection and handles the keys using industry-standard encryption protocols. You never need to expose your raw key within your conversation flow.