Voyage AI MCP for AI. Search by meaning, not just keywords.


Works with every AI agent you already use: Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client.

How this MCP server connects to your AI agent

Voyage AI Embeddings API handles complex data vectorization, letting your agent search by meaning, not just keywords. It generates high-fidelity embeddings for text, code, and images, while also running smart reranking jobs to ensure your retrieval results are surgically precise.

What AI agents can do with Voyage AI (AI Embeddings API) Automation


Vectorize Text

Converts large bodies of text or code into mathematical vectors for semantic search.

Handle Multimodal Content

Creates single, unified vectors from mixed input like images and surrounding text.

Process Data in Batches

Manages large-scale data ingestion by submitting and monitoring asynchronous jobs.

Improve Search Relevance

Takes initial search results and scores them, boosting the most relevant documents to the top for your agent.


What AI agents can do with Voyage AI (AI Embeddings API) - 13 Tools

These tools let you manage the entire data lifecycle: uploading files, generating various types of embeddings, running large-scale batches, and refining search results via reranking.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Voyage AI (AI Embeddings API) on Vinkius

Cancel Batch

Stops a batch inference job before it finishes running.

Create Batch

Starts a large-scale, asynchronous data processing job.

Create Contextualized Embeddings

Generates vector embeddings that retain the meaning of their surrounding document context.

Create Embeddings

Creates standard numerical vectors for pure text input.

Create Multimodal Embeddings

Generates single vectors from mixed content, like images paired with descriptions.

Delete File

Removes a file that was previously uploaded to the system.

Get Batch

Checks the current status and progress of an existing batch job.

Get File Content

Downloads the actual binary or text content of a specific file.

Get File

Retrieves general metadata about a stored file.

List Batches

Shows an overview of all previously created and running batch jobs.

List Files

Lists all files currently stored in the system's repository.

Rerank

Scores multiple documents against a given query to find the most relevant context.

Upload File

Uploads a file specifically for use in an asynchronous batch job.

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so no custom routing is required.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The Voyage AI integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "voyage-ai-ai-embeddings-api": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Voyage AI tools with full Vinkius guardrails applied.

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"voyage-ai-ai-embeddings-api": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

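Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius (the /mcp endpoint is assumed to speak Streamable HTTP)
client = MultiServerMCPClient({
    "voyage-ai": {"url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp", "transport": "streamable_http"},
})
tools = asyncio.run(client.get_tools())  # get_tools() is a coroutine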

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

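from crewai import Agent
from crewai_tools import MCPServerAdapter  # pip install 'crewai-tools[mcp]'

# Connect securely to Vinkius (Streamable HTTP transport assumed for the /mcp endpoint)
server_params = {"url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp", "transport": "streamable-http"}

with MCPServerAdapter(server_params) as vinkius_tools:
    # Assign to Agent (CrewAI requires goal and backstory; these values are placeholders)
    researcher = Agent(
        role='Data Researcher',
        goal='Retrieve and rank relevant documents',
        backstory='Researcher with access to the Voyage AI tools on Vinkius',
        tools=vinkius_tools,
    )
    # Build and run your Crew inside this block so the MCP connection stays open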

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Voyage AI (AI Embeddings API), then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Voyage AI. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Your data is protected. See how we built it.

Built on the Model Context Protocol (MCP) for Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 13 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

The current search experience feels like digging through a landfill. Solved with Vinkius AI Gateway

Today, if your agent can't find the answer immediately, it usually means the initial retrieval step was flawed. You spend time uploading documents and running basic searches only to get vague results—a mix of relevant and irrelevant noise. Then you have to manually sift through dozens of pages just to pull out one key quote or concept.

With this MCP, the process is smarter. You upload your data, and when you ask a question, the system doesn't stop at a raw database lookup: it retrieves candidate passages by meaning and then scores them with rerank so the most relevant context comes first. Your agent gets an immediate answer, not a folder full of potential answers.

Contextualized Embeddings: Giving your data deep memory

The biggest step away from old systems is how it handles document boundaries. Instead of treating every paragraph as a standalone unit, the system preserves the relationship between chunks. When you use `create_contextualized_embeddings`, that surrounding context gets baked into the vector itself.

That change means your agent's understanding is deeper. It knows *why* a piece of data is relevant, not just *that* it exists. The results are accurate and reliable.
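To make "preserving the relationship between chunks" concrete, here is a minimal sketch of preparing input for create_contextualized_embeddings. It assumes the tool accepts one ordered list of chunks per document; that input shape, and the helper names below, are illustrative assumptions rather than the documented tool schema.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A chunk is (doc_id, position_in_document, text). This tuple layout is hypothetical.
Chunk = Tuple[str, int, str]


def group_chunks_by_document(chunks: List[Chunk]) -> Dict[str, List[str]]:
    """Group chunks per document and restore their original reading order."""
    grouped: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for doc_id, position, text in chunks:
        grouped[doc_id].append((position, text))
    return {
        doc_id: [text for _, text in sorted(items, key=lambda item: item[0])]
        for doc_id, items in grouped.items()
    }


def build_contextualized_inputs(chunks: List[Chunk]) -> Tuple[List[str], List[List[str]]]:
    """Return document ids plus one ordered chunk list per document.

    Keeping each document's chunks together (instead of one flat list) is
    what lets an embedding model see the neighbours of every chunk.
    """
    grouped = group_chunks_by_document(chunks)
    doc_ids = sorted(grouped)
    return doc_ids, [grouped[doc_id] for doc_id in doc_ids]
```

The point of the design is that chunks never leave their document: a flat list of paragraphs would throw away exactly the relationship this tool is meant to keep.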

Support: 24/7, support@vinkius.com
Security: Vinkius Trust Center
SLA: Service Level Agreement

Tags: embeddings, rag, rerank, vector-search, multimodal-ai

What your AI can actually do with this

You need to make sure that when a user asks a question, the system doesn't just match words; it understands the intent behind them. This MCP gives your agent the tools to do that using advanced vectorization and search refinement. Instead of relying on simple keyword matches, you feed complex documents into this service, which converts them into high-dimensional vectors—numerical representations that capture context.

If your workflow needs to process millions of records or handle mixed content (like a document with graphs), the batch functions make it scalable. The real power comes when you combine this MCP’s search capabilities with other services; for instance, you can chain this with a messaging MCP and have your agent automatically send a summary of the findings right after retrieval.

This entire process runs on Vinkius, and every data flow is visible through its AI Analytics dashboard.

Built · Hosted · Managed by Vinkius

Voyage AI Embeddings API - Vectorize & Rerank Search Data

Server ID: 019e5d66-7968-733e-80cc-1823274472ac

Vinkius Inspector: Compliance Grade A+, Score 98.33/100

Here's how it actually works

The bottom line is that you manage the data lifecycle, from raw file upload to final scored result, through a sequence of structured API calls.

First, upload your raw data file using upload_file to stage it for processing.

Next, decide on the embedding type—you might call create_contextualized_embeddings if you need document context, or use create_multimodal_embeddings for mixed media.

Finally, when retrieving information, run rerank against your query to score and prioritize the top results before passing them back to the agent.
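The three steps above can be sketched as a small planner that picks the embedding tool and returns the call order. The tool names come from this page; the rule that images take priority over document context is an assumption made for the example, not documented behavior.

```python
from typing import List


def choose_embedding_tool(has_images: bool, needs_document_context: bool) -> str:
    """Pick the embedding tool for a dataset.

    Assumption: mixed media takes priority, because only the multimodal
    tool accepts images at all.
    """
    if has_images:
        return "create_multimodal_embeddings"
    if needs_document_context:
        return "create_contextualized_embeddings"
    return "create_embeddings"


def plan_pipeline(has_images: bool = False, needs_document_context: bool = False) -> List[str]:
    """Return the ordered tool calls: upload, embed, then rerank at query time."""
    return [
        "upload_file",
        choose_embedding_tool(has_images, needs_document_context),
        "rerank",
    ]
```

For example, `plan_pipeline(needs_document_context=True)` describes the technical-manual scenario: stage the files, embed with document context, and rerank when a question arrives.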

What Changes When You Connect

Better search results: Use rerank to score documents and ensure your agent only sees the highest-relevance context for its answer. This drastically improves accuracy over basic vector lookups.

Handle massive data loads: If you have millions of records, don't process them synchronously. Use create_batch to queue jobs, then check status with get_batch, keeping your agent responsive while the background work completes.

Context-aware embeddings: Forget simple text vectors. create_contextualized_embeddings embeds chunks while preserving their relationship to the full source document, cutting down on retrieval errors.

Mixed media support: Need to search a manual that contains both text and diagrams? create_multimodal_embeddings combines those sources into one searchable vector space.

Full visibility: You can track every step of this process—from initial file upload with upload_file to the final scoring—through Vinkius AI Analytics, so nothing happens in the dark.
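As a rough illustration of the create_batch and get_batch pattern, the sketch below polls a job until it reaches a terminal state. The `call_tool` callable stands in for whatever your MCP client exposes, and the `batch_id` argument name and the status strings are assumptions, not the documented schema.

```python
import time
from typing import Any, Callable, Dict

# Assumed status names; check the real get_batch response before relying on them.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def wait_for_batch(
    call_tool: ToolCaller,
    batch_id: str,
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll get_batch until the job finishes or the poll budget runs out."""
    for _ in range(max_polls):
        batch = call_tool("get_batch", {"batch_id": batch_id})
        if batch.get("status") in TERMINAL_STATUSES:
            return batch
        sleep(poll_seconds)
    raise TimeoutError(f"batch {batch_id} did not finish after {max_polls} polls")
```

Bounding the number of polls keeps a stalled job from blocking the agent indefinitely; on timeout you can inspect the job with get_batch or stop it with cancel_batch.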

See it in action

01

Technical Manual Search

An engineer needs to find a specific fix across 10 years of product manuals. They upload all PDFs using upload_file, then run create_contextualized_embeddings. When the user asks about 'error code X', the agent uses rerank on the top results to pinpoint the exact paragraph, skipping irrelevant sections.

02

Legal Document Review

A paralegal must review thousands of contracts for mentions of a specific clause. Instead of submitting thousands of separate requests, they use create_batch to process all documents at once. They then analyze the results to find every instance of the key phrase.

03

Product Catalog Search

A user wants to search for a product based on an image and a description. The agent uses create_multimodal_embeddings on both inputs, allowing it to match visual intent with textual queries simultaneously.

04

Codebase Q&A

A developer asks a question about legacy code written in an old language. They use the embedding tools to vectorize the codebase documentation and then retrieve contextually relevant snippets, allowing their agent to answer with high accuracy.

The honest tradeoffs

Processing data directly

Anti-pattern

Trying to manually pass gigabytes of raw text into a single API call because it's quicker than setting up the job.

The Fix

You must use upload_file first, and then trigger processing via create_batch. This handles the scale reliably without timing out.
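One way to stage a large dataset before calling upload_file is to split it into many small requests and write them as JSON Lines. This is a sketch only: the per-line schema (`custom_id`, `body.input`) and the `texts_per_request` parameter are hypothetical, so check the batch file format your account expects before using it.

```python
import json
from typing import Iterator, List


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def to_jsonl(texts: List[str], texts_per_request: int = 2) -> str:
    """Serialize texts as one JSON request per line (hypothetical schema)."""
    lines = []
    for index, group in enumerate(batched(texts, texts_per_request)):
        request = {"custom_id": f"request-{index}", "body": {"input": group}}
        lines.append(json.dumps(request))
    return "".join(line + "\n" for line in lines)
```

The resulting string can be written to a file, staged with upload_file, and then referenced when you call create_batch. Each `custom_id` lets you match results back to the original records.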

Ignoring context

Anti-pattern

Using standard embeddings when your document is highly technical. The model treats all chunks independently, losing critical relationships between paragraphs.

The Fix

Always use create_contextualized_embeddings for domain-specific text to ensure the vector understands its place within the larger source document.

Stopping at first search

Anti-pattern

Relying only on initial embeddings and presenting everything found, even if some results are vague or off-topic.

The Fix

Always finish your pipeline with rerank. This cross-encoder step filters out the noise and guarantees that the user sees the absolute best matches first.
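The retrieve-then-rerank pattern can be sketched in plain Python. Here `score_fn` is a stand-in for the rerank tool (any function that scores a document against the query text), and the vectors are assumed to come from one of the embedding tools; both are placeholders for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_then_rerank(
    query_vector: Sequence[float],
    query_text: str,
    corpus: List[Tuple[str, Sequence[float]]],
    score_fn: Callable[[str, str], float],
    candidates: int = 3,
    top_k: int = 1,
) -> List[str]:
    """Stage 1: shortlist by vector similarity. Stage 2: reorder by score_fn."""
    by_similarity = sorted(
        corpus, key=lambda item: cosine(query_vector, item[1]), reverse=True
    )
    shortlist = [text for text, _ in by_similarity[:candidates]]
    reranked = sorted(
        shortlist, key=lambda text: score_fn(query_text, text), reverse=True
    )
    return reranked[:top_k]
```

The first stage is cheap and broad; the second is slower but reads the query and each document together, which is why it can promote a passage that vector similarity ranked lower.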

When It Fits, When It Doesn't

Use this MCP if your search application needs to understand meaning, context, or mixed media types. If you're just matching keywords in a small set of documents, a basic database lookup is fine. But if you're building an advanced knowledge retrieval system that processes large volumes of data (batching), handles complex inputs (multimodal), and demands high precision (reranking), this is your toolset. Don’t try to use simple create_embeddings when you need contextual accuracy; those are for basic text chunks only. If the problem is merely connecting two services, remember that Vinkius lets you chain multiple MCPs together, giving you a single point of access across many platforms.

Questions you might have

How do I handle massive volumes of documents with Voyage AI (AI Embeddings API)?

You use the batch tools. First, upload_file to stage your data, then call create_batch. You can monitor progress and check status using get_batch until the job is complete.

What's the difference between `create_embeddings` and `create_contextualized_embeddings`?

Simple embeddings treat text in isolation. Contextualized embeddings use surrounding document information to create a more accurate vector, which is critical for complex documents.

When should I use the `rerank` tool?

Always use it before passing data to the final LLM call. It scores your initial search results against the user's query, guaranteeing you pass the most relevant context possible.

Can this MCP handle images and text together?

Yes. Use create_multimodal_embeddings to embed visual data (images) and descriptive text into a single vector space, making them searchable as one unit.

When I need to process a large dataset, what is the proper workflow for using `upload_file`?

You must use upload_file first. This stages the data in the system, making it available for subsequent batch operations like creating embeddings.

If my embedding job fails or stalls, how do I check its status using `get_batch`?

get_batch retrieves the current state of a specific batch job. You can use this to confirm if it's running, finished successfully, or if an error occurred.

How do I manage my data retention and clean up temporary assets using `delete_file`?

delete_file permanently removes a file from the system. This is crucial for maintaining compliance and keeping your workspace organized after job completion.

Before running any batch operation, how do I see what files are already stored by using `list_files`?

list_files retrieves a comprehensive list of every file in the system. This lets you check metadata and confirm your starting data sources before processing.

How does reranking improve my RAG system's accuracy?

By using the rerank tool, your agent can take a list of potentially relevant documents and re-score them using a powerful cross-encoder model. This ensures that the most semantically relevant pieces of information are ranked first, providing better context for the LLM to answer queries.

What is the benefit of using contextualized embeddings?

The create_contextualized_embeddings tool allows you to embed chunks of text while considering the surrounding content of the same document. This prevents loss of meaning that often happens with standard chunking, leading to much higher retrieval precision.

Can I process images and text in the same vector space?

Yes! With create_multimodal_embeddings, you can provide interleaved sequences of text and image URLs. Voyage AI will generate a single embedding that represents the combined semantic meaning, perfect for visual or hybrid search.
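To show what an "interleaved sequence of text and image URLs" might look like, here is a small sketch that builds one multimodal input. The `content`, `type`, `text`, and `image_url` field names are assumptions for illustration, and treating every URL as an image is a deliberate simplification.

```python
from typing import Dict, List


def build_multimodal_input(parts: List[str]) -> Dict[str, List[Dict[str, str]]]:
    """Turn an ordered mix of text snippets and image URLs into one input.

    Order is preserved, so the text stays next to the image it describes.
    Assumption: any string starting with http:// or https:// is an image URL.
    """
    content: List[Dict[str, str]] = []
    for part in parts:
        if part.startswith(("http://", "https://")):
            content.append({"type": "image_url", "image_url": part})
        else:
            content.append({"type": "text", "text": part})
    return {"content": content}
```

Passing one such input per catalog item to create_multimodal_embeddings is how a product photo and its description would end up in a single vector.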
