Voyage AI MCP for AI. Search by meaning, not just keywords.
Works with every AI agent you already use
…and any MCP-compatible client
How this MCP server connects to your AI agent
Voyage AI Embeddings API handles complex data vectorization, letting your agent search by meaning, not just keywords. It generates high-fidelity embeddings for text, code, and images, while also running smart reranking jobs to ensure your retrieval results are surgically precise.
What AI agents can do with Voyage AI (AI Embeddings API) Automation
Cancel batch
Stops a batch inference job before it finishes running.
Create batch
Starts a large-scale, asynchronous data processing job.
Create contextualized embeddings
Generates vector embeddings that retain the meaning of their surrounding document context.
Converts large bodies of text or code into mathematical vectors for semantic search.
Creates single, unified vectors from mixed input like images and surrounding text.
Manages large-scale data ingestion by submitting and monitoring asynchronous jobs.
Takes initial search results and scores them, boosting the most relevant documents to the top for your agent.
What AI agents can do with Voyage AI (AI Embeddings API) - 13 Tools
These tools let you manage the entire data lifecycle: uploading files, generating various types of embeddings, running large-scale batches, and refining search results via reranking.
Make your AI actually useful.
Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.
Start using Voyage AI (AI Embeddings API) on Vinkius
Cancel Batch
Stops a batch inference job before it finishes running.
Create Batch
Starts a large-scale, asynchronous data processing job.
Create Contextualized Embeddings
Generates vector embeddings that retain the meaning of their surrounding document...
Create Embeddings
Creates standard numerical vectors for pure text input.
Create Multimodal Embeddings
Generates single vectors from mixed content, like images paired with descriptions.
Delete File
Removes a file that was previously uploaded to the system.
Get Batch
Checks the current status and progress of an existing batch job.
Get File Content
Downloads the actual binary or text content of a specific file.
Get File
Retrieves general metadata about a stored file.
List Batches
Shows an overview of all previously created and running batch jobs.
List Files
Lists all files currently stored in the system's repository.
Rerank
Scores multiple documents against a given query to find the most relevant context.
Upload File
Uploads a file specifically for use in an asynchronous batch job.
Security and governance baked right in.
Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're up and running. We handle the entire backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so there's no routing to configure.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on every call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with Voyage AI (AI Embeddings API), then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 5,100+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Voyage AI. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on every call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
Built on the Model Context Protocol (MCP) for Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This connection provides 13 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.
The current search experience feels like digging through a landfill. Solved with Vinkius AI Gateway
Today, if your agent can't find the answer immediately, it usually means the initial retrieval step was flawed. You spend time uploading documents and running basic searches only to get vague results—a mix of relevant and irrelevant noise. Then you have to manually sift through dozens of pages just to pull out one key quote or concept.
With this MCP, the process is smarter. You upload your data, and when you ask a question, the system doesn't just run a keyword lookup: it compares the query against your embedded content by meaning, then reranks the candidates so the most relevant context comes first. Your agent gets an immediate answer, not a folder full of potential answers.
Contextualized Embeddings: Giving your data deep memory
The biggest step away from old systems is how it handles document boundaries. Instead of treating every paragraph as a standalone unit, the system preserves the relationship between chunks. When you use `create_contextualized_embeddings`, that surrounding context gets baked into the vector itself.
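That change deepens your agent's understanding. Each chunk's vector carries information about the document it came from, so retrieval can reflect *why* a piece of data is relevant, not just *that* it exists. In practice, that means fewer retrieval errors on chunks that only make sense in context.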
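To make the idea concrete, here is a minimal sketch of how input for contextualized embeddings is typically shaped: chunks stay grouped under their source document instead of being flattened into one long list. The `group_chunks_by_document` helper and the `doc_id` / `text` record fields are hypothetical names used for illustration; the exact argument schema of `create_contextualized_embeddings` isn't shown on this page, so check the tool definition before relying on it.

```python
def group_chunks_by_document(records):
    """Group chunk texts by source document, preserving chunk order.

    `records` is an iterable of dicts with the hypothetical keys
    `doc_id` and `text`. Returns (doc_ids, grouped_chunks), where
    grouped_chunks[i] holds the ordered chunks of doc_ids[i].
    """
    grouped = {}  # dicts keep insertion order, so document order is stable
    for record in records:
        grouped.setdefault(record["doc_id"], []).append(record["text"])
    return list(grouped.keys()), list(grouped.values())


# Made-up sample chunks from two documents.
records = [
    {"doc_id": "manual-a", "text": "Chapter 1: installing the pump."},
    {"doc_id": "manual-a", "text": "Tighten the flange to 40 Nm."},
    {"doc_id": "manual-b", "text": "Warranty terms and exclusions."},
]

doc_ids, inputs = group_chunks_by_document(records)
# `inputs` is a list of lists: one inner list of chunks per document.
```

Keeping the grouping explicit is what lets the embedding step see which chunks belong together; a flat list of strings would lose that relationship.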
What your AI can actually do with this
You need to make sure that when a user asks a question, the system doesn't just match words; it understands the intent behind them. This MCP gives your agent the tools to do that using advanced vectorization and search refinement. Instead of relying on simple keyword matches, you feed complex documents into this service, which converts them into high-dimensional vectors—numerical representations that capture context.
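As a rough illustration of what "search by meaning" looks like once text has become vectors, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors and document names are made up for readability; real embeddings have far more dimensions and would come from the embedding tools rather than being typed by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors with invented values, for illustration only.
doc_vecs = {
    "reset-password": [0.9, 0.1, 0.0],
    "billing-faq": [0.1, 0.9, 0.1],
    "login-trouble": [0.8, 0.2, 0.1],
}
query_vec = [1.0, 0.0, 0.0]

ranking = rank_by_similarity(query_vec, doc_vecs)
```

Notice that "login-trouble" ranks close to "reset-password" even though the two labels share no words; closeness in vector space stands in for closeness in meaning.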
If your workflow needs to process millions of records or handle mixed content (like a document with graphs), the batch functions make it scalable. The real power comes when you combine this MCP’s search capabilities with other services; for instance, you can chain this with a messaging MCP and have your agent automatically send a summary of the findings right after retrieval.
This entire process runs securely on Vinkius, guaranteeing that every data flow is fully visible through its AI Analytics dashboard.
Here's how it actually works
The bottom line is that you manage the data lifecycle, from raw file upload to final scored result, through a sequence of structured API calls.
First, upload your raw data file using upload_file to stage it for processing.
Next, decide on the embedding type—you might call create_contextualized_embeddings if you need document context, or use create_multimodal_embeddings for mixed media.
Finally, when retrieving information, run rerank against your query to score and prioritize the top results before passing them back to the agent.
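The three steps above can be sketched as the MCP requests an agent would send. MCP tools are invoked with JSON-RPC 2.0 `tools/call` messages that carry a tool name and an `arguments` object; the tool names below come from this page, but every argument name and value (`path`, `inputs`, `query`, `documents`) is an assumption for illustration, not the connector's documented schema.

```python
import itertools
import json

_request_ids = itertools.count(1)


def tool_call(name, arguments):
    """Build an MCP `tools/call` JSON-RPC 2.0 request for one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names and values are hypothetical placeholders.
pipeline = [
    tool_call("upload_file", {"path": "manuals.jsonl"}),
    tool_call(
        "create_contextualized_embeddings",
        {"inputs": [["chunk 1 of doc A", "chunk 2 of doc A"]]},
    ),
    tool_call(
        "rerank",
        {"query": "error code X", "documents": ["chunk 1 of doc A", "chunk 2 of doc A"]},
    ),
]

for request in pipeline:
    print(json.dumps(request))
```

In practice your AI client builds and sends these messages for you; the sketch only shows the order of calls and the general shape of each request.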
Who is this actually for?
ML Engineers and Data Scientists who are tired of building custom, flaky pipelines. If your current knowledge base is growing too fast for simple keyword search, you need this.
Building production-grade RAG pipelines that require specific control over chunking and embedding models.
Experimenting with multimodal search or managing large datasets requiring asynchronous batch processing.
Designing knowledge retrieval systems that must operate at scale and provide surgical precision in results.
What Changes When You Connect
Better search results: Use rerank to score documents and ensure your agent only sees the highest-relevance context for its answer. This typically improves accuracy over basic vector lookups.
Handle massive data loads: If you have millions of records, don't process them synchronously. Use create_batch to queue jobs, then check status with get_batch, keeping your agent responsive while the background work completes.
Context-aware embeddings: Forget simple text vectors. create_contextualized_embeddings embeds chunks while preserving their relationship to the full source document, cutting down on retrieval errors.
Mixed media support: Need to search a manual that contains both text and diagrams? create_multimodal_embeddings combines those sources into one searchable vector space.
Full visibility: You can track every step of this process—from initial file upload with upload_file to the final scoring—through Vinkius AI Analytics, so nothing happens in the dark.
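The batch pattern from the list above (queue a job with create_batch, then check it with get_batch) usually comes down to a polling loop. Here is a minimal sketch, assuming the job record is a dict with a `status` field and that the terminal status names are `completed`, `failed`, and `cancelled`; both are assumptions, and `make_fake_get_batch` is a stand-in so the example runs without a live connection.

```python
import time

# Assumed terminal status names; check the real tool output before relying on them.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_batch(get_batch, batch_id, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll `get_batch(batch_id)` until the job reaches a terminal state.

    Raises TimeoutError if the job is still running after `max_polls` checks.
    """
    for attempt in range(max_polls):
        batch = get_batch(batch_id)
        if batch["status"] in TERMINAL_STATES:
            return batch
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # wait before the next status check
    raise TimeoutError(f"batch {batch_id} did not finish after {max_polls} polls")


def make_fake_get_batch(statuses):
    """Stand-in for the real get_batch tool: replays a fixed list of statuses."""
    remaining = iter(statuses)

    def fake_get_batch(batch_id):
        return {"id": batch_id, "status": next(remaining)}

    return fake_get_batch


result = wait_for_batch(
    make_fake_get_batch(["in_progress", "in_progress", "completed"]),
    "batch-123",
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
```

Capping the number of polls keeps a stalled job from blocking the agent forever, and passing `sleep` in as a parameter makes the loop easy to test.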
See it in action
Technical Manual Search
An engineer needs to find a specific fix across 10 years of product manuals. They upload all PDFs using upload_file, then run create_contextualized_embeddings. When the user asks about 'error code X', the agent uses rerank on the top results to pinpoint the exact paragraph, skipping irrelevant sections.
Legal Document Review
A paralegal must review thousands of contracts for mentions of a specific clause. Instead of running 100 separate searches, they use create_batch to process all documents at once. They then analyze the results to find every instance of the key phrase.
Product Catalog Search
A user wants to search for a product based on an image and a description. The agent uses create_multimodal_embeddings on both inputs, allowing it to match visual intent with textual queries simultaneously.
Codebase Q&A
A developer asks a question about legacy code written in an old language. They use the embedding tools to vectorize the codebase documentation and then retrieve contextually relevant snippets, allowing their agent to answer with high accuracy.
The honest tradeoffs
Processing data directly
Don't: Manually pass gigabytes of raw text into a single API call because it seems quicker than setting up a job.
Do: Call upload_file first, then trigger processing via create_batch. This handles the scale reliably without timing out.
Ignoring context
Don't: Use standard embeddings when your document is highly technical. The model treats all chunks independently, losing critical relationships between paragraphs.
Do: Use create_contextualized_embeddings for domain-specific text so each vector reflects its place within the larger source document.
Stopping at first search
Don't: Rely only on initial embeddings and present everything found, even if some results are vague or off-topic.
Do: Finish your pipeline with rerank. This cross-encoder step filters out the noise so the user sees the strongest matches first.
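The "retrieve wide, then rerank" advice can be sketched as a two-stage function. Both scoring functions here are deliberately simple stand-ins (word overlap, and overlap divided by document length) so the example runs offline; in a real pipeline the first stage would be vector similarity over your embeddings and the second stage would be the rerank tool. The function and parameter names are hypothetical.

```python
def retrieve_then_rerank(query, candidates, first_stage_score, rerank_score,
                         recall_k=20, top_k=3):
    """Two-stage retrieval: cheap wide recall, then a finer rerank of the shortlist."""
    shortlist = sorted(
        candidates, key=lambda doc: first_stage_score(query, doc), reverse=True
    )[:recall_k]
    reranked = sorted(
        shortlist, key=lambda doc: rerank_score(query, doc), reverse=True
    )
    return reranked[:top_k]


def words(text):
    return set(text.lower().split())


def overlap_score(query, doc):
    """Stand-in first stage: number of shared words."""
    return len(words(query) & words(doc))


def density_score(query, doc):
    """Stand-in reranker: shared words relative to document length."""
    doc_words = words(doc)
    return len(words(query) & doc_words) / len(doc_words) if doc_words else 0.0


docs = [
    "error code X means the pump is overheating",
    "error codes are listed in appendix B",
    "the pump ships with a two year warranty",
    "code of conduct for service engineers",
]

top = retrieve_then_rerank(
    "error code X pump", docs, overlap_score, density_score, recall_k=3, top_k=2
)
```

The design point is the split itself: the first stage only has to avoid missing good candidates, while the second stage spends more effort on a short list to get the order right.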
When It Fits, When It Doesn't
Use this MCP if your search application needs to understand meaning, context, or mixed media types. If you're just matching keywords in a small set of documents, a basic database lookup is fine. But if you're building an advanced knowledge retrieval system that processes large volumes of data (batching), handles complex inputs (multimodal), and demands high precision (reranking), this is your toolset. Don’t try to use simple create_embeddings when you need contextual accuracy; those are for basic text chunks only. If the problem is merely connecting two services, remember that Vinkius lets you chain multiple MCPs together, giving you a single point of access across many platforms.
Questions you might have
How do I handle massive volumes of documents with Voyage AI (AI Embeddings API)?
You use the batch tools. First, upload_file to stage your data, then call create_batch. You can monitor progress and check status using get_batch until the job is complete.
What's the difference between `create_embeddings` and `create_contextualized_embeddings`?
Simple embeddings treat text in isolation. Contextualized embeddings use surrounding document information to create a more accurate vector, which is critical for complex documents.
When should I use the `rerank` tool?
Use it before passing data to the final LLM call. It scores your initial search results against the user's query, so you pass the most relevant context available.
Can this MCP handle images and text together?
Yes. Use create_multimodal_embeddings to generate a single vector that represents both visual data (images) and descriptive text, making them searchable as one unit.
When I need to process a large dataset, what is the proper workflow for using `upload_file`?
Call upload_file first. This stages the data in the system, making it available for subsequent batch operations like creating embeddings.
If my embedding job fails or stalls, how do I check its status using `get_batch`?
get_batch retrieves the current state of a specific batch job. You can use this to confirm if it's running, finished successfully, or if an error occurred.
How do I manage my data retention and clean up temporary assets using `delete_file`?
delete_file permanently removes a file from the system. This is crucial for maintaining compliance and keeping your workspace organized after job completion.
Before running any batch operation, how do I see what files are already stored by using `list_files`?
list_files retrieves a comprehensive list of every file in the system. This lets you check metadata and confirm your starting data sources before processing.
How does reranking improve my RAG system's accuracy?
By using the rerank tool, your agent can take a list of potentially relevant documents and re-score them using a powerful cross-encoder model. This ensures that the most semantically relevant pieces of information are ranked first, providing better context for the LLM to answer queries.
What is the benefit of using contextualized embeddings?
The create_contextualized_embeddings tool allows you to embed chunks of text while considering the surrounding content of the same document. This prevents loss of meaning that often happens with standard chunking, leading to much higher retrieval precision.
Can I process images and text in the same vector space?
Yes! With create_multimodal_embeddings, you can provide interleaved sequences of text and image URLs. Voyage AI will generate a single embedding that represents the combined semantic meaning, perfect for visual or hybrid search.
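For a sense of what an "interleaved sequence of text and image URLs" might look like as data, here is a small sketch that builds one multimodal input item. The `type` / `text` / `image_url` field names are modelled on a common content-list convention and are an assumption here; the arguments that create_multimodal_embeddings actually expects should be taken from the tool's own schema. The helper names and the example URL are hypothetical.

```python
def text_part(text):
    """One text segment of a multimodal input (assumed field names)."""
    return {"type": "text", "text": text}


def image_part(url):
    """One image segment, referenced by URL (assumed field names)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"expected an http(s) URL, got: {url!r}")
    return {"type": "image_url", "image_url": url}


def multimodal_input(*parts):
    """Bundle interleaved parts into one item that yields a single embedding."""
    if not parts:
        raise ValueError("a multimodal input needs at least one part")
    return {"content": list(parts)}


# Order matters: the parts are kept in the sequence they are given.
item = multimodal_input(
    text_part("Red trail running shoe, side view"),
    image_part("https://example.com/shoe.jpg"),
    text_part("Waterproof upper, 8 mm drop"),
)
```

Each item mixes text and images but maps to one vector, which is what makes a product photo and its description searchable as a single unit.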
We've already built the connector for Voyage AI. Just plug in your AI agents and start using Vinkius.
No hosting. No infrastructure. No complex setup.
All 13 tools are live and waiting.
You're up and running in seconds.
Vinkius gives your AI agents access to the full catalog of app connectors, all fully managed, secure, and enterprise-ready. One subscription, every tool you need.
Built, hosted, and secured by Vinkius. You just connect and go.