Milvus MCP. Run complex vector searches from plain conversation.
Works with every AI agent you already use
…and any MCP-compatible client
Just plug in your AI agents and start using Vinkius.
Milvus MCP Server manages vector storage and retrieval. It lets your AI agent perform Approximate Nearest Neighbor (ANN) searches on vast embedding collections, filter results using structured data fields, and audit the entire database schema—all through natural conversation.
What your AI agents can do
The agent runs Approximate Nearest Neighbor (ANN) searches, identifying the most semantically relevant data points based on raw embedding vectors.
You narrow search results using explicit scalar expressions to target entities based on known fields like IDs or dates, combining structure with semantic search.
The agent lists all vector collections and retrieves detailed schema maps, including dimensions and primary key definitions for each one.
You pull real-time statistics on a collection, getting the current entity count and physical memory usage to check performance.
The agent can fetch specific vector items by their primary key or irreversibly delete records using that identifier.
Milvus (Open-Source Vector Database) MCP Server: 7 Tools
These tools allow your AI client to manage Milvus vector storage. You can search embeddings, filter records by structured fields, and audit the entire database schema.
delete_entities
Removes specific vector records from a collection using their unique primary keys.
describe_collection
Retrieves the full schema map, including index definitions and dimensions for any specified Milvus collection.
get_collection_stats
Pulls real-time statistics on a collection, reporting its total row count and current memory usage.
get_entities
Retrieves specific vector items by their known primary keys.
list_collections
Lists all the named collections currently tracked inside your Milvus Vector Database instance.
query_entities
Filters entities using scalar expressions, allowing you to query structured data fields like tags or IDs.
search_vectors
Finds the nearest vector neighbors by taking a raw embedding JSON array and searching the target collection.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built-in DLP, auth, and compliance on every call
- Real-time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with Milvus (Open-Source Vector Database), then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 4,700+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
What you can do with this MCP connector
Connect your AI agent to this Milvus server for full control over vector search and storage. Your agent can run complex operations on massive embedding collections without you writing a single line of SDK code. You can manage everything from deep semantic queries to health metrics for the whole database.
When you use search_vectors, your agent runs Approximate Nearest Neighbor (ANN) searches. It takes a raw embedding JSON array and finds the most semantically relevant vectors in the target collection, matching meaning instead of just keywords. You can then combine that semantic search with structure using query_entities, which filters by explicit scalar expressions, like limiting results to records where the tag is 'VIP' or a date field falls within a specific range.
It's how you find the needle in the haystack when both meaning and known fields matter.
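To make the idea of "meaning plus known fields" concrete, here is a minimal sketch in plain Python. It is not how Milvus works internally: it uses an exact brute-force scan rather than an ANN index, and the record layout (`id`, `tag`, `vector`) and the helper name `filtered_top_k` are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def filtered_top_k(records, query, k, predicate):
    """Exact nearest neighbors among the records that pass a scalar predicate."""
    candidates = [r for r in records if predicate(r)]
    ranked = sorted(candidates, key=lambda r: cosine(r["vector"], query), reverse=True)
    return [r["id"] for r in ranked[:k]]

# Hypothetical records: two tagged 'VIP', one not.
records = [
    {"id": 1, "tag": "VIP", "vector": [1.0, 0.0]},
    {"id": 2, "tag": "VIP", "vector": [0.0, 1.0]},
    {"id": 3, "tag": "basic", "vector": [1.0, 0.1]},
]

# Record 3 is close to the query but is excluded by the scalar filter.
result = filtered_top_k(records, [1.0, 0.0], k=2, predicate=lambda r: r["tag"] == "VIP")
```

The filter removes record 3 before ranking, so only the 'VIP' records compete on similarity.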
To check what data lives in your system, you can use list_collections to pull a list of every named collection housed inside your Milvus instance. If you need the technical blueprint for any single one of those collections, running describe_collection pulls the full schema map. That map includes index definitions, dimensions, and primary key rules—it shows you exactly what data structure you’re dealing with.
You can also use get_collection_stats to check performance; this function reports real-time statistics on a collection, giving you its total row count and current memory usage so you know if it's running smoothly.
Managing the actual records is straightforward too. If you need specific vector items, get_entities retrieves them by their known primary keys. Need to clean up some junk? Use delete_entities to remove specific vector records by primary key. This action is permanent, so double-check the keys before you run it.
This suite of tools gives your agent complete control over the data lifecycle: you can find stuff semantically, filter it with structured fields, list all collections and check their schemas, monitor performance metrics like entity count and memory usage, and finally, pull or delete specific records by ID. It's a comprehensive way to operate on vector storage right through natural conversation.
How Milvus MCP Works
- 1 Subscribe to the server and provide your Milvus Base URL and API Key (or Zilliz Cloud Token).
- 2 Your AI client uses natural conversation to trigger a tool, such as list_collections or search_vectors, passing necessary context like vectors or filters.
- 3 The MCP Server executes the command against your Milvus instance and returns structured data, like collection names or nearest neighbor results, directly to your agent.
The bottom line is: it lets your AI client run complex vector database operations without you needing to write any Python code or manage connection details.
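For a rough picture of what the client sends in step 2, here is a sketch of an MCP `tools/call` request built in Python. The JSON-RPC envelope follows the Model Context Protocol, but the argument names (`collection_name`, `vector`, `limit`) are assumptions; this page does not document the server's actual input schema.

```python
import json

def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Argument names below are hypothetical placeholders, not the server's confirmed schema.
request = build_tool_call(
    1,
    "search_vectors",
    {"collection_name": "text_knowledge_base", "vector": [0.12, -0.4, 0.33], "limit": 5},
)
payload = json.dumps(request)
```

In practice your AI client builds and sends this message for you; the sketch only shows the shape of the exchange.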
Who Is Milvus MCP For?
ML Engineers who need to test embedding dimensions on the fly. Search Architects who must audit collection schemas and monitor indexing performance. Software Developers building applications that require complex, AI-powered retrieval logic.
ML Engineers: Test vector relevance and verify embedding dimensions by talking to the agent, with no manual SDK scripting needed.
Search Architects: Audit collection schemas and monitor indexing performance directly from their workspace, ensuring data integrity across multiple environments.
Software Developers: Integrate AI-powered retrieval into applications and manage vector lifecycles efficiently by calling tools like delete_entities or query_entities.
What Changes When You Connect
- You run ANN searches using search_vectors, getting immediate semantic matches without writing a single line of data retrieval code. This is pure chat interaction.
- The query_entities tool lets you combine meaning and structure. You can filter by date or ID alongside a vector search, which is far more precise than simple keyword searches.
- Need to know if your collection schema changed? Use describe_collection. It audits dimensions and index types instantly so you don't run into runtime errors later.
- You monitor resource health using get_collection_stats. This tells you exactly how many entities a collection holds and how much memory it's eating, letting you plan for scale.
- If data gets stale or corrupted, the server lets you clean up with delete_entities by primary key. It's a controlled way to keep your search index optimized.
Real-World Use Cases
Identifying related documents for an API call.
A developer needs to find all technical guides similar to a user-provided document (the embedding). They prompt their agent: 'Find the top 5 most relevant docs in text_knowledge_base.' The agent runs search_vectors, gets the IDs, and returns them for the API call. Problem solved.
Finding user profiles by ID, then checking relevance.
An ML Engineer needs to check a specific user's profile (ID: 90210) but also wants related content. They first use get_entities on the user ID, and then they run query_entities with that ID plus a date filter. This limits their search space dramatically.
Auditing an entire dataset before launch.
A Search Architect is setting up a new product line. They start by running list_collections to verify all necessary datasets exist. Then, they run describe_collection on each one to confirm the dimensions and index types match the plan.
Cleaning up old, irrelevant data.
A developer finds that a vector collection has records for users who left last year. They use the primary keys list from get_entities (for known bad data) and pass those IDs to delete_entities, keeping the index clean.
The Tradeoffs
Treating Milvus like a simple key-value store.
The user runs query_entities without first naming a collection, or tries a semantic search without providing an embedding vector. The call fails because the required inputs are missing.
→ Always start with list_collections to know what's available. If you are searching semantically, you must use search_vectors and provide a raw embedding JSON array.
Running heavy searches without checking schema first.
The user runs search_vectors but the wrong collection is targeted, or the vector dimensions don't match what was stored. This results in vague errors and wasted compute cycles.
→ First, use describe_collection to verify the exact dimension count of the target collection. Then, make sure your input JSON array matches that dimension.
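The dimension check described here can be sketched as a small guard that runs before any search. The schema shape (a dict with a `dimension` key) is an assumption for illustration; the real describe_collection output may be structured differently.

```python
def validate_embedding(schema, embedding):
    """Return the embedding if it is a numeric list of the expected length, else raise ValueError."""
    expected = schema["dimension"]  # hypothetical key, not the confirmed output format
    if not all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in embedding):
        raise ValueError("embedding must contain only numbers")
    if len(embedding) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(embedding)}")
    return embedding

# Hypothetical schema summary for a 4-dimensional collection.
schema = {"collection": "image_embeddings", "dimension": 4}

ok = validate_embedding(schema, [0.1, 0.2, 0.3, 0.4])

try:
    validate_embedding(schema, [0.1, 0.2])
    error = None
except ValueError as exc:
    error = str(exc)
```

Catching the mismatch locally gives a clear message instead of a vague error from the database.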
Attempting multi-step updates manually.
The user tries to manually copy IDs from a list of results into a separate tool call for deletion. This is slow and error-prone across multiple tabs and tools.
→ Use the agent's ability to chain calls: get the IDs with get_entities, review them, and then pass that resulting ID set directly to delete_entities.
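Here is a minimal sketch of that get, review, delete chain. The `call_tool` callable, the argument names, and the entity fields are all hypothetical stand-ins; a real agent would route these calls through its MCP client.

```python
def delete_after_review(call_tool, collection, ids, approve):
    """Fetch entities by ID, keep only those the reviewer approves, then delete them."""
    found = call_tool("get_entities", {"collection_name": collection, "ids": ids})
    confirmed = [entity["id"] for entity in found if approve(entity)]
    if confirmed:  # skip the irreversible call when nothing was approved
        call_tool("delete_entities", {"collection_name": collection, "ids": confirmed})
    return confirmed

# In-memory stand-in for the MCP server, used only for this demonstration.
store = {101: {"id": 101, "active": False}, 102: {"id": 102, "active": True}}
calls = []

def fake_call_tool(name, arguments):
    calls.append(name)
    if name == "get_entities":
        return [store[i] for i in arguments["ids"] if i in store]
    if name == "delete_entities":
        for i in arguments["ids"]:
            store.pop(i, None)
        return {"deleted": len(arguments["ids"])}
    raise ValueError(f"unknown tool: {name}")

# Only inactive users are approved for deletion.
deleted = delete_after_review(fake_call_tool, "user_profiles", [101, 102], lambda e: not e["active"])
```

The review step sits between the two calls, so deletion only ever receives IDs that were actually fetched and approved.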
When It Fits, When It Doesn't
You should use this server if your core data problem requires semantic similarity—meaning you need to find results based on what the text or image means, not just what it contains. If simple filtering by a structured ID (like 'user:123') is enough, you can probably use other tools. But if you are connecting embeddings to an agent's decision-making process, this is essential.
Don't use this if your data lives in a traditional relational database and you only need simple lookups. For those cases, stick to standard SQL APIs. Use this when the relationship between pieces of data is measured by vector distance. If you just need to list collections or check stats, then list_collections and get_collection_stats are your starting points.
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Milvus. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
- Cloud Hosted: Managed infra
- V8 Isolated: Sandboxed per request
- Zero-Trust Proxy: No stored credentials
- DLP Enforced: Policy on every call
- GDPR Compliant: EU data residency
- Token Compression: ~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 7 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Checking a database schema shouldn't feel like reading an API manual.
Today, figuring out what data is even in the system means clicking through multiple dashboards: checking the collection names, then manually hopping to the documentation page for dimensions, and finally running separate queries just to see if the index was built correctly. It's a slow, multi-step process of guesswork.
With this MCP server, you ask your agent to audit the system. One prompt—'What collections do I have?'—gets you the list; another—'Show me the schema for X'—gives you the full technical breakdown instantly. You get immediate certainty.
Milvus MCP Server: Get data insights, not just vague descriptions.
Manual data management involves copying IDs from one search result to another tool call for cleanup or verification—a painful process of copy/pasting identifiers across multiple windows. It slows down iteration and makes debugging nearly impossible.
The MCP server handles the flow: it pulls results (e.g., using `get_entities`), packages those results, and sends them directly to a clean-up tool like `delete_entities`. You just tell your agent what you want gone, and it runs the entire sequence.
Common Questions About Milvus MCP
How do I list all vector collections in Milvus using list_collections?
Ask your agent to run list_collections. It returns a list of every collection name with its basic dimension count (e.g., 'image_embeddings' - Dim: 512). This is the best place to start.
What if my vector search results aren't accurate? Should I use query_entities?
If your semantic search (via search_vectors) gives you too many irrelevant matches, try using query_entities. This lets you filter the results using strict scalar fields like 'product_type: electronics,' which adds precision.
I need to see the schema for a collection. What tool should I use? Is it describe_collection?
Yes, describe_collection is the right one. It gives you the complete map of the collection's schema, including primary keys and index types. This lets you know what fields are available to filter on.
Can I check how much memory my Milvus database is using?
Yes. Use get_collection_stats. Tell the agent which collection you want to check, and it returns metrics like total entity count and current physical memory usage.
How do I safely remove records using the delete_entities tool?
You must use delete_entities and provide the specific primary keys of the vectors you want to remove. This action is irreversible, so double-check your list of IDs before running it.
I know the exact IDs I need. Should I use get_entities instead of searching?
Yes, if you have specific primary keys, get_entities is your tool. It retrieves unique vector items directly by their known ID, bypassing semantic search entirely.
What data format does the search_vectors tool require for its input?
The search_vectors tool requires a JSON array of floats whose length matches your collection's vector dimension exactly. Your agent must pass it this raw embedding vector to find nearest neighbors.
How do I filter search results by structured fields using query_entities?
Use query_entities and provide a scalar filter expression. This lets you narrow down results by specific metadata, like date ranges or tags, before performing the vector match.
How do I perform an ANN search through my agent?
Use the search_vectors tool by providing the collection name and a JSON float array matching the collection's dimensions. Your agent will perform an Approximate Nearest Neighbor search and return the most semantically relevant entities.
Can I filter results using structured fields instead of just vectors?
Yes. Use the query_entities tool with a Milvus-style filter expression. This allows you to retrieve entities based on primary keys, tags, or other scalar fields without necessarily performing a vector similarity search.
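As a rough illustration of a Milvus-style filter expression, the sketch below assembles one from small helpers. The helper names and the field names (`tag`, `created_at`) are hypothetical, and the bounds are example epoch-second values; the operators used (`==`, `>=`, `<`, `and`) are standard Milvus boolean expression syntax.

```python
import json

def eq(field, value):
    """Equality clause; json.dumps double-quotes strings and leaves numbers bare."""
    return f"{field} == {json.dumps(value)}"

def between(field, low, high):
    """Half-open range clause: low <= field < high."""
    return f"({field} >= {json.dumps(low)} and {field} < {json.dumps(high)})"

def all_of(*clauses):
    """Join clauses so that every one must hold."""
    return " and ".join(clauses)

# Field names and bounds are hypothetical examples.
expr = all_of(eq("tag", "VIP"), between("created_at", 1704067200, 1735689600))
```

You would normally describe the filter in conversation and let the agent write the expression; seeing the target syntax helps when you need to debug one.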
How do I check the schema and dimension requirements for a Milvus collection?
The describe_collection tool retrieves the complete schema mapping. Your agent will report the required vector dimensions, index types, and primary key names, helping you ensure your search queries are compatible with the database logic.
Use it with your favorite AI tools
Connect this server to Cursor, Claude, VS Code, and more.