Milvus MCP. Run complex vector searches from plain conversation.

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

Works with every AI agent you already use

…and any MCP-compatible client

Just plug in your AI agents and start using Vinkius.

Milvus MCP Server manages vector storage and retrieval. It lets your AI agent perform Approximate Nearest Neighbor (ANN) searches on vast embedding collections, filter results using structured data fields, and audit the entire database schema—all through natural conversation.

What your AI agents can do

Perform Semantic Vector Search

The agent runs Approximate Nearest Neighbor (ANN) searches, identifying the most semantically relevant data points based on raw embedding vectors.

Filter by Structured Data Fields

You narrow search results using explicit scalar expressions to target entities based on known fields like IDs or dates, combining structure with semantic search.

Audit Collection Schema and Indexes

The agent lists all vector collections and retrieves detailed schema maps, including dimensions and primary key definitions for each one.

Monitor Database Health Metrics

You pull real-time statistics on a collection, getting the current entity count and physical memory usage to check performance.

Manage Specific Records (CRUD)

The agent can fetch specific vector items by their primary key or irreversibly delete records using that identifier.

Free for Subscribers

Milvus (Open-Source Vector Database) MCP Server: 7 Tools

These tools allow your AI client to manage Milvus vector storage. You can search embeddings, filter records by structured fields, and audit the entire database schema.

delete_entities

Removes specific vector records from a collection using their unique primary keys.

describe_collection

Retrieves the full schema map, including index definitions and dimensions for any specified Milvus collection.

get_collection_stats

Pulls real-time statistics on a collection, reporting its total row count and current memory usage.

get_entities

Retrieves specific vector items directly by their known primary keys.

list_collections

Lists all the named collections currently tracked inside your Milvus Vector Database instance.

query_entities

Filters entities using scalar expressions, allowing you to query structured data fields like tags or IDs.

search_vectors

Finds the nearest vector neighbors by taking a raw embedding JSON array and searching the collection you specify.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Milvus (Open-Source Vector Database), then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 4,700+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

What you can do with this MCP connector

Connect your AI agent to this Milvus server for full control over vector search and storage. It lets your agent run complex operations on massive embedding collections without you having to write a single line of SDK code. You'll be able to manage everything from querying the deepest semantic layers of data to checking the health metrics of the whole database.

When you use search_vectors, your agent runs Approximate Nearest Neighbor (ANN) searches. It takes a raw embedding JSON array and finds the most semantically relevant vectors in the collection you target, matching meaning instead of just keywords. You can then combine that semantic search with structure using query_entities; this lets you filter results by explicit scalar expressions, like limiting results to records where the tag is 'VIP' or a timestamp falls within a specific date range.

It’s how you find the needle in the haystack when both meaning and known fields matter.
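To make the idea of "meaning plus known fields" concrete, here is a minimal sketch in plain Python. It is not how Milvus works internally: it does an exact brute-force cosine comparison instead of an ANN index lookup, and the record layout (`id`, `tag`, `vector`) is invented purely for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def filtered_top_k(records, query, k, predicate):
    """Apply the scalar filter first, then rank survivors by similarity."""
    candidates = [r for r in records if predicate(r)]
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(r["vector"], query),
        reverse=True,
    )
    return [r["id"] for r in ranked[:k]]

# Hypothetical records; a real collection would hold far larger vectors.
records = [
    {"id": 1, "tag": "VIP", "vector": [1.0, 0.0]},
    {"id": 2, "tag": "VIP", "vector": [0.0, 1.0]},
    {"id": 3, "tag": "standard", "vector": [0.9, 0.1]},
]

top = filtered_top_k(records, [1.0, 0.1], k=2, predicate=lambda r: r["tag"] == "VIP")
print(top)  # [1, 2]
```

Without the tag filter, record 3 would rank first because it points almost exactly in the query's direction. The filter is what keeps it out of the results.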

To check what data lives in your system, you can use list_collections to pull a list of every named collection housed inside your Milvus instance. If you need the technical blueprint for any single one of those collections, running describe_collection pulls the full schema map. That map includes index definitions, dimensions, and primary key rules—it shows you exactly what data structure you’re dealing with.

You can also use get_collection_stats to check performance; this function reports real-time statistics on a collection, giving you its total row count and current memory usage so you know if it's running smoothly.

Managing the actual records is straightforward too. If you need specific vector items, get_entities retrieves them directly by their known primary keys. Need to clean up some junk? You use delete_entities to remove specific vector records using their unique primary keys. This action is permanent, so make sure you know exactly what you're deleting before you run it.

This suite of tools gives your agent complete control over the data lifecycle: you can find stuff semantically, filter it with structured fields, list all collections and check their schemas, monitor performance metrics like entity count and memory usage, and finally, pull or delete specific records by ID. It's a comprehensive way to operate on vector storage right through natural conversation.

How Milvus MCP Works

1. Subscribe to the server and provide your Milvus Base URL and API Key (or Zilliz Cloud Token).
2. Your AI client uses natural conversation to trigger a tool, such as list_collections or search_vectors, passing necessary context like vectors or filters.
3. The MCP Server executes the command against your Milvus instance and returns structured data, like collection names or nearest neighbor results, directly to your agent.

The bottom line is: it lets your AI client run complex vector database operations without you needing to write any Python code or manage connection details.
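For the curious, step 2 boils down to a JSON-RPC message. The Model Context Protocol invokes tools with a `tools/call` request that names the tool and carries its arguments. The sketch below builds such a request in Python; the helper name `build_tool_call` is ours, and the empty argument object for list_collections is an assumption, since this page does not document the tool's input schema.

```python
import json

def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Assumption: list_collections takes no arguments.
request = build_tool_call(1, "list_collections", {})
print(json.dumps(request, indent=2))
```

Your AI client assembles and sends this for you. You never have to write it by hand, but it helps to know what is going over the wire when you debug a failed call.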

Who Is Milvus MCP For?

ML Engineers who need to test embedding dimensions on the fly. Search Architects who must audit collection schemas and monitor indexing performance. Software Developers building applications that require complex, AI-powered retrieval logic.

Machine Learning Engineer

Tests vector relevance and verifies embedding dimensions by talking to the agent; no manual SDK scripting needed.

Search Architect

Audits collection schemas and monitors indexing performance directly from their workspace, ensuring data integrity across multiple environments.

Software Developer

Integrates AI-powered retrieval into applications and manages vector lifecycles efficiently by calling tools like delete_entities or query_entities.

What Changes When You Connect

You run ANN searches using search_vectors, getting immediate semantic matches without writing a single line of data retrieval code. This is pure chat interaction.
The query_entities tool lets you combine meaning and structure. You can search by vector and filter by date or ID in one go, which is way more precise than simple keyword searches.
Need to know if your collection schema changed? Use describe_collection. It audits dimensions and index types instantly so you don't run into runtime errors later.
You monitor resource health using get_collection_stats. This tells you exactly how many entities a collection holds and how much memory it’s eating, letting you plan for scale.
If data gets stale or corrupted, the server lets you clean up with delete_entities by primary key. It's a controlled way to keep your search index optimized.
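When you plan for scale, it helps to sanity-check the reported memory figure against a back-of-the-envelope number. The sketch below estimates the raw size of the vectors alone, assuming 32-bit floats. Treat it as a lower bound: index structures and scalar fields add overhead that this arithmetic ignores, and the entity count used here is made up.

```python
def raw_vector_bytes(entity_count, dimension, bytes_per_value=4):
    """Raw storage for the vectors only (float32 assumed by default)."""
    return entity_count * dimension * bytes_per_value

# Hypothetical: one million 512-dimensional embeddings.
estimate = raw_vector_bytes(1_000_000, 512)
print(f"{estimate / 1024**3:.2f} GiB")  # 1.91 GiB
```

If get_collection_stats reports far more memory than this, the difference is most likely index and metadata overhead, not a problem with your data.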

Real-World Use Cases

Identifying related documents for an API call.

A developer needs to find all technical guides similar to a user-provided document (the embedding). They prompt their agent: 'Find the top 5 most relevant docs in text_knowledge_base.' The agent runs search_vectors, gets the IDs, and returns them for the API call. Problem solved.

Finding user profiles by ID, then checking relevance.

An ML Engineer needs to check a specific user's profile (ID: 90210) but also wants related content. They first use get_entities on the user ID, and then they run query_entities with that ID plus a date filter. This limits their search space dramatically.

Auditing an entire dataset before launch.

A Search Architect is setting up a new product line. They start by running list_collections to verify all necessary datasets exist. Then, they run describe_collection on each one to confirm the dimensions and index types match the plan.
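The comparison step in an audit like this is easy to picture as code. Here is a small sketch that checks what describe_collection reported against a plan. The dictionary shape (`dimension`, `index_type`) is an assumption made for this example, not the server's actual response format.

```python
def audit_schemas(planned, actual):
    """Return a list of human-readable mismatches between plan and reality."""
    problems = []
    for name, plan in planned.items():
        found = actual.get(name)
        if found is None:
            problems.append(f"{name}: collection missing")
            continue
        for key in ("dimension", "index_type"):
            if found.get(key) != plan[key]:
                problems.append(
                    f"{name}: {key} is {found.get(key)!r}, expected {plan[key]!r}"
                )
    return problems

# Hypothetical plan and hypothetical describe_collection results.
planned = {
    "image_embeddings": {"dimension": 512, "index_type": "HNSW"},
    "text_knowledge_base": {"dimension": 768, "index_type": "IVF_FLAT"},
}
actual = {
    "image_embeddings": {"dimension": 512, "index_type": "HNSW"},
    "text_knowledge_base": {"dimension": 384, "index_type": "IVF_FLAT"},
}

print(audit_schemas(planned, actual))
# ['text_knowledge_base: dimension is 384, expected 768']
```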

Cleaning up old, irrelevant data.

A developer finds that a vector collection has records for users who left last year. They confirm the known bad records with get_entities, then pass those primary keys to delete_entities, keeping the index clean.

The Tradeoffs

Treating Milvus like a simple key-value store.

The user tries to run query_entities without first scoping it to a collection, or tries to search without providing any embedding vector at all. The call fails because the request is malformed.

→ Always start with list_collections to know what's available. If you are searching semantically, you must use search_vectors and provide a raw embedding JSON array.

Running heavy searches without checking schema first.

The user runs search_vectors but the wrong collection is targeted, or the vector dimensions don't match what was stored. This results in vague errors and wasted compute cycles.

→ First, use describe_collection to verify the exact dimension count of the target collection. Then, make sure your input JSON array matches that dimension.
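A quick guard like the one below catches a dimension mismatch before any call is made. It is a sketch, not part of the server: the function name is ours, and it assumes you already know the expected dimension from describe_collection.

```python
import json
import math

def encode_embedding(vector, expected_dim):
    """Validate an embedding and serialize it as a JSON array of floats."""
    if len(vector) != expected_dim:
        raise ValueError(f"expected {expected_dim} values, got {len(vector)}")
    values = [float(v) for v in vector]
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinite values")
    return json.dumps(values)

print(encode_embedding([0.1, 0.2, 0.3], expected_dim=3))  # [0.1, 0.2, 0.3]
```

Failing early with a clear message is cheaper than sending a malformed vector and decoding a vague error afterwards.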

Attempting multi-step updates manually.

The user tries to manually copy IDs from a list of results into a separate tool call for deletion. This is slow and error-prone across multiple tabs and tools.

→ Use the agent's ability to chain calls: get the IDs with get_entities, review them, and then pass that resulting ID set directly to delete_entities.
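The review step in that chain can be sketched as a small function: take the candidate IDs, look at what get_entities actually returned, and keep only the ones that pass your check. The entity shape (`id`, `status`) and the `should_delete` rule are hypothetical, chosen only to illustrate the flow.

```python
def plan_deletion(candidate_ids, fetched_entities, should_delete):
    """Return de-duplicated IDs that exist and pass the review check."""
    by_id = {entity["id"]: entity for entity in fetched_entities}
    confirmed = []
    seen = set()
    for pk in candidate_ids:
        if pk in seen:
            continue  # skip duplicates in the candidate list
        seen.add(pk)
        entity = by_id.get(pk)
        if entity is not None and should_delete(entity):
            confirmed.append(pk)
    return confirmed

# Hypothetical get_entities result: ID 103 was not found.
fetched = [
    {"id": 101, "status": "left"},
    {"id": 102, "status": "active"},
]
to_delete = plan_deletion(
    [101, 102, 102, 103], fetched, lambda e: e["status"] == "left"
)
print(to_delete)  # [101]
```

Only the confirmed list goes to delete_entities. Because deletion is irreversible, an ID that was never fetched or never reviewed should not reach that call.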

When It Fits, When It Doesn't

You should use this server if your core data problem requires semantic similarity—meaning you need to find results based on what the text or image means, not just what it contains. If simple filtering by a structured ID (like 'user:123') is enough, you can probably use other tools. But if you are connecting embeddings to an agent's decision-making process, this is essential.

Don't use this if your data lives in a traditional relational database and you only need simple lookups. For those cases, stick to standard SQL APIs. Use this when the relationship between pieces of data is measured by vector distance. If you just need to list collections or check stats, then list_collections and get_collection_stats are your starting points.

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Milvus. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This server provides 7 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.

Available Capabilities

delete_entities, describe_collection, get_collection_stats, get_entities, list_collections, query_entities, search_vectors

Checking a database schema shouldn't feel like reading an API manual.

Today, figuring out what data is even in the system means clicking through multiple dashboards: checking the collection names, then manually hopping to the documentation page for dimensions, and finally running separate queries just to see if the index was built correctly. It's a slow, multi-step process of guesswork.

With this MCP server, you ask your agent to audit the system. One prompt—'What collections do I have?'—gets you the list; another—'Show me the schema for X'—gives you the full technical breakdown instantly. You get immediate certainty.

Milvus MCP Server: Get data insights, not just vague descriptions.

Manual data management involves copying IDs from one search result to another tool call for cleanup or verification—a painful process of copy/pasting identifiers across multiple windows. It slows down iteration and makes debugging nearly impossible.

The MCP server handles the flow: it pulls results (e.g., using `get_entities`), packages those results, and sends them directly to a clean-up tool like `delete_entities`. You just tell your agent what you want gone, and it runs the entire sequence.

Common Questions About Milvus MCP

How do I list all vector collections in Milvus using list_collections?

Ask your agent to run list_collections. It returns a list of every collection name with its basic dimension count (e.g., 'image_embeddings' - Dim: 512). This is the best place to start.

What if my vector search results aren't accurate? Should I use query_entities?

If your semantic search (via search_vectors) gives you too many irrelevant matches, try using query_entities. This lets you filter the results using strict scalar fields like 'product_type: electronics,' which adds precision.

I need to see the schema for a collection. What tool should I use? Is it describe_collection?

Yes, describe_collection is the right one. It gives you the complete map of the collection's schema, including primary keys and index types. This lets you know what fields are available to filter on.

Can I check how much memory my Milvus database is using?

You use get_collection_stats. You tell the agent which collection you want to check, and it returns metrics like total entity count and current physical memory usage.

How do I safely remove records using the delete_entities tool?

You must use delete_entities and provide the specific primary keys of the vectors you want to remove. This action is irreversible, so double-check your list of IDs before running it.

I know the exact IDs I need. Should I use get_entities instead of searching?

Yes, if you have specific primary keys, get_entities is your tool. It retrieves unique vector items directly by their known ID, bypassing semantic search entirely.

What data format does the search_vectors tool require for its input?

The search_vectors tool requires a JSON array of floats whose length exactly matches the vector dimension of your collection. Your agent must feed it this raw embedding vector data to find nearest neighbors.

How do I filter search results by structured fields using query_entities?

Use query_entities and provide a scalar filter expression. This narrows results by specific metadata, like date ranges or tags, which you can pair with a vector match when meaning matters too.

How do I perform an ANN search through my agent?

Use the search_vectors tool by providing the collection name and a JSON float array matching the collection's dimensions. Your agent will perform an Approximate Nearest Neighbor search and return the most semantically relevant entities.

Can I filter results using structured fields instead of just vectors?

Yes. Use the query_entities tool with a Milvus-style filter expression. This allows you to retrieve entities based on primary keys, tags, or other scalar fields without necessarily performing a vector similarity search.
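If you have never written a Milvus-style filter expression, the sketch below shows the general shape: field comparisons joined with `and`, plus `in` for ID lists. The helper functions are ours, and the field names (`product_type`, `created_at`, `id`) are hypothetical; check describe_collection for the fields your collection actually has.

```python
import json

def eq(field, value):
    """Equality clause; json.dumps adds double quotes around strings."""
    return f"{field} == {json.dumps(value)}"

def between(field, low, high):
    """Half-open range clause: low <= field < high."""
    return f"({field} >= {json.dumps(low)} and {field} < {json.dumps(high)})"

def any_of(field, values):
    """Membership clause, e.g. for a list of primary keys."""
    return f"{field} in {json.dumps(list(values))}"

def all_of(*clauses):
    """Join clauses so that every one must match."""
    return " and ".join(clauses)

# Unix timestamps for the start of 2024 and 2025 (UTC).
expr = all_of(
    eq("product_type", "electronics"),
    between("created_at", 1704067200, 1735689600),
)
print(expr)
# product_type == "electronics" and (created_at >= 1704067200 and created_at < 1735689600)
print(any_of("id", [1, 2, 3]))
# id in [1, 2, 3]
```

In practice you describe the filter in plain language and the agent writes the expression. Seeing the target syntax makes it easier to spot when the agent has picked the wrong field or operator.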

How do I check the schema and dimension requirements for a Milvus collection?

The describe_collection tool retrieves the complete schema mapping. Your agent will report the required vector dimensions, index types, and primary key names, helping you ensure your search queries are compatible with the database logic.

Use it with your favorite AI tools

Connect this server to Cursor, Claude, VS Code, and more.

OpenAI Agents SDK (Python)

Google ADK (Python)

Pydantic AI (Python)

Vercel AI SDK (TypeScript)