# Milvus MCP

> Milvus MCP Server manages vector storage and retrieval. It lets your AI agent perform Approximate Nearest Neighbor (ANN) searches on vast embedding collections, filter results using structured data fields, and audit the entire database schema—all through natural conversation.

## Overview
- **Category:** loved-by-devs
- **Price:** Free
- **Tags:** vector-search, embeddings, similarity-search, ai-infrastructure, data-retrieval

## Description

Connect your AI agent to this Milvus server for full control over vector search and storage. Your agent can run complex operations on massive embedding collections without you writing a single line of SDK code. You can manage everything from semantic queries over your data to checking the row counts and memory metrics of each collection.

When you use `search_vectors`, your agent runs an Approximate Nearest Neighbor (ANN) search. It takes a raw embedding as a JSON array and finds the most semantically similar vectors in the collection you name, matching meaning instead of just keywords. `query_entities` covers the structured side: it filters entities by explicit scalar expressions, like limiting results to records where the tag is 'VIP' or a timestamp falls within a specific date range. Together, they help you find the needle in the haystack when both meaning and known fields matter.
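
To make "nearest neighbor" concrete, here's a minimal sketch of the exact ranking that an ANN search approximates. It is not how Milvus works internally (Milvus uses indexes to avoid scanning every vector), and the cosine metric, the `exact_top_k` helper, and the toy data are all assumptions for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def exact_top_k(query, entities, k=5):
    """Score every stored vector against the query and keep the best k.

    `entities` maps a primary key to its embedding (hypothetical layout).
    """
    scored = [(pk, cosine_similarity(query, vec)) for pk, vec in entities.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional collection; real embeddings have hundreds of dimensions.
toy_collection = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
top = exact_top_k([1.0, 0.0, 0.0], toy_collection, k=2)
```

An exact scan like this costs one comparison per stored vector, which is why large collections rely on approximate indexes instead.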

To check what data lives in your system, use `list_collections` to pull a list of every named collection in your Milvus instance. If you need the technical blueprint for any one of them, `describe_collection` pulls the full schema map, including index definitions, dimensions, and primary key rules, so you can see exactly what data structure you're dealing with. `get_collection_stats` reports real-time statistics on a collection, giving you its total row count and current memory usage so you can tell whether it's sized the way you expect.

Managing the actual records is straightforward too. If you need specific vector items, `get_entities` fetches them directly by their known primary keys. Need to clean up some junk? Use `delete_entities` to remove specific vector records by their unique primary keys. This action is permanent, so make sure you know exactly what you're deleting before you run it.

This suite of tools gives your agent complete control over the data lifecycle: you can find stuff semantically, filter it with structured fields, list all collections and check their schemas, monitor performance metrics like entity count and memory usage, and finally, pull or delete specific records by ID. It's a comprehensive way to operate on vector storage right through natural conversation.

## Tools

### delete_entities
Removes specific vector records from a collection using their unique primary keys.

### describe_collection
Retrieves the full schema map, including index definitions and dimensions for any specified Milvus collection.

### get_collection_stats
Pulls real-time statistics on a collection, reporting its total row count and current memory usage.

### get_entities
Fetches specific vector records directly by their known primary keys.

### list_collections
Queries and lists all the named collections currently tracked inside your Milvus Vector Database instance.

### query_entities
Filters entities by using specific scalar expressions, allowing you to query structured data fields like tags or IDs.

### search_vectors
Finds the nearest vector neighbors by taking a raw embedding JSON array and searching the collection you specify.

## Prompt Examples

**Prompt:** 
```
List all vector collections in my Milvus instance
```

**Response:** 
```
I've retrieved 3 collections from your Milvus instance: 'image_embeddings' (Dim: 512), 'text_knowledge_base' (Dim: 1536), and 'user_profiles' (Dim: 768). Which one would you like to check the stats or schema for?
```

**Prompt:** 
```
Search collection 'text_knowledge_base' for vector: [0.1, -0.2, ...]
```

**Response:** 
```
Executing ANN search… I've identified the 5 nearest neighbors. The top match has a similarity score of 0.94 and maps to entity ID '12345'. Other results include related technical documentation fragments. Would you like the full scalar data for these entities?
```

**Prompt:** 
```
Show me the row count and memory stats for collection 'image_embeddings'
```

**Response:** 
```
Retrieving stats… The 'image_embeddings' collection contains 1,250,000 entities. It is currently occupying approximately 2.4 GB of memory. All indices are loaded and healthy. No anomalous data distribution detected.
```
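
The numbers in these sample responses are illustrative, but you can sanity-check a memory figure with quick arithmetic. The sketch below assumes 32-bit float vectors and counts only raw vector data, ignoring index structures and scalar fields; the `raw_vector_bytes` helper is hypothetical.

```python
def raw_vector_bytes(entity_count, dimension, bytes_per_value=4):
    """Raw storage for the vectors alone (float32 assumed by default)."""
    return entity_count * dimension * bytes_per_value

# 'image_embeddings' from the examples above: 1,250,000 entities at 512 dimensions.
raw = raw_vector_bytes(1_250_000, 512)
gib = raw / 2**30  # convert bytes to GiB
```

Under those assumptions the vectors alone come to about 2.38 GiB, which is in line with the roughly 2.4 GB in the sample response. Real usage will differ once indexes and other fields are counted.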

## Capabilities

### Perform Semantic Vector Search
The agent runs Approximate Nearest Neighbor (ANN) searches, identifying the most semantically relevant data points based on raw embedding vectors.

### Filter by Structured Data Fields
You narrow search results using explicit scalar expressions to target entities based on known fields like IDs or dates, combining structure with semantic search.

### Audit Collection Schema and Indexes
The agent lists all vector collections and retrieves detailed schema maps, including dimensions and primary key definitions for each one.

### Monitor Database Health Metrics
You pull real-time statistics on a collection, getting the current entity count and physical memory usage to check performance.

### Manage Specific Records (Fetch and Delete)
The agent can fetch specific vector items by their primary key or irreversibly delete records using that identifier.

## Use Cases

### Identifying related documents for an API call.
A developer needs to find all technical guides similar to a user-provided document, represented by its embedding. They prompt their agent: 'Find the top 5 most relevant docs in `text_knowledge_base`.' The agent runs `search_vectors`, gets the IDs, and returns them for the API call. Problem solved.

### Finding user profiles by ID, then checking relevance.
An ML Engineer needs to check a specific user's profile (ID: 90210) but also wants related content. They first use `get_entities` on the user ID, and then they run `query_entities` with that ID plus a date filter. This limits their search space dramatically.

### Auditing an entire dataset before launch.
A Search Architect is setting up a new product line. They start by running `list_collections` to verify all necessary datasets exist. Then, they run `describe_collection` on each one to confirm the dimensions and index types match the plan.
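
If you want to double-check what the agent reports, the comparison itself is simple. This sketch assumes you've copied the `describe_collection` results into plain dictionaries; the `dimension` and `index_type` keys, the sample values, and the `find_schema_mismatches` helper are all hypothetical, not the server's actual output format.

```python
def find_schema_mismatches(expected, described):
    """Compare planned settings with what was reported for each collection."""
    problems = []
    for name, plan in expected.items():
        actual = described.get(name)
        if actual is None:
            problems.append(f"{name}: collection is missing")
            continue
        for key in ("dimension", "index_type"):
            if actual.get(key) != plan[key]:
                problems.append(
                    f"{name}: {key} is {actual.get(key)!r}, expected {plan[key]!r}"
                )
    return problems

# The launch plan versus the values reported back (sample data only).
plan = {
    "image_embeddings": {"dimension": 512, "index_type": "HNSW"},
    "text_knowledge_base": {"dimension": 1536, "index_type": "HNSW"},
}
reported = {
    "image_embeddings": {"dimension": 512, "index_type": "HNSW"},
    "text_knowledge_base": {"dimension": 768, "index_type": "HNSW"},
}
problems = find_schema_mismatches(plan, reported)
```

An empty list means the collections match the plan; anything else tells you which collection and setting to fix before launch.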

### Cleaning up old, irrelevant data.
A developer finds that a vector collection still holds records for users who left last year. They confirm the suspect records with `get_entities` using the known primary keys, then pass those IDs to `delete_entities`, keeping the index clean.

## Benefits

- You run ANN searches using `search_vectors`, getting immediate semantic matches without writing a single line of data retrieval code. This is pure chat interaction.
- The `query_entities` tool adds structure to meaning. You can filter by date, tag, or ID with a scalar expression and pair that with vector search, which is far more precise than simple keyword searches.
- Need to know if your collection schema changed? Use `describe_collection`. It audits dimensions and index types instantly so you don't run into runtime errors later.
- You monitor resource health using `get_collection_stats`. This tells you exactly how many entities a collection holds and how much memory it’s eating, letting you plan for scale.
- If data gets stale or corrupted, the server lets you clean up with `delete_entities` by primary key. It's a controlled way to keep your search index optimized.

## How It Works

The bottom line: your AI client runs complex vector database operations without you writing any Python code or managing connection details.

1. Subscribe to the server and provide your Milvus Base URL and API Key (or Zilliz Cloud Token).
2. Your AI client uses natural conversation to trigger a tool, such as `list_collections` or `search_vectors`, passing necessary context like vectors or filters.
3. The MCP Server executes the command against your Milvus instance and returns structured data—like collection names or nearest neighbor results—directly to your agent.
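
Under the hood, step 2 is a JSON-RPC 2.0 message. MCP clients invoke tools with a `tools/call` request, and your AI client builds it for you. The sketch below shows roughly what that message could look like; the argument keys (`collection_name`, `vector`, `limit`) are assumptions, so check the tool's published input schema for the real names.

```python
import json

def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Argument keys below are hypothetical placeholders, not the server's schema.
request = build_tool_call(
    1,
    "search_vectors",
    {"collection_name": "text_knowledge_base", "vector": [0.1, -0.2, 0.3], "limit": 5},
)
payload = json.dumps(request)  # the text that would travel over the transport
```

You never write this by hand in normal use; it's here to show that "natural conversation" still ends up as a structured, inspectable request.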

## Frequently Asked Questions

**How do I list all vector collections in Milvus using list_collections?**
Ask your agent to run `list_collections`. It returns each collection's name and vector dimension (e.g., 'image_embeddings' - Dim: 512). This is the best place to start.

**What if my vector search results aren't accurate? Should I use query_entities?**
If your semantic search (via `search_vectors`) returns too many irrelevant matches, try `query_entities`. It filters on strict scalar fields with an expression such as `product_type == "electronics"`, which adds precision.

**I need to see the schema for a collection. What tool should I use? Is it describe_collection?**
Yes, `describe_collection` is the right one. It gives you the complete map of the collection's schema, including primary keys and index types. This lets you know what fields are available to filter on.

**Can I check how much memory my Milvus database is using?**
Use `get_collection_stats`. Tell the agent which collection you want to check, and it returns metrics like total entity count and current memory usage for that collection.

**How do I safely remove records using the delete_entities tool?**
You must use `delete_entities` and provide the specific primary keys of the vectors you want to remove. This action is irreversible, so double-check your list of IDs before running it.
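
One way to double-check is to fetch first and delete only what actually came back. This is a minimal sketch of that check, assuming the `get_entities` results are available as a list of dictionaries with an `id` field; that shape and the `plan_delete` helper are hypothetical.

```python
def plan_delete(requested_ids, fetched_entities):
    """Split requested primary keys into ones that exist and ones that don't."""
    found = {entity["id"] for entity in fetched_entities}
    to_delete = [pk for pk in requested_ids if pk in found]
    missing = [pk for pk in requested_ids if pk not in found]
    return to_delete, missing

# Pretend get_entities returned two of the three requested records.
fetched = [{"id": 101}, {"id": 102}]
to_delete, missing = plan_delete([101, 102, 999], fetched)
```

Any ID that lands in `missing` is worth a second look before you delete anything, since it may be a typo for a record you meant to keep or remove.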

**I know the exact IDs I need. Should I use get_entities instead of searching?**
Yes, if you have specific primary keys, `get_entities` is your tool. It retrieves unique vector items directly by their known ID, bypassing semantic search entirely.

**What data format does the search_vectors tool require for its input?**
The `search_vectors` tool requires a strict JSON array that matches the exact dimensions of your collection. Your agent must feed it this raw embedding vector data to find nearest neighbors.
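
If you're preparing the embedding yourself, a quick check before handing it to the agent can save a failed call. This sketch validates a JSON array against an expected dimension; the `parse_embedding` helper is hypothetical, and the expected dimension is something you'd read from `describe_collection`.

```python
import json
import math

def parse_embedding(raw_json, expected_dim):
    """Parse a JSON array and confirm it is a finite numeric vector of the right size."""
    values = json.loads(raw_json)
    if not isinstance(values, list):
        raise ValueError("embedding must be a JSON array")
    if len(values) != expected_dim:
        raise ValueError(f"expected {expected_dim} values, got {len(values)}")
    for value in values:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"non-numeric value in embedding: {value!r}")
        if not math.isfinite(value):
            raise ValueError("embedding contains NaN or infinity")
    return [float(value) for value in values]

vector = parse_embedding("[0.1, -0.2, 0.3]", expected_dim=3)
```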

**How do I filter search results by structured fields using query_entities?**
Use `query_entities` with a scalar filter expression. This lets you narrow down results by specific metadata, like date ranges or tags.

**How do I perform an ANN search through my agent?**
Use the `search_vectors` tool by providing the collection name and a JSON float array matching the collection's dimensions. Your agent will perform an Approximate Nearest Neighbor search and return the most semantically relevant entities.

**Can I filter results using structured fields instead of just vectors?**
Yes. Use the `query_entities` tool with a Milvus-style filter expression. This allows you to retrieve entities based on primary keys, tags, or other scalar fields without necessarily performing a vector similarity search.
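
For a feel of what a Milvus-style filter expression looks like, here's a small sketch that assembles one from parts. The helper functions and the field names (`tag`, `created_at`) are hypothetical; the operators (`==`, `>=`, `<=`, `and`) follow Milvus's boolean expression syntax.

```python
import json

def eq(field, value):
    """Equality clause; json.dumps quotes and escapes string values."""
    return f"{field} == {json.dumps(value)}"

def between(field, low, high):
    """Inclusive range clause, parenthesized so it combines safely."""
    return f"({field} >= {json.dumps(low)} and {field} <= {json.dumps(high)})"

def all_of(*clauses):
    """Join clauses so that every one of them must hold."""
    return " and ".join(clauses)

# VIP records created during 2024, with timestamps stored as Unix seconds (assumed).
expr = all_of(eq("tag", "VIP"), between("created_at", 1704067200, 1735689599))
```

In practice you can describe the filter in plain language and let the agent write the expression, but knowing the shape makes it easier to spot a wrong one.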

**How do I check the schema and dimension requirements for a Milvus collection?**
The `describe_collection` tool retrieves the complete schema mapping. Your agent will report the required vector dimensions, index types, and primary key names, helping you ensure your search queries are compatible with the database logic.