Zilliz Cloud MCP. Run complex vector searches via natural conversation.

Zilliz Cloud MCP Server lets your AI agent manage vector collections for high-performance similarity searches. Use natural language to list, create, drop, insert data, and run complex Approximate Nearest Neighbor (ANN) queries using customizable metrics.

It handles the entire lifecycle of your vector database right from your chat client.

What your AI agents can do

Manage Vector Schema

You can define, read, and delete the entire structure of your vector collections.

Perform Similarity Searches

The server runs high-speed Approximate Nearest Neighbor (ANN) searches to find data vectors closest in meaning or space.

Filter Data by Metadata

You can query entities using specific filters—like date ranges, user IDs, or product categories—before running a search.

Control Memory Resources

The agent loads and releases collections to keep the cluster efficient and ensure fast search availability.

Seed and Prune Data

You can insert new vector data points or delete outdated records from your existing collections.

Supported MCP Clients

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

+ other MCP clients

Free for Subscribers

Zilliz Cloud MCP Server: 10 Tools for Vector Operations

Use these tools to manage your vector database's entire life cycle—from creating collections and inserting data points to running complex similarity searches.

create_collection

Builds a brand new vector container within your cluster.

delete_entities

Removes specific data points from an existing collection.

describe_collection

Fetches the schema and status details for a specified vector collection.

drop_collection

Permanently removes an entire vector container from your cluster.

insert_entities

Adds new vectors and associated metadata into a collection.

list_collections

Gets a list of every vector container available in your cluster.

load_collection

Moves an entire collection into active memory for faster searching.

query_entities

Finds records by applying complex filters to the metadata, ignoring vector similarity.

release_collection

Removes a collection from active memory to free up cluster resources.

search_vectors

Performs the core function: finding vectors that are mathematically similar using customizable metrics.

What you can do with this MCP connector

This MCP server lets your AI agent handle every part of vector collection management for high-performance similarity searches. You'll use natural language instructions to list, build, drop, insert data, and run complex Approximate Nearest Neighbor (ANN) queries using custom metrics. It handles the entire lifecycle of your vector database right from your chat client.

Managing Your Vector Schema and Containers

To see what containers you've got running, your agent calls list_collections, which gives you a full roster of every single vector container in the cluster. If you need to know the specific rules or status of one container, describe_collection fetches that schema detail for you. You can build a brand new vector storage area using create_collection.

When you're done with a container and want it gone forever, calling drop_collection permanently removes it from your cluster. If you only want to remove specific records and keep the container itself, use delete_entities instead.
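The difference between the two destructive tools can be pictured with a small in-memory model. This is an illustration only: `ToyCluster` and its method signatures are hypothetical stand-ins named after the tools, not the server's actual interface.

```python
class ToyCluster:
    """Hypothetical in-memory stand-in for a cluster; not the real server API."""

    def __init__(self):
        # collection name -> {primary key -> entity dict}
        self._collections = {}

    def create_collection(self, name):
        self._collections[name] = {}

    def list_collections(self):
        return sorted(self._collections)

    def insert_entities(self, name, entities):
        for entity in entities:
            self._collections[name][entity["id"]] = entity

    def delete_entities(self, name, ids):
        # Removes only the listed records; the container itself survives.
        for entity_id in ids:
            self._collections[name].pop(entity_id, None)

    def drop_collection(self, name):
        # Removes the container and everything stored in it.
        del self._collections[name]

    def count(self, name):
        return len(self._collections[name])


cluster = ToyCluster()
cluster.create_collection("docs")
cluster.insert_entities("docs", [{"id": 1}, {"id": 2}, {"id": 3}])

cluster.delete_entities("docs", ids=[1, 2])
remaining = cluster.count("docs")                # one record left
listed_after_delete = cluster.list_collections()  # collection still listed

cluster.drop_collection("docs")
listed_after_drop = cluster.list_collections()    # collection is gone
```

After `delete_entities` the collection is still listed with one record in it; after `drop_collection` it no longer appears at all.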

Seeding and Pruning Data Points

When you get fresh data, your agent uses insert_entities to add new vectors and their associated metadata into a specified collection. For keeping your database clean—and this is key for performance—you can delete old or outdated records using delete_entities. You're always in control of what stays in the system.

Advanced Querying and Search Mechanisms

The server runs high-speed Approximate Nearest Neighbor (ANN) searches when you call search_vectors, finding the vectors that are mathematically closest to your query under a metric you choose. If you don't need a similarity search (maybe you just want records that match a date range or a user ID), use query_entities instead: it applies filters to the metadata and ignores vector similarity entirely.
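To make "customizable metrics" concrete, here is a minimal sketch of how the choice of metric changes which vectors count as closest. It performs an exact, brute-force comparison rather than ANN, and the metric labels `L2`, `IP`, and `COSINE` are assumed to mirror the names commonly used for Euclidean distance, inner product, and cosine similarity; the function names are illustrative.

```python
import math


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    return inner_product(a, b) / norm


def exact_search(query, vectors, metric="COSINE", limit=2):
    """Rank every stored vector against the query (exact, not approximate)."""
    if metric == "L2":
        scored = [(key, l2_distance(query, vec)) for key, vec in vectors.items()]
        scored.sort(key=lambda pair: pair[1])  # smaller distance = closer
    else:
        score = cosine_similarity if metric == "COSINE" else inner_product
        scored = [(key, score(query, vec)) for key, vec in vectors.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)  # larger score = closer
    return [key for key, _ in scored[:limit]]


vectors = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [5.0, 5.0],
}
query = [1.0, 0.0]

top_cosine = exact_search(query, vectors, metric="COSINE")
top_l2 = exact_search(query, vectors, metric="L2")
top_ip = exact_search(query, vectors, metric="IP")
```

With this toy data, cosine and L2 both rank `a` and `b` first, while inner product promotes the long vector `d`, which is why the metric should match how the embeddings were produced.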

Resource and Memory Control

To keep your cluster running fast, you're in charge of memory management. If a collection is needed often, your agent can call load_collection to move that entire container into active memory, making search results faster. When the job's done and you need those resources back for other stuff, you use release_collection to remove the collection from active memory, freeing up cluster space.
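The load, search, release sequence can be sketched as a context manager that always hands memory back. `RecordingClient` is a hypothetical stand-in that only records which tools were called; the real tools' parameters may differ.

```python
from contextlib import contextmanager


class RecordingClient:
    """Hypothetical stand-in that only records which tools were called."""

    def __init__(self):
        self.calls = []

    def load_collection(self, name):
        self.calls.append(("load_collection", name))

    def release_collection(self, name):
        self.calls.append(("release_collection", name))

    def search_vectors(self, name, vector):
        self.calls.append(("search_vectors", name))
        return []


@contextmanager
def loaded(client, name):
    client.load_collection(name)
    try:
        yield client
    finally:
        # Runs even if the search raises, so memory is always handed back.
        client.release_collection(name)


client = RecordingClient()
with loaded(client, "product_docs") as c:
    c.search_vectors("product_docs", [0.1, 0.2, 0.3])

call_order = [tool for tool, _ in client.calls]
```

The `finally` clause is the important part: the release step runs whether the search succeeds or fails.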

How Zilliz Cloud MCP Works

1. Subscribe to the Zilliz Cloud MCP Server and provide your unique Cluster Endpoint and API Key.
2. Your AI agent receives permission to interact with the vector database via explicit tools like search_vectors or list_collections.
3. You ask your agent a natural language question (e.g., 'Find all documents about Q3 earnings for users in California'). The agent selects and runs the necessary tool calls to get the result.

The bottom line is that your AI client handles all the backend API calls, letting you manage vector data without writing a single line of code or managing complex boilerplate sequences.
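For the curious, this is roughly what a tool call looks like on the wire. MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request; the sketch below builds one. The argument names (`collection_name`, `limit`) and the collection name are assumptions for illustration, since the tools' actual input schemas are not shown on this page.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names below are assumptions for illustration only.
request = build_tool_call(
    request_id=1,
    tool_name="search_vectors",
    arguments={"collection_name": "earnings_docs", "limit": 5},
)
payload = json.dumps(request)
```

Your AI client assembles and sends messages like this on your behalf, which is why you never have to write them yourself.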

Who Is Zilliz Cloud MCP For?

This toolset is for developers and engineers whose job requires making sense of massive amounts of unstructured data. It's for the AI Engineer who gets tired of spending hours writing repetitive Python scripts just to test a new vector index, or the Data Scientist who needs to quickly inspect cluster health without dropping into a terminal.

AI Engineer

Uses the server to rapidly prototype RAG pipelines, testing schema changes and similarity search results conversationally.

Data Scientist

Manages cluster memory by calling load_collection and release_collection, and checks a collection's schema and status with describe_collection before running an analysis.

Backend Developer

Integrates vector database management into a production workflow, letting the agent handle the sequencing of insert_entities and search_vectors calls for end-user requests.

What Changes When You Connect

Need to find related documents? Use search_vectors to run high-speed Approximate Nearest Neighbor (ANN) queries, finding semantic matches faster than keyword search alone. You just describe the concept; the server does the math.
Keeping your cluster clean is key. If old data points are cluttering things up, use delete_entities or drop_collection to prune records and keep only what's relevant for current tasks.
Don't start coding just to check the schema. Use describe_collection to instantly verify a collection's fields, dimensions, and status right in your chat window before writing any code.
Running complex searches often requires context switching. With load_collection and release_collection, you manage memory resources directly through conversation, ensuring peak performance when needed.
Sometimes the metadata is more important than the vector. Use query_entities to narrow down records using boolean logic (e.g., 'Product X AND date > 2023') before running a similarity search.
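As a rough illustration of the boolean filtering mentioned in the last point, the helper below joins field conditions into a single expression string. The syntax is modelled on Milvus-style boolean expressions; the field names are hypothetical, and whether query_entities accepts exactly this form is an assumption.

```python
def build_filter(**conditions):
    """Join field conditions into one boolean expression with `and`.

    Each value is an (operator, operand) pair; string operands are double-quoted.
    """
    parts = []
    for field, (operator, operand) in conditions.items():
        literal = f'"{operand}"' if isinstance(operand, str) else str(operand)
        parts.append(f"{field} {operator} {literal}")
    return " and ".join(parts)


expression = build_filter(product=("==", "Product X"), year=(">", 2023))
# expression is: product == "Product X" and year > 2023
```

This sketch does not escape quotes inside string operands, so it is only suitable for trusted, simple values.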

Real-World Use Cases

Debugging a New Knowledge Base

A developer is setting up a new knowledge base but isn't sure if the data was indexed correctly. Instead of writing multiple GET requests by hand, they ask their agent to run list_collections, then use describe_collection on the target collection. This confirms the correct schema and dimensions are available before any search runs.

Running a Targeted Product Search

A user asks, 'Show me all high-priority support tickets mentioning billing issues that happened last month.' The agent first uses query_entities to filter by metadata (high-priority, billing, last month), then runs search_vectors on the resulting subset for true semantic relevance.
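The filter-first, rank-second idea can be sketched in a few lines. This is a toy, in-memory version under stated assumptions: the ticket fields and vectors are made up, and ranking uses a plain inner product rather than the server's ANN index.

```python
def filter_then_search(entities, predicate, query, limit=2):
    """Keep entities that pass the metadata predicate, then rank by inner product."""
    candidates = [e for e in entities if predicate(e)]
    candidates.sort(
        key=lambda e: sum(q * v for q, v in zip(query, e["vector"])),
        reverse=True,
    )
    return [e["id"] for e in candidates[:limit]]


tickets = [
    {"id": 1, "priority": "high", "topic": "billing", "vector": [0.9, 0.1]},
    {"id": 2, "priority": "low", "topic": "billing", "vector": [1.0, 0.0]},
    {"id": 3, "priority": "high", "topic": "billing", "vector": [0.2, 0.8]},
    {"id": 4, "priority": "high", "topic": "login", "vector": [1.0, 0.0]},
]

matches = filter_then_search(
    tickets,
    predicate=lambda t: t["priority"] == "high" and t["topic"] == "billing",
    query=[1.0, 0.0],
)
```

Tickets 2 and 4 have the best raw similarity to the query, but the metadata filter excludes them, so only tickets 1 and 3 are ranked.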

Cleaning Up Stale Data

The ops team needs to decommission an old project's data. They first run list_collections to find the container, then use drop_collection on the specific name. This permanently wipes all associated vectors and metadata without manual API calls.

Optimizing Search Performance

Before a massive search operation, the agent is instructed to call load_collection on the target collection. After the search_vectors call completes, it calls release_collection, returning the cluster to its baseline memory state.

The Tradeoffs

Writing custom resource cleanup code

The developer writes a complex try/finally block in Python just to ensure they call release_collection if the search fails halfway through. This adds boilerplate and is prone to bugs.

→ Instead of writing that logic, let your agent handle it. You tell the agent: 'Load collection X, run search, then release collection X.' The conversation manages the sequence automatically.

Assuming simple keyword matching works

The user tries to find documents using only basic text queries or SQL-like query_entities filters, without accounting for vector similarity. This misses data that is conceptually similar but worded differently.

→ You must use search_vectors. It's designed specifically to compare the meaning of your input query against vectors in the collection, giving you true semantic results.

Ignoring Collection Status

The code attempts an insert_entities call on a collection that was never created, or that has already been dropped, causing runtime errors.

→ Always start by running list_collections to verify the container exists. If it needs definition, use create_collection first.
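That check-before-insert rule amounts to a small decision, sketched here as a function that returns the ordered tool calls to make. The function name and collection names are hypothetical; only the tool names come from this page.

```python
def plan_insert(existing_collections, target):
    """Return the ordered tool calls needed before data can be inserted."""
    steps = ["list_collections"]
    if target not in existing_collections:
        # The container is missing, so it has to be defined first.
        steps.append("create_collection")
    steps.append("insert_entities")
    return steps


plan_missing = plan_insert(existing_collections=["faq"], target="tickets")
plan_present = plan_insert(existing_collections=["faq", "tickets"], target="tickets")
```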

When It Fits, When It Doesn't

Use this server if your core problem involves finding relevant data based on meaning (i.e., semantic search) or managing the full lifecycle of vector indices. This is fundamentally different from traditional SQL databases because you aren't querying rows; you're comparing high-dimensional vectors. Use it when you need to know 'What does this mean?' instead of 'Does this match these exact words?'.

Don't use this if your application only requires simple, structured lookups (e.g., checking a user ID or fetching a record by primary key). For those tasks, a vector database is overkill, and a standard relational database is simpler and faster. This toolset is for the advanced data layer: when you need to combine filtering (query_entities) with deep semantic searching (search_vectors).

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Zilliz Cloud. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on every call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This server provides 10 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.

Available Capabilities

create_collection delete_entities describe_collection drop_collection insert_entities list_collections load_collection query_entities release_collection search_vectors

Debugging vector pipelines shouldn't feel like writing API documentation.

Right now, testing a new retrieval pipeline means jumping between code editors and terminal windows. You write the insert logic and run it, check `list_collections`, then try to query by filtering metadata, which requires yet another set of boilerplate calls. If any step fails (say, your collection isn't loaded), you have to work out manually which tool call is missing.

With this MCP server, the whole process happens conversationally. You simply tell your agent: 'First, create the collection with these fields. Then, insert this data. Finally, search it.' It orchestrates the entire sequence for you. That's what saves hours.

Zilliz Cloud MCP Server lets you manage vector collections.

The tedious manual steps—checking if a collection exists, remembering to load it before searching, and then manually releasing it when done—all vanish. You don't worry about the state machine; you just describe the outcome you want.

It’s pure intent. The agent translates your high-level goal ('I need to search my product docs for anything related to warranty claims') directly into a sequence of calls (`load_collection` -> `query_entities` -> `search_vectors`) without you ever seeing the underlying API complexity.

Common Questions About Zilliz Cloud MCP

How do I check if my collection exists using zilliz-cloud?

Run list_collections. This tool gets and displays every container name in your cluster, letting you see exactly what's available without needing to know the full schema yet.

Is `search_vectors` the only way to search data?

No. You use query_entities when you need to filter results based on non-vector attributes, like filtering by 'product type' or 'date.' Then, you combine that with search_vectors for a targeted result set.

What is the difference between `insert_entities` and `query_entities`?

insert_entities adds data to the collection. query_entities reads data, but only using metadata filters—it doesn't perform a vector search.

Do I need to use `load_collection` every time I run a search?

Not necessarily every time. A collection has to be loaded into memory before it can be searched, so call load_collection before your first search. It then stays loaded, and available for fast searches, until you call release_collection.

When using `list_collections`, what permissions are required for my agent to access the data?

Your AI client needs read/write credentials configured in your Vinkius subscription. These specific API keys grant the necessary rights to list, describe, and manipulate collections within the Zilliz cluster.

How does using `load_collection` optimize performance compared to just running `search_vectors`?

Loading a collection places its vector data and index in memory ahead of time, so that when you call search_vectors the Approximate Nearest Neighbor (ANN) calculations do not have to wait on storage.

Before running `insert_entities`, how do I validate the required schema and dimensions using `describe_collection`?

The describe_collection tool returns the full, structured schema. This output specifies every field name, its expected data type (like FloatVector), and the precise dimensionality needed for successful insertion.
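A pre-insert check along these lines might look like the sketch below. The schema dictionary is a hypothetical simplification of what describe_collection returns (the real output shape is not shown on this page), and the field names are made up.

```python
def validate_entities(schema, entities):
    """Return a list of readable problems; an empty list means the batch looks insertable."""
    problems = []
    for index, entity in enumerate(entities):
        for field in schema["fields"]:
            name = field["name"]
            if name not in entity:
                problems.append(f"entity {index}: missing field '{name}'")
            elif field["type"] == "FloatVector" and len(entity[name]) != field["dim"]:
                problems.append(
                    f"entity {index}: '{name}' has {len(entity[name])} dims, "
                    f"expected {field['dim']}"
                )
    return problems


# Hypothetical, simplified schema; the real describe_collection output may differ.
schema = {
    "fields": [
        {"name": "id", "type": "Int64"},
        {"name": "embedding", "type": "FloatVector", "dim": 3},
    ]
}
batch = [
    {"id": 1, "embedding": [0.1, 0.2, 0.3]},
    {"id": 2, "embedding": [0.1, 0.2]},
    {"embedding": [0.4, 0.5, 0.6]},
]
problems = validate_entities(schema, batch)
```

Here the second entity is flagged for having two dimensions instead of three, and the third for its missing `id` field.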

If I run a large batch of deletions using `delete_entities`, how does the API handle failed records?

The system doesn't fail entirely; it returns a detailed log listing all IDs that could not be deleted. Your agent can then process this specific list to retry or manually skip problematic entities.

How do I find my Cluster Endpoint?

You can find your Cluster Endpoint in the Zilliz Cloud Console under the 'Cluster Details' page. It typically looks like https://in01-xxxxxxxxxxxx.vectordb.zillizcloud.com.

Why do I need to 'load' a collection before searching?

Zilliz requires collections to be loaded into memory to perform high-performance similarity searches. Use the load_collection tool to make your data available for search.

Can I filter my vector search using metadata?

Yes, Zilliz supports hybrid search. You can use the query_entities tool for metadata-only filtering or include filtering expressions in your search_vectors JSON configuration.

Use it with your favorite AI tools

Connect this server to Cursor, Claude, VS Code, and more.

OpenAI Agents SDK (Python)

Google ADK (Python)

Pydantic AI (Python)

Vercel AI SDK (TypeScript)