ClickHouse (Vector Search) MCP. Query Vector Embeddings and Data via Natural Language
Works with every AI agent you already use, and any MCP-compatible client
Just plug in your AI agents and start using Vinkius.
ClickHouse (Vector Search) MCP Server lets you manage vector embeddings and run SQL queries directly from your AI client. List databases, inspect schemas, and perform high-speed vector searches using cosineDistance or L2Distance metrics.
It connects your analytical data and vector stores to your workflow without boilerplate code.
What your AI agents can do
The list_databases tool shows all top-level schemas available in the ClickHouse cluster.
The list_tables tool retrieves the exact names of tables contained inside a specified database.
The describe_table tool extracts detailed column schemas, including vector data types, for a given table.
The execute_sql tool runs arbitrary DML, DDL, or SELECT statements against the cluster.
The get_table_stats tool retrieves internal structural states, like row counts and compression ratios, for cluster auditing.
The get_version tool identifies the specific version and binary limits of the running ClickHouse instance.
The vector_search tool identifies records based on mathematical distance metrics (cosineDistance or L2Distance) in vector embeddings.
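For readers who want to see what the two metrics compute, here is a minimal pure-Python sketch. It follows the standard definitions (cosine distance as one minus cosine similarity, L2 distance as Euclidean distance), which is also how ClickHouse documents `cosineDistance` and `L2Distance`. The helper names are illustrative and are not part of the server.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def l2_distance(a, b):
    """Return the Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Cosine distance ignores vector length and compares direction only, while L2 distance is sensitive to magnitude, so the right choice depends on how the embeddings were produced.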
ClickHouse (Vector Search) MCP Server: 7 Tools for Data & Vectors
These tools allow your AI agent to manage the ClickHouse cluster—from listing schemas to running complex SQL and performing semantic vector searches.
describe_table
Extracts the schema details for a table, showing column names and data types.
execute_sql
Runs any SQL query (SELECT, DML, DDL) against the data cluster.
get_table_stats
Pulls internal structural data, like row counts and compression ratios, for a specific table.
get_version
Identifies the precise version and feature set of the underlying ClickHouse instance.
list_databases
Shows a list of all top-level databases available in the ClickHouse cluster.
list_tables
Returns a list of all tables contained within a specified database.
vector_search
Performs a high-dimensional semantic search using specified vector embeddings and distance metrics.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on every call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with ClickHouse (Vector Search), then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 4,700+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
What you can do with this MCP connector
ClickHouse (Vector Search) MCP Server lets your AI client manage vector embeddings and run SQL queries straight from your workflow. You'll use this server to list databases, inspect schemas, and run high-speed vector searches using cosineDistance or L2Distance metrics, connecting your analytical data and vector stores to your agent without writing boilerplate code.
Your agent uses the list_databases tool to show all top-level schemas available in the ClickHouse cluster, and the list_tables tool returns the exact names of all tables inside a specified database. You can inspect any table's structure with describe_table, which pulls detailed column schemas, including specialized vector data types, for a given table.
To run custom queries, execute_sql lets your agent fire off any DML, DDL, or SELECT statement against the cluster. You can get a read on the table's health using get_table_stats, which pulls internal structural states like row counts and compression ratios for auditing. To check the cluster's setup, get_version identifies the specific version and binary limits of the running ClickHouse instance.
For semantic searching, the vector_search tool identifies records in vector embeddings based on mathematical distance metrics; you can specify cosineDistance or L2Distance. Your agent gathers context, executes the query, and uses the result to compose a final answer.
How ClickHouse (Vector Search) MCP Works
1. First, tell your agent to scope the task by running `list_databases` to see available schemas.
2. Next, use `list_tables` and `describe_table` to narrow down the exact table and check the column structure, especially the vector columns.
3. Finally, the agent calls `execute_sql` for standard queries or `vector_search` for semantic matching, using the validated schema.
The bottom line is that your AI agent handles the complex sequence of discovery and execution calls for you, letting you query complex data and vectors using plain language.
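The discovery-then-execute sequence described in this section can be sketched as a list of MCP `tools/call` requests. The JSON-RPC envelope follows the Model Context Protocol; the argument keys (`database`, `table`, `query`) are assumptions for illustration, since this page does not document the server's parameter names.

```python
def tool_call(request_id, name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request in the shape MCP defines."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def discovery_plan(database, table, sql):
    """Return the ordered requests an agent might send for one question.

    The argument keys below are hypothetical; check the tool schemas
    your client reports before relying on them.
    """
    steps = [
        ("list_databases", {}),
        ("list_tables", {"database": database}),
        ("describe_table", {"database": database, "table": table}),
        ("execute_sql", {"query": sql}),
    ]
    return [
        tool_call(request_id, name, arguments)
        for request_id, (name, arguments) in enumerate(steps, start=1)
    ]
```

In practice your MCP client builds and sends these requests for you; the sketch only makes the ordering explicit.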
Who Is ClickHouse (Vector Search) MCP For?
Data Analysts, AI Developers, and Database Administrators. If you spend time writing boilerplate SQL to test a new data feature or debug a vector similarity search, this saves time. It’s for anyone who needs to query massive, structured data and vector fields without needing to jump between a query console and a chat interface.
- Data Analysts: Generate complex analytical reports and execute ad-hoc SQL queries using natural language, without manually verifying schemas first.
- AI Developers: Test and debug vector similarity searches and semantic matching in production agents without writing repetitive, boilerplate code.
- Database Administrators: Monitor table statistics, compression ratios, and cluster versions across multiple environments to audit system health.
What Changes When You Connect
- Vector Search: Instead of manually writing vector similarity queries, use the `vector_search` tool. You simply provide the target vector, and the tool handles the mathematical distance calculation (like cosineDistance) to find the most relevant records.
- Schema Visibility: You don't need to guess table names. Running `list_databases` and then `list_tables` shows you the entire data structure, letting you build confidence in your query scope.
- Full SQL Control: `execute_sql` lets you run any type of query, whether you're modifying data (DML) or just pulling a report (SELECT). You have full, immediate control over the data set.
- Cluster Auditing: Forget logging into a separate dashboard to check health. Running `get_table_stats` gives you instant metrics (row counts, total size, compression ratios) to verify data quality and system performance.
- AI-Native Workflow: The server manages the complex sequence for you. Your agent knows that to query data, it must first call `list_databases`, then `describe_table`, and then finally `execute_sql`.
- Version Checks: Use `get_version` to check the exact binary limits and capability branches (like HNSW support) of your cluster, ensuring your code runs correctly in the target environment.
Real-World Use Cases
Debugging a New Vector Field
An AI Developer needs to know if a new Array(Float32) vector column exists in the embeddings table. They don't want to guess the schema. They ask their agent to describe_table, which immediately shows the column structure, confirming the vector field is present before they write any search code.
Generating a Quarterly Performance Report
A Data Analyst needs to combine sales data with user demographics. Instead of writing a massive, complex JOIN query that might fail, they ask the agent to list_databases and then use execute_sql with natural language prompts, letting the agent construct the correct, multi-table query.
Checking Data Integrity Before Launch
A DBA suspects a table's data quality dropped. They use get_table_stats to get the current row count and compression ratio. If the stats look off, they can investigate further using describe_table to check for unexpected schema changes.
Finding Related Documents Semantically
A Product Team needs to find documents related to 'quantum computing' without knowing the exact keywords. They feed a query vector into the vector_search tool, which uses the cosineDistance metric to pull the top 5 most semantically similar records instantly.
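As a rough illustration of the last use case, the sketch below builds the kind of ClickHouse query a top-5 similarity search boils down to. It is not the server's implementation; the function name, the table `docs.embeddings`, and the column `embedding` are hypothetical. It also rejects any metric other than the two this page names.

```python
import re

ALLOWED_METRICS = {"cosineDistance", "L2Distance"}
# Plain identifier, optionally qualified as database.table
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def build_vector_query(table, vector_column, query_vector,
                       metric="cosineDistance", limit=5):
    """Return a SELECT that ranks rows by distance to query_vector."""
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"unsupported metric: {metric!r}")
    if not _IDENTIFIER.fullmatch(table) or not _IDENTIFIER.fullmatch(vector_column):
        raise ValueError("table and column must be plain identifiers")
    if not query_vector:
        raise ValueError("query_vector must not be empty")
    # Render the vector as a ClickHouse array literal, e.g. [0.1, 0.2]
    literal = "[" + ", ".join(repr(float(x)) for x in query_vector) + "]"
    return (
        f"SELECT *, {metric}({vector_column}, {literal}) AS distance "
        f"FROM {table} ORDER BY distance ASC LIMIT {int(limit)}"
    )
```

Smaller distances mean closer matches for both metrics, which is why the query sorts in ascending order.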
The Tradeoffs
Running blind SQL
The user just types 'Show me the total revenue from last quarter' and the agent blindly runs a complex SELECT query without knowing which tables or columns to join.
→
Always start with discovery. Use list_databases to define the scope, then list_tables to select the right tables, and finally describe_table to confirm the necessary columns before calling execute_sql.
Ignoring data quality checks
The user executes a massive data migration script using execute_sql and assumes the data is fine, only to find out later that the row count was wrong or the table was corrupted.
→
Before any write operation or critical read, run get_table_stats on the target table. This verifies the current row count and compression ratio, giving you a baseline for comparison.
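A baseline comparison of this kind can be sketched as follows. How `get_table_stats` gathers its numbers is not documented here; the query against ClickHouse's `system.parts` table is one common way to derive them and is shown as an assumption. The dictionary key `row_count` and the helper names are also illustrative.

```python
# One way to derive the stats in ClickHouse (an assumption, not
# necessarily what get_table_stats runs internally).
STATS_SQL = """
SELECT sum(rows) AS row_count,
       sum(data_compressed_bytes) AS compressed_bytes,
       sum(data_uncompressed_bytes) AS uncompressed_bytes
FROM system.parts
WHERE active AND database = {database:String} AND table = {table:String}
"""


def compression_ratio(uncompressed_bytes, compressed_bytes):
    """Uncompressed size divided by compressed size (0.0 for an empty table)."""
    if compressed_bytes == 0:
        return 0.0
    return uncompressed_bytes / compressed_bytes


def row_count_drift(before, after):
    """Difference in row count between two stats snapshots (after - before)."""
    return after["row_count"] - before["row_count"]
```

Capture one snapshot before the write and one after it; a drift that does not match the number of rows you expected to add or remove is the signal to investigate.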
Trying to search vectors without context
The user attempts to run vector_search but forgets to specify the correct distance metric (e.g., cosineDistance), leading to an ambiguous or failed search.
When It Fits, When It Doesn't
Use this server if your core problem is accessing and querying structured data that includes high-dimensional vector embeddings. You need to run complex reports or semantic searches without writing boilerplate SQL or managing multi-step tool calls.
Don't use this if:
- You only need to interact with a simple key-value store (use a dedicated NoSQL client instead).
- Your data access pattern is purely procedural (e.g., triggering a function that performs external API calls). In that case, you'll need a dedicated workflow orchestration tool.
This tool is for data retrieval and analysis. The best practice is always: list_databases → list_tables → describe_table → get_table_stats → execute_sql or vector_search.
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by ClickHouse. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
- Cloud Hosted: Managed infra
- V8 Isolated: Sandboxed per request
- Zero-Trust Proxy: No stored credentials
- DLP Enforced: Policy on every call
- GDPR Compliant: EU data residency
- Token Compression: ~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 7 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Getting the right data context shouldn't feel like an archaeological dig.
Today, getting context for a complex query is a nightmare. You open your data warehouse, jump through dashboards, and spend 20 minutes clicking tabs just to confirm which table has the right column name or if the schema changed. Then, you're forced to write a massive SQL query based on assumptions.
With the ClickHouse (Vector Search) MCP Server, the process changes. Your agent doesn't guess. You ask it to `list_databases`, then `list_tables`, and finally `describe_table`. It gives you the full schema instantly. You get the necessary context, and you can execute the query without the guesswork.
ClickHouse (Vector Search) MCP Server: Query Vector Embeddings & Data
Manual data auditing is slow. You have to run separate queries just to see row counts or check if the cluster version is up to date. You also have to write completely different code paths for standard SQL versus vector similarity.
Now, you run `get_table_stats` to audit the table health, and you run `vector_search` for semantic data. You manage both standard reporting and advanced vector queries using the same conversation flow. It's a unified data interface.
Common Questions About ClickHouse (Vector Search) MCP
How do I find out what tables are available using list_tables?
You must first run list_databases to select the correct top-level schema. Once you have the database name, the agent uses list_tables to retrieve all the tables inside that specific database.
What is the difference between execute_sql and vector_search?
Use execute_sql for standard data operations (like counting rows or joining tables). Use vector_search when you need to find records based on mathematical similarity in high-dimensional vector embeddings.
Can I check if the table schema changed using describe_table?
Yes. describe_table provides a complete snapshot of the column names and data types. By running it before and after a change, you can spot discrepancies in the schema.
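The before-and-after comparison can be sketched as a small diff over two snapshots. This assumes each snapshot has been reduced to a mapping of column name to type string; the actual output shape of describe_table is not documented on this page, so treat the input format as hypothetical.

```python
def diff_schema(before, after):
    """Compare two {column_name: type} snapshots of the same table."""
    added = {col: typ for col, typ in after.items() if col not in before}
    removed = {col: typ for col, typ in before.items() if col not in after}
    changed = {
        col: (before[col], after[col])
        for col in before
        if col in after and before[col] != after[col]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

An empty result in all three buckets means the schema is unchanged between the two snapshots.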
Do I need to run get_table_stats before executing an SQL query?
No, but it's smart. Running get_table_stats first gives you current metrics—like the row count and compression ratio—which helps confirm the table is healthy before you spend compute resources querying it.
How do I use `list_databases` to check which schemas are connected to the cluster?
Running list_databases lists all logical databases on the cluster. This helps you confirm which top-level schemas your AI agent can access before running specific queries.
What is the difference between `get_table_stats` and `execute_sql`?
get_table_stats extracts internal structural data, giving you row counts and compression ratios. execute_sql runs actual commands (DML/DDL/SELECT) to modify or retrieve data.
Can I check the cluster's capability limits using `get_version`?
Yes, get_version identifies precise cluster limits and binary support. You use this to confirm specific features, like HNSW support, are active on your instance.
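If you want to gate a feature on the reported version yourself, a comparison like the one below is usually enough. It assumes a dotted numeric version string such as the one ClickHouse's `version()` function returns. The minimum version you pass in is your own input: this page does not state which release introduced any particular feature, so look that up in the ClickHouse changelog.

```python
def parse_version(version):
    """Turn a dotted numeric version string into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def at_least(version, minimum):
    """True if `version` is >= `minimum`, compared on the leading components."""
    minimum = tuple(minimum)
    return parse_version(version)[: len(minimum)] >= minimum
```

A version string with non-numeric parts raises ValueError rather than guessing, so the caller can fall back to asking the agent directly.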
How do I use `vector_search` if my vector embeddings are not in a dedicated table?
You must first use describe_table to find the correct column schema that holds the vector embeddings. Then, you pass that column name to vector_search.
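Picking out the embedding column from a schema can be sketched like this. It assumes the describe_table result has been reduced to (name, type) pairs and that embeddings are stored as `Array(Float32)` or `Array(Float64)`; both are assumptions about your data rather than guarantees from the server.

```python
import re

# Matches the array-of-float types commonly used for embeddings
_VECTOR_TYPE = re.compile(r"Array\(Float(32|64)\)")


def vector_columns(columns):
    """Return the names of columns whose type looks like a float vector."""
    return [
        name
        for name, type_name in columns
        if _VECTOR_TYPE.fullmatch(type_name.replace(" ", ""))
    ]
```

If the function returns more than one name, ask which embedding the search should use instead of picking one silently.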
Can my agent perform high-speed vector similarity searches?
Yes. Provide the database, table, and the vector embedding array in JSON format. The agent uses ClickHouse's native distance functions (cosine or L2) to return the closest matches, leveraging ClickHouse's industry-leading OLAP performance.
Can I execute arbitrary SQL commands directly through the agent?
Absolutely. The 'execute_sql' tool allows you to push any valid ClickHouse SQL (DML, DDL, or SELECT) to your cluster. This is perfect for managing tables, updating records, or generating custom analytical reports on the fly.
How do I check if my ClickHouse instance supports HNSW indices?
Ask your agent to get the version details. The agent checks your ClickHouse build and identifies exactly which capability branches are active, confirming if advanced vector features like HNSW support are available in your runtime environment.
Use it with your favorite AI tools
Connect this server to Cursor, Claude, VS Code, and more.