ClickHouse (Vector Search) MCP. Query Vector Embeddings and Data via Natural Language

Just plug in your AI agents and start using Vinkius.

ClickHouse (Vector Search) MCP Server lets you manage vector embeddings and run SQL queries directly from your AI client. List databases, inspect schemas, and perform high-speed vector searches using cosineDistance or L2Distance metrics.

It connects your analytical data and vector stores to your workflow without boilerplate code.

What your AI agents can do

List all available databases

The list_databases tool shows all top-level schemas available in the ClickHouse cluster.

List all tables within a database

The list_tables tool retrieves the exact names of tables contained inside a specified database.

Inspect table structure

The describe_table tool extracts detailed column schemas, including vector data types, for a given table.

Execute custom SQL queries

The execute_sql tool runs arbitrary DML, DDL, or SELECT statements against the cluster.

Get table health statistics

The get_table_stats tool retrieves internal structural states, like row counts and compression ratios, for cluster auditing.

Check cluster versions and limits

The get_version tool identifies the specific version and binary limits of the running ClickHouse instance.

Perform semantic vector search

The vector_search tool identifies records based on mathematical distance metrics (cosineDistance or L2Distance) in vector embeddings.

Supported MCP Clients

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

+ other MCP clients

Free for Subscribers

ClickHouse (Vector Search) MCP Server: 7 Tools for Data & Vectors

These tools allow your AI agent to manage the ClickHouse cluster—from listing schemas to running complex SQL and performing semantic vector searches.

describe_table

Extracts the schema details for a table, showing column names and data types.

execute_sql

Runs any SQL query (SELECT, DML, DDL) against the data cluster.

get_table_stats

Pulls internal structural data, like row counts and compression ratios, for a specific table.

get_version

Identifies the precise version and feature set of the underlying ClickHouse instance.

list_databases

Shows a list of all top-level databases available in the ClickHouse cluster.

list_tables

Returns a list of all tables contained within a specified database.

vector_search

Performs a high-dimensional semantic search using specified vector embeddings and distance metrics.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on every call
Real time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with ClickHouse (Vector Search), then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 4,700+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

What you can do with this MCP connector

ClickHouse (Vector Search) MCP Server lets your AI client manage vector embeddings and run SQL queries straight from your workflow. You'll use this server to list databases, inspect schemas, and run high-speed vector searches using cosineDistance or L2Distance metrics, connecting your analytical data and vector stores to your agent without writing boilerplate code.

Your agent uses the list_databases tool to show all top-level schemas available in the ClickHouse cluster, and the list_tables tool returns the exact names of all tables inside a specified database. You can inspect any table's structure with describe_table, which pulls detailed column schemas, including specialized vector data types, for a given table.

To run custom queries, execute_sql lets your agent fire off any DML, DDL, or SELECT statement against the cluster. You can get a read on the table's health using get_table_stats, which pulls internal structural states like row counts and compression ratios for auditing. To check the cluster's setup, get_version identifies the specific version and binary limits of the running ClickHouse instance.

For semantic search, the vector_search tool finds records by their distance to a query vector, using either cosineDistance or L2Distance over the stored embeddings. The agent gathers the schema context, runs the query, and uses the result to compose its final answer.
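To make the distance-based search concrete, here is a minimal sketch of the kind of ClickHouse SQL a similarity query boils down to. The helper name `build_vector_query` and its parameters are hypothetical, and the page does not document how vector_search builds its query internally; only the `cosineDistance` and `L2Distance` function names come from the text above.

```python
from typing import Sequence

# The two distance functions named on this page.
ALLOWED_METRICS = {"cosineDistance", "L2Distance"}


def build_vector_query(database: str, table: str, column: str,
                       query_vector: Sequence[float],
                       metric: str = "cosineDistance", limit: int = 5) -> str:
    """Build a nearest-neighbour SELECT (hypothetical helper, not the tool's API)."""
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"unsupported metric: {metric}")
    # Only accept plain identifiers so names cannot smuggle in extra SQL.
    for ident in (database, table, column):
        if not ident.isidentifier():
            raise ValueError(f"invalid identifier: {ident!r}")
    vector_literal = "[" + ", ".join(repr(float(x)) for x in query_vector) + "]"
    # Smaller distance means more similar, so sort ascending.
    return (f"SELECT *, {metric}({column}, {vector_literal}) AS distance "
            f"FROM {database}.{table} ORDER BY distance ASC LIMIT {int(limit)}")


print(build_vector_query("analytics", "embeddings", "vector", [0.1, 0.2]))
```

A query string like this could also be passed to execute_sql, which is one way to compare the two tools on the same table.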

How ClickHouse (Vector Search) MCP Works

1. First, tell your agent to scope the task by running list_databases to see available schemas.
2. Next, use list_tables and describe_table to narrow down the exact table and check the column structure, especially the vector columns.
3. Finally, the agent calls execute_sql for standard queries or vector_search for semantic matching, using the validated schema.
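The discovery part of these steps can be sketched as a small function. This is an illustration only: `fake_call_tool` is a hypothetical in-memory stand-in for an MCP client, and the argument names (`database`, `table`) and return shapes are assumptions, since the page does not specify the tools' input or output schemas.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical client interface: (tool name, arguments) -> result.
ToolCaller = Callable[[str, Dict[str, Any]], Any]


def fake_call_tool(name: str, args: Dict[str, Any]) -> Any:
    """In-memory fake so the sketch runs without a ClickHouse cluster."""
    catalog = {"analytics": {"embeddings": [("id", "UInt64"),
                                            ("vector", "Array(Float32)")]}}
    if name == "list_databases":
        return sorted(catalog)
    if name == "list_tables":
        return sorted(catalog[args["database"]])
    if name == "describe_table":
        return catalog[args["database"]][args["table"]]
    raise ValueError(f"unknown tool: {name}")


def discover(call_tool: ToolCaller, database: str,
             table: str) -> List[Tuple[str, str]]:
    """Run the discovery calls in order and return the column schema."""
    databases = call_tool("list_databases", {})
    if database not in databases:
        raise LookupError(f"database {database!r} not found")
    tables = call_tool("list_tables", {"database": database})
    if table not in tables:
        raise LookupError(f"table {table!r} not found in {database!r}")
    return call_tool("describe_table", {"database": database, "table": table})


schema = discover(fake_call_tool, "analytics", "embeddings")
print(schema)
```

Failing early on an unknown database or table keeps the later execute_sql or vector_search call from running against a guessed name.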

The bottom line is that your AI agent handles the complex sequence of discovery and execution calls for you, letting you query complex data and vectors using plain language.

Who Is ClickHouse (Vector Search) MCP For?

Data Analysts, AI Developers, and Database Administrators. If you spend time writing boilerplate SQL to test a new data feature or debug a vector similarity search, this saves time. It’s for anyone who needs to query massive, structured data and vector fields without needing to jump between a query console and a chat interface.

Data Analyst

Generates complex analytical reports and executes ad-hoc SQL queries using natural language, eliminating the need to manually verify schemas first.

AI Developer

Tests and debugs vector similarity searches and semantic matching in production agents without writing repetitive, boilerplate code.

Database Administrator

Monitors table statistics, compression ratios, and cluster versions across multiple environments to audit system health.

What Changes When You Connect

Vector Search: Instead of manually writing vector similarity queries, use the vector_search tool. You simply provide the target vector, and the tool handles the mathematical distance calculation (like cosineDistance) to find the most relevant records.
Schema Visibility: You don't need to guess table names. Running list_databases and then list_tables shows you the entire data structure, letting you build confidence in your query scope.
Full SQL Control: execute_sql lets you run any type of query—whether you're modifying data (DML) or just pulling a report (SELECT). You have full, immediate control over the data set.
Cluster Auditing: Forget logging into a separate dashboard to check health. Running get_table_stats gives you instant metrics—row counts, total size, compression ratios—to verify data quality and system performance.
AI-Native Workflow: The agent manages the call sequence for you. To query data, it first calls list_databases, then list_tables and describe_table, and finally execute_sql.
Version Checks: Use get_version to check the exact version and supported capabilities (like HNSW support) of your cluster, so you know your queries will run in the target environment.

Real-World Use Cases

Debugging a New Vector Field

An AI Developer needs to know if a new Array(Float32) vector column exists in the embeddings table. They don't want to guess the schema. They ask their agent to describe_table, which immediately shows the column structure, confirming the vector field is present before they write any search code.
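A check like this one is easy to automate once the schema is in hand. The sketch below assumes describe_table returns (column name, type) pairs, which is an assumption about the output shape; `vector_columns` is a hypothetical helper.

```python
import re
from typing import List, Tuple

# Float arrays are the usual storage type for embeddings, as in the
# Array(Float32) example above. Float64 is included as an assumption.
VECTOR_TYPE = re.compile(r"^Array\((Float32|Float64)\)$")


def vector_columns(schema: List[Tuple[str, str]]) -> List[str]:
    """Return the names of columns whose type looks like an embedding array."""
    return [name for name, type_ in schema if VECTOR_TYPE.match(type_.strip())]


# Sample schema, invented for illustration.
sample_schema = [
    ("id", "UInt64"),
    ("title", "String"),
    ("vector", "Array(Float32)"),
    ("tags", "Array(String)"),
]
print(vector_columns(sample_schema))
```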

Generating a Quarterly Performance Report

A Data Analyst needs to combine sales data with user demographics. Instead of writing a massive, complex JOIN query that might fail, they ask the agent to list_databases and then use execute_sql with natural language prompts, letting the agent construct the correct, multi-table query.

Checking Data Integrity Before Launch

A DBA suspects a table's data quality dropped. They use get_table_stats to get the current row count and compression ratio. If the stats look off, they can investigate further using describe_table to check for unexpected schema changes.

Finding Related Documents Semantically

A Product Team needs to find documents related to 'quantum computing' without knowing the exact keywords. They feed a query vector into the vector_search tool, which uses the cosineDistance metric to pull the top 5 most semantically similar records instantly.
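For readers who want to see what the metric computes, here is a small pure-Python sketch of cosine distance (one minus cosine similarity) and L2 (Euclidean) distance, plus a top-k ranking. The document IDs and vectors are invented for illustration; in practice ClickHouse evaluates these functions server-side.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means the vectors point the same way."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query: Sequence[float], rows: Dict[str, Sequence[float]],
          k: int = 5) -> List[Tuple[str, float]]:
    """Rank rows by ascending cosine distance to the query and keep k."""
    scored = [(doc_id, cosine_distance(query, vec)) for doc_id, vec in rows.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
for doc_id, dist in top_k([1.0, 0.0], docs, k=2):
    print(doc_id, round(dist, 4))
```

Cosine distance ignores vector length and compares direction only, while L2 distance is sensitive to magnitude, which is why the choice of metric should match how the embeddings were produced.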

The Tradeoffs

Running blind SQL

The user just types: 'Show me the total revenue from last quarter.' and the agent blindly runs a complex SELECT query without knowing which tables or columns to join.

→ Always start with discovery. Use list_databases to define the scope, then list_tables to select the right tables, and finally describe_table to confirm the necessary columns before calling execute_sql.

Ignoring data quality checks

The user executes a massive data migration script using execute_sql and assumes the data is fine, only to find out later that the row count was wrong or the table was corrupted.

→ Before any write operation or critical read, run get_table_stats on the target table. This verifies the current row count and compression ratio, giving you a baseline for comparison.
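The baseline comparison could look something like the following sketch. The `TableStats` fields and the tolerance value are assumptions for illustration; the page only says that get_table_stats reports row counts and compression ratios, not the exact field names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TableStats:
    """Hypothetical shape for a get_table_stats result."""
    rows: int
    compressed_bytes: int
    uncompressed_bytes: int

    @property
    def compression_ratio(self) -> float:
        if self.compressed_bytes == 0:
            return 0.0
        return self.uncompressed_bytes / self.compressed_bytes


def compare_stats(baseline: TableStats, current: TableStats,
                  expected_new_rows: int,
                  ratio_tolerance: float = 0.5) -> List[str]:
    """Return human-readable findings; an empty list means nothing looks off."""
    findings = []
    delta = current.rows - baseline.rows
    if delta != expected_new_rows:
        findings.append(f"expected {expected_new_rows} new rows, found {delta}")
    drift = abs(current.compression_ratio - baseline.compression_ratio)
    if drift > ratio_tolerance:
        findings.append(f"compression ratio drifted by {drift:.2f}")
    return findings


before = TableStats(rows=1000, compressed_bytes=100, uncompressed_bytes=400)
after = TableStats(rows=1500, compressed_bytes=150, uncompressed_bytes=600)
print(compare_stats(before, after, expected_new_rows=500))
```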

Trying to search vectors without context

The user attempts to run vector_search but forgets to specify the correct distance metric (e.g., cosineDistance), leading to an ambiguous or failed search.

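→ Confirm the vector column with describe_table first, then call vector_search with an explicit distance metric (cosineDistance or L2Distance).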

When It Fits, When It Doesn't

Use this server if your core problem is accessing and querying structured data that includes high-dimensional vector embeddings. You need to run complex reports or semantic searches without writing boilerplate SQL or managing multi-step tool calls.

Don't use this if:
* You only need to interact with a simple key-value store (use a dedicated NoSQL client instead).
* Your data access pattern is purely procedural (e.g., triggering a function that performs external API calls). In that case, you'll need a dedicated workflow orchestration tool.

This tool is for data retrieval and analysis. The best practice is always: list_databases → list_tables → describe_table → get_table_stats → execute_sql or vector_search.

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by ClickHouse. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This server provides 7 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.

Available Capabilities

describe_table, execute_sql, get_table_stats, get_version, list_databases, list_tables, vector_search

Getting the right data context shouldn't feel like an archaeological dig.

Today, getting context for a complex query is a nightmare. You open your data warehouse, jump through dashboards, and spend 20 minutes clicking tabs just to confirm which table has the right column name or if the schema changed. Then, you're forced to write a massive SQL query based on assumptions.

With the ClickHouse (Vector Search) MCP Server, the process changes. Your agent doesn't guess. You ask it to `list_databases`, then `list_tables`, and finally `describe_table`. It gives you the full schema instantly. You get the necessary context, and you can execute the query without the guesswork.

ClickHouse (Vector Search) MCP Server: Query Vector Embeddings & Data

Manual data auditing is slow. You have to run separate queries just to see row counts or check if the cluster version is up to date. You also have to write completely different code paths for standard SQL versus vector similarity.

Now, you run `get_table_stats` to audit the table health, and you run `vector_search` for semantic data. You manage both standard reporting and advanced vector queries using the same conversation flow. It's a unified data interface.

Common Questions About ClickHouse (Vector Search) MCP

How do I find out what tables are available using list_tables?

You must first run list_databases to select the correct top-level schema. Once you have the database name, the agent uses list_tables to retrieve all the tables inside that specific database.

What is the difference between execute_sql and vector_search?

Use execute_sql for standard data operations (like counting rows or joining tables). Use vector_search when you need to find records based on mathematical similarity in high-dimensional vector embeddings.

Can I check if the table schema changed using describe_table?

Yes. describe_table provides a complete snapshot of the column names and data types. By running it before and after a change, you can spot discrepancies in the schema.
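Comparing two snapshots is simple to script. The sketch below assumes each describe_table result has been reduced to a mapping of column name to type, which is an assumption about the output shape; `diff_schema` is a hypothetical helper.

```python
from typing import Dict, List


def diff_schema(before: Dict[str, str], after: Dict[str, str]) -> Dict[str, List[str]]:
    """Report columns that were added, removed, or changed type."""
    shared = set(before) & set(after)
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "retyped": sorted(c for c in shared if before[c] != after[c]),
    }


# Snapshots invented for illustration.
snapshot_before = {"id": "UInt64", "title": "String", "vector": "Array(Float32)"}
snapshot_after = {"id": "UInt64", "vector": "Array(Float64)", "lang": "String"}
print(diff_schema(snapshot_before, snapshot_after))
```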

Do I need to run get_table_stats before executing an SQL query?

No, but it's smart. Running get_table_stats first gives you current metrics—like the row count and compression ratio—which helps confirm the table is healthy before you spend compute resources querying it.

How do I use `list_databases` to check which schemas are connected to the cluster?

It lists every database available on the cluster. This helps you confirm which top-level schemas your AI agent can access before running specific queries.

What is the difference between `get_table_stats` and `execute_sql`?

get_table_stats extracts internal structural data, giving you row counts and compression ratios. execute_sql runs actual commands (DML/DDL/SELECT) to modify or retrieve data.
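To illustrate the distinction, similar figures can be pulled by hand with a query against ClickHouse's `system.parts` table, which applies to MergeTree-family tables. Whether get_table_stats reads the same system table is not stated on this page, so treat the following as a rough equivalent rather than the tool's implementation; `table_stats_sql` is a hypothetical helper.

```python
def table_stats_sql(database: str, table: str) -> str:
    """Build a row-count and compression-ratio query over system.parts."""
    # Only accept plain identifiers so names cannot smuggle in extra SQL.
    for ident in (database, table):
        if not ident.isidentifier():
            raise ValueError(f"invalid identifier: {ident!r}")
    return (
        "SELECT sum(rows) AS row_count, "
        "sum(data_uncompressed_bytes) / sum(data_compressed_bytes) "
        "AS compression_ratio "
        "FROM system.parts "
        f"WHERE active AND database = '{database}' AND table = '{table}'"
    )


print(table_stats_sql("analytics", "embeddings"))
```

The `active` filter restricts the sums to data parts currently in use, so parts left over from merges are not double counted.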

Can I check the cluster's capability limits using `get_version`?

Yes, get_version identifies precise cluster limits and binary support. You use this to confirm specific features, like HNSW support, are active on your instance.

How do I use `vector_search` if my vector embeddings are not in a dedicated table?

You must first use describe_table to find the correct column schema that holds the vector embeddings. Then, you pass that column name to vector_search.

Can my agent perform high-speed vector similarity searches?

Yes. Provide the database, table, and the vector embedding array in JSON format. The agent uses ClickHouse's native distance functions (cosineDistance or L2Distance) to return the closest matches.

Can I execute arbitrary SQL commands directly through the agent?

Absolutely. The 'execute_sql' tool allows you to push any valid ClickHouse SQL (DML, DDL, or SELECT) to your cluster. This is perfect for managing tables, updating records, or generating custom analytical reports on the fly.

How do I check if my ClickHouse instance supports HNSW indices?

Ask your agent to get the version details. The agent checks your ClickHouse build and identifies exactly which capability branches are active, confirming if advanced vector features like HNSW support are available in your runtime environment.
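If you want to gate a workflow on the reported version yourself, a dotted version string can be compared against a minimum. This is a sketch under assumptions: the version strings below are placeholders, not the actual release that introduced HNSW support (check the ClickHouse release notes for that), and the format of the get_version output is assumed to be a dotted numeric string.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric parts of a dotted version string."""
    parts = []
    for piece in text.strip().split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(reported: str, minimum: str) -> bool:
    """True when the reported version is at least the minimum."""
    return parse_version(reported) >= parse_version(minimum)


# Placeholder values for illustration only.
print(meets_minimum("24.8.4.13", "24.8"))
```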

Use it with your favorite AI tools

Connect this server to Cursor, Claude, VS Code, and more.

OpenAI Agents SDK (Python)

Google ADK (Python)

Pydantic AI (Python)

Vercel AI SDK (TypeScript)