Databricks MCP. Audit your entire lakehouse from your AI agent.
Works with every AI agent you already use, and with any MCP-compatible client
Just plug in your AI agents and start using Vinkius.
Databricks MCP Server. Monitor and manage your entire lakehouse environment from any AI client. Get cluster details, track job runs, list Unity Catalog schemas, and audit SQL warehouses using natural conversation.
Control your data platform's operational state without leaving your agent.
What your AI agents can do
List all available compute clusters and retrieve detailed metrics for specific clusters.
List configured jobs and retrieve historical run data to check the status of data pipelines.
List all Unity Catalog catalogs, schemas, and SQL warehouses to understand data location.
Fetch profile details for the connected user or service principal to verify active permissions.
Retrieve chronological logs of job runs to pinpoint failures in complex data workflows.
get_cluster
Retrieves specific operational details for a single compute cluster.
get_me
Fetches the profile details and permissions for the currently authenticated user or service principal.
list_catalogs
Lists all root catalogs available in Unity Catalog.
list_clusters
Retrieves a list of all compute clusters across the workspace.
list_job_runs
Lists the most recent job executions and their statuses from Databricks.
list_jobs
Retrieves a list of all configured data workflows and jobs.
list_schemas
Lists all schemas (databases) within a specified Unity Catalog catalog.
list_warehouses
Lists the active SQL Serverless warehouses configured in your workspace.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built-in DLP, auth, and compliance on every call
- Real-time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with Databricks, then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 4,700+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
What you can do with this MCP connector
Databricks MCP Server - Manage Clusters and Job Runs
You've got a whole data lakehouse setup, right? This server lets your AI client monitor it through its tools rather than through the Databricks UI. You don't have to leave your agent just to check on your data platform's operational state.
Audit Compute Clusters
You can use list_clusters to pull a list of every compute cluster in the workspace. You'll then use get_cluster to pull specific operational details for any single cluster.
Monitor Data Pipelines
Need to check on your data flows? You've got list_jobs to see every configured data workflow and job. For tracking actual runs, you can use list_job_runs to get the most recent job executions and their statuses.
Map Data Structures
You can use list_catalogs to list all root catalogs available in Unity Catalog. You can then use list_schemas to list all schemas (those are your databases) within a specific Unity Catalog catalog. You can also run list_warehouses to list the active SQL Serverless warehouses set up in your workspace.
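If you want to script the catalog-to-schema walk instead of asking for it one step at a time, it could look roughly like the sketch below. `call_tool` is a hypothetical stand-in for however your MCP client invokes a tool, and both the `catalog_name` argument and the result shape (lists of dicts with a `name` key) are assumptions, not something this listing documents.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: (tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], List[Dict[str, Any]]]


def map_catalogs_to_schemas(call_tool: ToolCaller) -> Dict[str, List[str]]:
    """Return {catalog_name: [schema_name, ...]} via list_catalogs + list_schemas."""
    layout: Dict[str, List[str]] = {}
    for catalog in call_tool("list_catalogs", {}):
        name = catalog["name"]
        # Assumed argument name; check the tool's input schema in your client.
        schemas = call_tool("list_schemas", {"catalog_name": name})
        layout[name] = sorted(s["name"] for s in schemas)
    return layout


def fake_call_tool(tool: str, args: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Canned responses so the sketch runs without a Databricks workspace."""
    if tool == "list_catalogs":
        return [{"name": "main"}, {"name": "sandbox"}]
    if tool == "list_schemas":
        canned = {"main": ["sales", "default"], "sandbox": ["scratch"]}
        return [{"name": n} for n in canned[args["catalog_name"]]]
    raise ValueError(f"unknown tool: {tool}")


if __name__ == "__main__":
    print(map_catalogs_to_schemas(fake_call_tool))
```

Swap `fake_call_tool` for your client's real tool invocation once you have confirmed the argument names it expects.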
Check User Permissions
You just use get_me to fetch the profile details and permissions for the user or service principal connected to the system.
View Job History
When a job fails, you can use list_job_runs to get a chronological history of job runs and their statuses, so you can pinpoint which run failed in your complex data workflows.
How Databricks MCP Works
- 1 Subscribe to the Databricks MCP Server and provide your Host URL and Personal Access Token (PAT).
- 2 Your AI client sends a request (e.g., 'List all schemas in the main catalog') to the server.
- 3 The server executes the specific tool, retrieves the data from Databricks, and returns the structured result to your AI client.
The bottom line is that your AI client runs complex data platform queries using structured tools, and you get the data back without opening the Databricks UI.
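Under the hood, MCP clients talk to servers with JSON-RPC 2.0 messages, and a tool invocation uses the `tools/call` method. Here is a minimal sketch of what the request in step 2 might look like on the wire. The `catalog_name` argument is an assumption; the real name comes from the tool's input schema.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# "List all schemas in the main catalog" could translate to a call like this.
# `catalog_name` is an assumed argument name, not confirmed by this listing.
request = build_tool_call(1, "list_schemas", {"catalog_name": "main"})

if __name__ == "__main__":
    print(json.dumps(request, indent=2))
```

Your AI client builds and sends this for you; the sketch is only meant to show what "structured tools" means in practice.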
Who Is Databricks MCP For?
Data Engineers who get tired of context-switching between IDEs and the Databricks UI. They run list_job_runs and get_cluster to check job health and cluster capacity without leaving their development environment.
Analytics Engineers who need to validate data location or schema availability on the fly. They use list_catalogs and list_schemas to explore where structured data lives and verify SQL warehouse availability.
Data Platform Teams needing a single place to audit workspace resources and service principal identities. They call get_me to audit service principal identities and use list_clusters to monitor overall workspace resource allocation.
MLOps Engineers tracking model training jobs. They use job listing tools (list_jobs) and cluster checkers (get_cluster) to track model training job status and verify compute configurations.
What Changes When You Connect
- Check cluster health instantly. Instead of navigating the cluster view, your AI client runs `get_cluster` to get detailed metrics on a specific node.
- Verify data location immediately. Use `list_catalogs` and `list_schemas` to map out the entire data schema structure without opening the Unity Catalog UI.
- Audit job failures. Run `list_job_runs` to see a history of job executions, instantly identifying which run failed and why.
- Manage resources without leaving your flow. You can use `list_clusters` to see all available compute nodes, then use `get_cluster` to check a specific one's status.
- Know who has access. Use `get_me` to confirm the exact permissions of the service principal running the job, which is critical for compliance checks.
Real-World Use Cases
Diagnosing a failed data pipeline.
The job 'Daily-Sales-ETL' fails. Instead of clicking into the job history, the user asks their agent to run list_job_runs. The agent retrieves the latest run ID and status, showing that Run 985 failed due to a cluster timeout. The user then uses get_cluster to check the cluster limits and diagnose the root cause.
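The triage step in this scenario can be sketched as a small filter over the runs the agent gets back. The field names (`state.result_state`, `start_time`, `cluster_instance.cluster_id`) follow the Databricks Jobs API run object; whether this server returns runs in exactly that shape is an assumption, and the sample data below is made up to mirror the story above.

```python
from typing import Any, Dict, List, Optional

# Result states treated as failures; the names follow the Databricks Jobs API.
FAILED_STATES = {"FAILED", "TIMEDOUT", "CANCELED"}


def latest_failed_run(runs: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the most recently started run that ended in a failure state."""
    failed = [r for r in runs if r.get("state", {}).get("result_state") in FAILED_STATES]
    return max(failed, key=lambda r: r["start_time"], default=None)


def cluster_id_for(run: Dict[str, Any]) -> Optional[str]:
    """Extract the cluster the run executed on, if the run reports one."""
    return run.get("cluster_instance", {}).get("cluster_id")


# Illustrative sample only; not real output from any workspace.
sample_runs = [
    {"run_id": 984, "start_time": 1_700_000_000_000, "state": {"result_state": "SUCCESS"}},
    {
        "run_id": 985,
        "start_time": 1_700_086_400_000,
        "state": {"result_state": "FAILED", "state_message": "cluster timeout"},
        "cluster_instance": {"cluster_id": "hypothetical-cluster-id"},
    },
]

if __name__ == "__main__":
    run = latest_failed_run(sample_runs)
    if run is not None:
        print(run["run_id"], run["state"].get("state_message"), cluster_id_for(run))
```

The cluster ID pulled from the failed run is what you would hand to get_cluster for the follow-up check.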
Finding a specific dataset's location.
An analyst needs to find the raw customer data schema. They ask their agent to run list_catalogs. The agent returns the list, and the user follows up by asking the agent to run list_schemas against the 'main' catalog, immediately pinpointing the exact database structure.
Auditing resource usage.
The data platform team needs to confirm all active SQL endpoints. They ask the agent to run list_warehouses. The agent provides a list of all configured SQL Serverless warehouses and their active operational boundaries, allowing the team to confirm resource allocation.
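Once the warehouse list comes back, grouping it by state makes the audit quicker to read. A minimal sketch, assuming each warehouse entry carries `name` and `state` fields as in the Databricks SQL Warehouses API; the warehouse names are invented for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_warehouses_by_state(warehouses: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Return {state: [warehouse names]} with names sorted alphabetically."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for wh in warehouses:
        # Entries without a state are bucketed under UNKNOWN rather than dropped.
        grouped[wh.get("state", "UNKNOWN")].append(wh["name"])
    return {state: sorted(names) for state, names in grouped.items()}


# Illustrative sample only; not real output from any workspace.
sample_warehouses = [
    {"name": "bi-serverless", "state": "RUNNING"},
    {"name": "adhoc", "state": "STOPPED"},
    {"name": "etl-serverless", "state": "RUNNING"},
]

if __name__ == "__main__":
    for state, names in group_warehouses_by_state(sample_warehouses).items():
        print(state, names)
```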
Checking workspace permissions.
Before running a critical job, the MLOps engineer asks the agent to run get_me. The agent retrieves the service principal profile, confirming the necessary read/write permissions are active for the required catalog.
The Tradeoffs
Manual UI clicking
The user manually navigates the Databricks UI, clicking through the Catalog, then the Schema, then the Jobs tab, and finally copies the run ID to check the logs.
→
Ask your agent to run list_catalogs first. Then, ask it to run list_schemas for the target catalog. Finally, ask it to run list_job_runs to get the history, keeping you in the chat window.
Guessing data location
An engineer suspects the data is in 'dev' but doesn't know which catalog or schema. They waste time checking multiple locations manually.
→
Ask the agent to run list_catalogs to see all root catalogs. Then, use list_schemas to narrow down the data structures inside the most likely catalog.
Over-relying on PAT scope
A user assumes their basic PAT can see everything, leading to failed runs when the service principal lacks necessary permissions.
→
Always run get_me first. This verifies the exact profile and active permissions of the service principal before you attempt any complex read or write operations.
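A pre-flight check on the get_me result might look like the sketch below. It assumes the profile resembles a SCIM-style user object with a `groups` list whose entries have a `display` name; that shape and the group names are assumptions, so adjust them to whatever the tool actually returns.

```python
from typing import Any, Dict, Iterable, Set


def missing_groups(profile: Dict[str, Any], required: Iterable[str]) -> Set[str]:
    """Return the required group names that the profile does not belong to."""
    have = {g.get("display") for g in profile.get("groups", [])}
    return set(required) - have


# Illustrative sample only; a real profile will carry more fields.
sample_profile = {
    "userName": "etl-service-principal",
    "groups": [{"display": "data-engineers"}, {"display": "users"}],
}

if __name__ == "__main__":
    gaps = missing_groups(sample_profile, ["data-engineers", "ml-admins"])
    print("missing:", sorted(gaps) if gaps else "none")
```

Group membership is only a proxy here: it tells you who the identity is, and you would still confirm object-level grants separately if the job depends on them.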
When It Fits, When It Doesn't
Use this if you need to audit or monitor the state of your data platform—meaning you need to know what clusters are running, what schemas exist, or if a job failed. This is for observability and compliance. Don't use this if you need to write data, modify a schema, or trigger a job manually; those actions require dedicated execution tools. If your goal is just to read metadata, this server handles it. If you're trying to build a data pipeline from scratch, you'll still need a separate workflow builder, but you can check the dependencies and history here.
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Databricks. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
- Cloud Hosted: Managed infra
- V8 Isolated: Sandboxed per request
- Zero-Trust Proxy: No stored credentials
- DLP Enforced: Policy on every call
- GDPR Compliant: EU data residency
- Token Compression: ~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 8 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Checking the health of your data stack shouldn't require 15 clicks.
Right now, checking if a data pipeline is healthy means jumping between the Job run tab, the Cluster view, and the Catalog browser. You're clicking through tabs, searching for run IDs, and cross-referencing which cluster was used for which job. It takes minutes and requires switching context.
With this MCP server, you simply ask your agent, 'What happened with the sales job?' The agent runs `list_job_runs` and gives you the status, including the failure reason. You get the answer and the root cause in a single chat response.
Databricks MCP Server: Know exactly what data lives where.
Before, finding a specific dataset required remembering if it was in the 'main' catalog or a separate 'sandbox' catalog, and then knowing which schema held it. You had to browse the catalog list, then the schema list, and then check the SQL warehouse list separately.
Now, you can ask your agent to list all schemas, and it calls `list_catalogs` and `list_schemas` for you. You get a complete, structured map of your data assets instantly.
Common Questions About Databricks MCP
How do I check the status of all compute clusters using the get_cluster tool?
Use list_clusters first to get every cluster in the workspace; get_cluster only covers one cluster at a time. Then, if you need deeper detail on one, run get_cluster for that specific cluster. The Databricks Clusters API identifies clusters by ID rather than by name, so take the identifier from the list_clusters output. This two-step process ensures you are targeting the correct cluster.
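The two-step lookup can be sketched as a small helper that resolves a human-friendly name to an ID before calling get_cluster. The `cluster_id` and `cluster_name` fields follow the Databricks Clusters API; that this server returns them unchanged is an assumption, and the IDs below are placeholders.

```python
from typing import Any, Dict, List


def resolve_cluster_id(clusters: List[Dict[str, Any]], cluster_name: str) -> str:
    """Return the ID of the single cluster with the given name.

    Cluster names are not guaranteed to be unique, so anything other than
    exactly one match raises instead of guessing.
    """
    matches = [c["cluster_id"] for c in clusters if c.get("cluster_name") == cluster_name]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one cluster named {cluster_name!r}, found {len(matches)}"
        )
    return matches[0]


# Illustrative sample only; IDs are placeholders.
sample_clusters = [
    {"cluster_id": "hypothetical-id-1", "cluster_name": "etl-nightly"},
    {"cluster_id": "hypothetical-id-2", "cluster_name": "adhoc"},
    {"cluster_id": "hypothetical-id-3", "cluster_name": "adhoc"},
]

if __name__ == "__main__":
    print(resolve_cluster_id(sample_clusters, "etl-nightly"))
```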
Can I list all the schemas in a specific catalog using the list_schemas tool?
Yes, list_schemas handles this. You specify the catalog name and the tool pulls all databases (schemas) residing within it, giving you a complete view of the data structure.
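This listing doesn't document what the server does internally, but the public Databricks REST API exposes a Unity Catalog endpoint for listing schemas, and a call to it would presumably look something like the sketch below. The host and token are placeholders, and the request is only built here, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_list_schemas_request(host: str, token: str, catalog: str) -> Request:
    """Build (but do not send) a GET request for the schemas in one catalog."""
    query = urlencode({"catalog_name": catalog})
    url = f"{host.rstrip('/')}/api/2.1/unity-catalog/schemas?{query}"
    # A Personal Access Token is passed as a bearer token.
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


# Placeholder host and token for illustration only.
req = build_list_schemas_request(
    "https://example.cloud.databricks.com/", "dapi-EXAMPLE", "main"
)

if __name__ == "__main__":
    print(req.get_method(), req.full_url)
```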
What is the best way to check job history using list_job_runs?
Run list_job_runs with the job's 'job_id', then ask your agent to narrow the results to the date range you care about. The tool returns run IDs and statuses in chronological order, allowing you to pinpoint the exact run that failed.
Does the Databricks MCP Server help me manage my user permissions?
It helps you check them, not change them. Use the get_me tool: it fetches the profile information for the authenticated user or service principal, confirming the exact permissions active on the workspace.
How do I check which SQL warehouses are active using the list_warehouses tool?
The list_warehouses tool enumerates all configured SQL Serverless warehouses. You can use this to track active operational boundaries, confirming if your required endpoints are available for querying.
What information can I retrieve about my user permissions using the get_me tool?
The get_me tool fetches profile information for the authenticated user or service principal. This lets you verify exactly what permissions are active on the workspace, which is crucial for auditing.
Can I find all configured data workflows using the list_jobs tool?
The list_jobs tool provides a complete list of all configured jobs in your workspace. This lets you see every workflow defined and manage which data pipelines need monitoring.
How do I see the structure of my data catalogs using the list_catalogs tool?
The list_catalogs tool lists all root catalogs within Unity Catalog. From there, you can drill down to identify exactly where your structured data resides, helping you locate the right schemas.
Can my agent check the status of a specific Databricks job run?
Yes. Provide the 'job_id' to the 'list_job_runs' tool. The agent will retrieve the chronological history of executions, allowing you to identify successful completions or precise points of failure in your workflows.
How do I explore schemas within a specific Unity Catalog catalog via chat?
Use the 'list_schemas' tool and provide the catalog name. Your agent will pull the schemas (databases) registered inside that catalog, giving you immediate visibility into your data hierarchy.
Can I monitor the health of my Databricks clusters through the agent?
Absolutely. The 'list_clusters' and 'get_cluster' tools allow your agent to retrieve detailed node information and operational statuses, helping you audit cluster health and capacity across your workspace.
Use it with your favorite AI tools
Connect this server to Cursor, Claude, VS Code, and more.