# Starburst MCP

> Starburst MCP connects your AI client directly to enterprise federated data lakes. It lets you run complex SQL queries against diverse sources like Snowflake and S3, check schemas, and review access roles, all through natural conversation. You query massive, distributed datasets without leaving your chat window or juggling multiple database connection tools.

## Overview
- **Category:** industry-titans
- **Price:** Free
- **Tags:** data-lake, federated-query, trino, data-engineering, sql-analytics, data-governance

## Description

This MCP brings the power of enterprise data analytics into your conversational AI workflow. Instead of opening ten different dashboard tabs or writing boilerplate SQL connection scripts for every source, you talk to your agent and ask questions about your combined data lakes. Your client uses this MCP to figure out which sources are connected (like S3 or Snowflake) and lets you query them as if they were one giant database. You can run complex queries against the entire system, check what schemas exist across different departments, and even verify who has access to sensitive information. It’s about making data governance and advanced querying feel natural. When working with other enterprise tools, Vinkius makes sure this MCP is available alongside thousands of others, so your AI client never gets stuck needing a new connection.

## Tools

### get_query_details
Retrieves specific information about a particular SQL query you ran.

### list_catalogs
Lists all the main data catalogs available across your entire Starburst network.

### list_data_products
Lists every published, pre-packaged analytical dataset ready for consumption.

### list_domains
Shows the various domains that organize your data products.

### list_queries
Retrieves a history of recent SQL queries executed in the system cluster.

### list_roles
Displays all security roles defined across your organization's data environment.
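
Under the Model Context Protocol, a client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request that names the tool and passes its arguments. The sketch below builds such a request for `list_catalogs`. The helper name `build_tool_call` is hypothetical, and since this page does not document any tool's argument schema, the arguments object is left empty as an assumption.

```python
import json
from itertools import count

# Simple incrementing request ids for the JSON-RPC envelope.
_ids = count(1)

def build_tool_call(name, arguments=None):
    """Build an MCP `tools/call` JSON-RPC 2.0 request for a named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }

# Assumption: list_catalogs takes no arguments.
request = build_tool_call("list_catalogs")
print(json.dumps(request, indent=2))
```

In practice the AI client builds and sends this request for you; the sketch only shows what travels over the wire when the agent decides to call a tool.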

## Prompt Examples

**Prompt:** 
```
List all active catalogs in the current data lake instance, and fetch the schemas of any catalog with 'finance' in its name.
```

**Response:** 
```
Connected to the Starburst host and retrieved the catalog list.

**Catalogs (`list_catalogs`)**: `hive`, `postgres_ops`, `finance_db`.

`finance_db` matches 'finance', so I listed its schemas.
**Schemas (`list_schemas`)**:
- `public`
- `payroll`
- `audit_logs`

Would you like me to list the tables in `audit_logs`?
```

**Prompt:** 
```
Execute a query to retrieve the top 10 rows from the 'customer_analytics' table located in our 'production_hive' catalog.
```

**Response:** 
```
Executing federated SQL query...

**Query Results (`execute_query`)**:
The top 10 rows from `production_hive.customer_analytics` have been retrieved. The dataset includes fields such as `customer_id`, `purchase_history`, and `engagement_score`. Would you like me to aggregate this data?
```

**Prompt:** 
```
List all registered data products across the Starburst network and check current role assignments to ensure proper access.
```

**Response:** 
```
Listing data products and roles in the Starburst environment...

**Data Products (`list_data_products`)**:
- `Marketing Insights Dashboard`
- `Global Sales Summaries`

**Role Governance (`list_roles`)**:
Found active assignments for `analyst_tier_1` and `data_governance_admin`. No unexpected role assignments were found.
```

## Capabilities

### Run complex SQL queries
You execute advanced SQL commands against massive data sources and receive structured results directly.

### Map connected databases
The system lists all the major data catalogs attached to your network, showing you where your data lives.

### Explore table structures
You drill down into specific databases to see exactly what schemas and tables are available for querying.

### Identify published datasets
The MCP lists all the pre-approved, structured data products ready for analysis across your enterprise.

### Check user permissions
You verify who has access to what by listing security roles and checking current assignments.

## Use Cases

### Finding the source of truth for sales metrics
A Data Analyst needs to compare sales figures from the production system (Snowflake) against archived records (S3). Instead of writing a script that manages a separate connection for each source, they ask their agent. The agent calls `list_catalogs` to find both sources, then runs `execute_query` across them and returns one unified result set.
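
As a rough illustration of the kind of statement the agent might pass to `execute_query`, the sketch below builds a cross-catalog join that compares a metric between a live table and an archived one. Every name in it is hypothetical (the catalogs `snowflake_prod` and `s3_archive`, the `sales.orders` tables, and the `order_id` and `amount` columns); substitute whatever `list_catalogs` reports in your environment.

```python
import re

# Plain identifiers only: letters, digits, underscores.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def check_name(name, parts):
    """Validate a dot-separated identifier with the expected number of parts."""
    pieces = name.split(".")
    if len(pieces) != parts or not all(_IDENT.fullmatch(p) for p in pieces):
        raise ValueError(f"unexpected identifier: {name!r}")
    return name

def build_comparison_query(live_table, archive_table, key, metric):
    """Return SQL listing rows whose metric differs between two catalogs."""
    live = check_name(live_table, 3)        # catalog.schema.table
    archive = check_name(archive_table, 3)  # catalog.schema.table
    key = check_name(key, 1)
    metric = check_name(metric, 1)
    return (
        f"SELECT l.{key}, l.{metric} AS live_value, a.{metric} AS archived_value\n"
        f"FROM {live} AS l\n"
        f"JOIN {archive} AS a ON l.{key} = a.{key}\n"
        f"WHERE l.{metric} <> a.{metric}"
    )

# Hypothetical catalog, schema, table, and column names.
sql = build_comparison_query(
    "snowflake_prod.sales.orders",
    "s3_archive.sales.orders",
    "order_id",
    "amount",
)
print(sql)
```

The identifier check is there because the names are interpolated directly into the SQL text, so anything other than plain identifiers is rejected before the query is built.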

### Auditing data access for compliance
A Governance Manager needs to prove that only the Finance team can view salary data. They prompt the agent to run `list_roles`, verifying that the 'analyst' role lacks permission, and then cross-reference this with active assignments.

### Quickly diagnosing a broken report
A Data Engineer notices a dashboard is failing. They ask their agent to run `list_queries` to check recent failures, or use `get_query_details` to see exactly what parameters caused the failure in the last run.
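
The triage step can be pictured as a filter over the query history. The sketch below assumes `list_queries` returns records with a `query_id` and a `state` field; that shape is hypothetical, since the tool's actual output format is not documented here. `FAILED` is one of the query states Trino reports.

```python
def failed_queries(queries):
    """Keep only the history entries whose state is FAILED."""
    return [q for q in queries if q.get("state") == "FAILED"]

# Hypothetical history records; real field names may differ.
sample = [
    {"query_id": "q1", "state": "FINISHED"},
    {"query_id": "q2", "state": "FAILED"},
    {"query_id": "q3", "state": "RUNNING"},
]

for q in failed_queries(sample):
    # Each id printed here is a candidate for a get_query_details lookup.
    print(q["query_id"])
```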

### Preparing for a new feature launch
The team needs a dataset combining marketing and sales data. They first use `list_data_products` to identify the existing components, then ask the agent to write a query that links them together and run it with `execute_query`.

## Benefits

- You get immediate visibility into your entire data landscape. Instead of manually checking multiple systems, running `list_catalogs` shows all connected sources in one go.
- Complex reporting becomes a conversation. You write a prompt like 'top 10 customer records', and the agent runs it with `execute_query` and returns structured results.
- Data governance is simplified. Need to know who can see payroll data? Use `list_roles` to review security assignments without logging into an admin portal.
- Never get lost in your schema again. You can use `list_schemas` to drill down and map out exactly what tables exist inside a specific database structure.
- Discover approved datasets easily. Instead of guessing which dataset is correct, run `list_data_products` to see every published data product ready for analysis.

## How It Works

This MCP turns complex, multi-step database interactions into simple conversation prompts.

1. First, you install the Starburst MCP connector, linking it securely to your active AI client.
2. Next, in the MCP settings, you provide your `STARBURST_HOST` and `STARBURST_TOKEN` to establish a persistent connection session.
3. Finally, you just ask your agent: 'Show me the top 10 rows from customer analytics.' The MCP handles the rest.
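
Before entering the two settings from step 2, it can help to check that neither value is empty. The sketch below is one way to do that, assuming the values are available as environment variables. The helper name `load_starburst_settings` is hypothetical, and treating a bare hostname as HTTPS is an assumption rather than documented behaviour.

```python
import os

def load_starburst_settings(env=None):
    """Read and validate the two settings named in step 2."""
    env = os.environ if env is None else env
    raw = {
        "STARBURST_HOST": env.get("STARBURST_HOST", "").strip(),
        "STARBURST_TOKEN": env.get("STARBURST_TOKEN", "").strip(),
    }
    missing = [name for name, value in raw.items() if not value]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    host = raw["STARBURST_HOST"]
    # Assumption: a bare hostname is treated as an HTTPS endpoint.
    if "://" not in host:
        host = "https://" + host
    return {"host": host.rstrip("/"), "token": raw["STARBURST_TOKEN"]}

# Placeholder values for illustration only.
settings = load_starburst_settings(
    {"STARBURST_HOST": "starburst.example.com", "STARBURST_TOKEN": "example-token"}
)
print(settings["host"])  # prints https://starburst.example.com
```

The token is deliberately never printed or logged; only the host is echoed back for confirmation.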

## Frequently Asked Questions

**How does Starburst MCP handle multiple database types?**
The MCP is designed to query federated data lakes. It connects to diverse sources like Snowflake and S3, allowing you to run a single query against all of them.

**Can I see which roles exist using Starburst MCP?**
Yes, running `list_roles` allows the agent to display every security role defined in your organization's data environment for auditing purposes.

**What is the difference between list_catalogs and list_schemas?**
`list_catalogs` shows the highest level of grouping: each catalog corresponds to one connected data source. `list_schemas` lets you drill down to see the specific groups of tables within one catalog.
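
Starburst is built on Trino, where every table is addressed by a three-part name, `catalog.schema.table`, and the same hierarchy is visible in SQL through `SHOW CATALOGS` and `SHOW SCHEMAS FROM <catalog>`. The sketch below assembles a fully qualified name with double-quoted identifiers. The catalog and schema come from the sample response earlier on this page, while the `employees` table is a hypothetical placeholder.

```python
def qualified_name(catalog, schema, table):
    """Return a double-quoted catalog.schema.table identifier."""
    def quote(part):
        # Inside a quoted identifier, a double quote is escaped by doubling it.
        return '"' + part.replace('"', '""') + '"'
    return ".".join(quote(p) for p in (catalog, schema, table))

# `employees` is a hypothetical table name.
print(qualified_name("finance_db", "payroll", "employees"))
# prints "finance_db"."payroll"."employees"
```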

**Does Starburst MCP help with data discovery?**
Absolutely. By listing available data products using `list_data_products`, it helps you find pre-approved, curated datasets without knowing their exact location or schema name.

**What is the best way to check query history with Starburst MCP?**
You use the `list_queries` tool. This lets your agent retrieve a clean record of recent SQL queries executed in the cluster for review.