# Snowflake MCP

> Snowflake connects your AI agent directly into a live Snowflake Data Cloud. Use it to map complex schemas, list all databases and tables, audit compute costs, or run read-only SQL queries without leaving your IDE. It lets you talk to the data warehouse like a native developer.

## Overview
- **Category:** industry-titans
- **Price:** Free
- **Tags:** sql-warehouse, data-cloud, query-execution, schema-mapping, data-warehousing, big-data

## Description

You're connecting your AI agent straight into a live Snowflake Data Cloud. You use this server to map out complex schemas, list every table and database name, audit compute costs, or run read-only SQL queries without ever leaving your IDE. It lets you talk to the data warehouse like an actual developer.

To figure out what's running in your account, you first call `list_databases` to get a roster of all available databases. From there, you can narrow it down by calling `list_schemas`, which outputs every schema housed inside that specific database. Once you've got the right scope, you use `list_tables` to pull up a list of every table sitting within that exact schema. This sequence lets your agent walk you through the entire data structure, making sure you can pinpoint any column.

To check on the infrastructure doing all this work, you run `list_warehouses`. It gives you a clean rundown of every virtual compute warehouse and what its current running status is; that helps you track compute usage and costs. If your raw data files are loaded using stages, you use `list_stages` to retrieve names for both internal and external data storage areas.

The main action is executing code. You tell the agent to run a query with `execute_sql`. This runs SQL statements against your connected account, favoring read-only actions whenever it can. When you fire off a complex or lengthy statement, you don't just forget about it; you use `get_query_status` to check the real-time status of that asynchronous job until it’s done.

This tool set means your agent doesn't guess—it validates every command against the live engine before running it. You can ask it to run a complex query based on human input or test intricate table joins safely right inside Cursor or Claude, knowing it'll only interact with read-only data until you tell it otherwise.

## Tools

### execute_sql
Runs a SQL query against Snowflake, favoring read-only statements when possible.

### get_query_status
Checks the current status of any long-running or asynchronous SQL query.

### list_databases
Returns a list of all databases available within your Snowflake account.

### list_schemas
Lists all schemas contained within a specific database you select.

### list_stages
Retrieves names of internal or external data stages used for loading files.

### list_tables
Lists all tables that exist within a specific schema you target.

### list_warehouses
Outputs a list of virtual compute warehouses and their current running status.

## Prompt Examples

**Prompt:** 
```
List all running virtual warehouses I can access in my Snowflake account.
```

**Response:** 
```
I've called `list_warehouses` for your targeted Snowflake footprint.

**Running Clusters Detected**:
1. `COMPUTE_WH` – Size: X-Small | State: SUSPENDED
2. `ANALYTICS_WH` – Size: Large | State: STARTED
3. `ETL_LOAD_WH` – Size: Medium | State: STARTED

Would you like me to run an analytical snippet under `ANALYTICS_WH` now?
```

**Prompt:** 
```
Write a query to grab the top 5 most engaged users from our schema and execute it.
```

**Response:** 
```
No problem! I constructed this query to target exactly that and executed it against the system:
`SELECT user_id, email, activity_score FROM prod_schema.users ORDER BY activity_score DESC LIMIT 5;`

**Snowflake Output Data**:
1. alice@example.com (Score: 980)
2. john84@test.io (Score: 964)
3. m_reyes@corp.com (Score: 941)
[...]

Do you want me to format this as a markdown table or save the dataset?
```

**Prompt:** 
```
Retrieve the schema mapping for the MASTER_DB. I need to know all nested tables before doing table joints.
```

**Response:** 
```
Using `list_schemas`, `MASTER_DB` contains main partitions:
1. `PUBLIC`
2. `ANALYTICS_GOLD`

Digging into `ANALYTICS_GOLD` using `list_tables`:
- `dim_customers` (Type: BASE TABLE)
- `dim_geography` (Type: BASE TABLE)
- `fct_sales_records` (Type: BASE TABLE)
- `vw_live_revenue` (Type: VIEW)

I am ready to help build your exact JOINS. For example, `fct_sales_records.customer_id` directly joins on `dim_customers.id`.
```

## Capabilities

### Execute SQL Queries
Runs read-only SQL statements against your connected Snowflake account.

### Discover Account Structure
Lists every database, schema, and table available in the entire Snowflake environment.

### Track Compute Usage
Retrieves a list of all active virtual warehouses to check current compute costs and status.

### Monitor Long Jobs
Checks the real-time status of any lengthy, asynchronous query you've started.

### Inspect Data Layers
Lists all internal and external stages where raw data loads or files reside.

## Use Cases

### Debugging Data Flow Discrepancies
A data engineer notices a table seems empty. Instead of checking the UI, they prompt their agent: 'What stages are available for this database?'. The agent calls `list_stages` and shows the exact path where files should be landing, solving the missing data problem instantly.

### Modeling a New Data Source
An analytics engineer needs to join three tables but isn't sure which ones exist. They ask their agent to map the structure: 'List all schemas in MASTER_DB and then list the tables within ANALYTICS_GOLD.' The agent calls `list_schemas` and then `list_tables`, giving them a complete, joined reference guide for modeling.

### Performance Troubleshooting
The team is running slow queries. A software architect asks the agent to check resource usage: 'What warehouses are active right now?' The agent runs `list_warehouses`, showing a cluster that's idle but still incurring costs, allowing the architect to shut it down.

### Validating ETL Pipeline Integrity
A developer just loaded a massive dataset. They need confirmation of where the raw files landed. They use `list_stages` to confirm the exact internal stage path, verifying that the data source was ingested correctly before proceeding with transformations.

## Benefits

- See the full data structure by using `list_databases` followed by `list_schemas`. You map out the entire data landscape before writing a single line of code.
- Audit compute costs instantly. Use `list_warehouses` to see which virtual clusters are running, helping you manage spending without manual dashboard checks.
- Test complex joins safely. You can prompt your agent with an SQL query and let it run against the live engine using `execute_sql`, validating results immediately.
- Keep track of big jobs. If a data pipeline runs for hours, don't worry. Use `get_query_status` to monitor its progress asynchronously.
- Validate raw file uploads. Need to know where external files landed? Run `list_stages` to see all internal and external staging areas.

## How It Works

The bottom line is, you get full data visibility and query execution right inside your coding environment, ditching the browser tabs.

1. Subscribe to the server, then provide your specific Snowflake Account identifier (e.g., `abc123.us-east-1`).
2. Inject your necessary Snowflake OAuth token or JWT authentication key pair.
3. Ask your AI client (Claude or Cursor) to look into a schema using natural language prompts.

## Frequently Asked Questions

**How do I find all databases using list_databases?**
It's straightforward: just ask your agent to run `list_databases`. It returns a clean, complete list of every database in your account. This is the starting point for any data mapping effort.

**If I forget a table name, can I use list_tables?**
Yes. If you know the schema but not the exact table, prompt the agent to run `list_tables` within that specific schema. It gives you all available tables so you don't have to guess.

**What if my query is too complex for execute_sql?**
If your query involves multiple steps or runs for a long time, run it first with `execute_sql` and then follow up by asking the agent to check its status using `get_query_status`. This keeps you informed.

**How do I see if my compute costs are getting too high?**
Run `list_warehouses`. This tool lists all virtual warehouses, letting you immediately see which clusters are running or suspended. You can use this to manage and control your spending.

**What credentials do I need for my AI client to use `execute_sql`?**
You must provide a valid Snowflake OAuth token or JWT pair. The system needs explicit access rights and your account identifier (e.g., abc123.us-east-1) to run any queries against the live data.

**When should I use `list_stages` instead of just querying a table?**
Use `list_stages` when you need to locate external or internal cloud storage pointers. This tool shows where raw, unmodeled source files actually sit before they enter the database structure.

**If my query takes hours, how do I track its progress using `get_query_status`?**
After running a large job with `execute_sql`, use `get_query_status` and provide the unique query ID. This retrieves real-time status updates so you don't have to wait for a timeout.

**How does `list_schemas` help me validate column names before I write a JOIN?**
`list_schemas` helps you map the entire structure, allowing your agent to traverse databases and identify specific schemas. This confirms that related tables exist in proximity, making joins safer.

**Can my AI actually read the raw table rows via an execute statement?**
Yes. When the AI uses `execute_sql` with something like `SELECT * FROM schema.users LIMIT 10`, the MCP integration parses the exact row outputs. The LLM consumes the tabular data back into context so you can converse naturally about the dataset findings.

**Is it completely safe to give AI power over a Data Warehouse?**
Safety stems from principle of least privilege. Supply a Snowflake Token tied strictly to a read-only role or a heavily scoped down service account. This allows the AI to navigate schemas and extract data without risking destructive schema mutations like DROPs or DELETEs.

**Can it search for a column name if I don't know the exact schema?**
Yes! Tell your agent: 'Find which table in the SALES_DB database has a column named customer_churn_score'. Due to its autonomous workflow, the bot will pull schemas, subsequently loop over `list_tables`, query Snowflake’s internal information_schema if necessary, and deduce it entirely for you.