# Google BigQuery MCP

> Google BigQuery connects your AI agent directly to massive data warehouses. You can run complex Standard SQL queries on petabytes of structured information without leaving your chat client. Use this MCP to inspect schemas, audit job runs, and analyze huge datasets conversationally.

## Overview
- **Category:** industry-titans
- **Price:** Free
- **Tags:** sql, data-warehouse, big-data, cloud-computing, data-pipelines, query-optimization

## Description

This MCP lets you treat your data warehouse like a giant spreadsheet that talks back. Instead of logging into the console just to run one query or check a column name, you talk to your agent, and it handles the heavy lifting against Google BigQuery. You can ask questions about customer behavior, operational metrics, or complex financial trends, and it writes, runs, and summarizes the exact Standard SQL needed.

It's like having a dedicated data analyst sitting next to you who knows every table structure and job status in your system. Need to know if last night's background pipeline finished correctly? You can list recent jobs and check the error traces instantly. This capability makes it invaluable for anyone needing quick validation against terabytes of rows.

When you connect this MCP via Vinkius, your agent gets full visibility across all your structured data—from discovering deep table column mappings to running complex aggregations over massive datasets purely through conversational prompts.

## Tools

### list_datasets
Lists every dataset available in your active GCP project.

### get_dataset
Retrieves detailed information about a single, specified BigQuery dataset.

### list_tables
Lists all the tables contained within one specific dataset.

### get_table
Gets the full metadata and schema details for any given BigQuery table.

### execute_query
Runs an explicit Standard SQL command that you specify to pull data.

### list_jobs
Lists recent execution jobs and run history within BigQuery for auditing purposes.

### get_job
Retrieves comprehensive details about a specific, completed or failed job run.

## Prompt Examples

**Prompt:** 
```
Get the table schema for `users_prod` in the `analytics` dataset.
```

**Response:** 
```
Schema fetched. `users_prod` contains 12 columns, notably `user_id` (STRING, required), `signup_timestamp` (TIMESTAMP, partitioned), and `plan_tier` (STRING). Would you like to check some sample rows?
```

**Prompt:** 
```
Find out the top 3 countries with the most signups this month in the `users` table.
```

**Response:** 
```
Executing SQL... The query completed successfully processing 12MB. The top 3 countries are: 1) United States (12,400), 2) Brazil (8,900), 3) Japan (4,150).
```

**Prompt:** 
```
Did the overnight cron job compute successfully or did it fail?
```

**Response:** 
```
Listing recent jobs. Job `cron_transform_01` (ID: 11a0ab3-xxx) failed globally at 03:00AM. Root cause trace indicates native `Unrecognized name: user_account_id` syntax failure halting workflow explicitly mapped.
```

## Capabilities

### Querying Structured Data
The agent executes explicit Standard SQL commands against your BigQuery dataset, allowing you to extract precise data subsets.

### Inspecting Database Structure
You can get detailed metadata on any specific dataset or table, including column types and partitioning logic.

### Listing Dataset Contents
The agent lists all active datasets within your GCP project so you know exactly where to start looking for data.

### Auditing Job Performance
You can list recent query jobs and retrieve detailed reports on job runs, including processing bytes and failure reasons.

## Use Cases

### Checking Pipeline Health
A Data Engineer notices a scheduled report is missing data. Instead of manually checking logs, they ask their agent to use `list_jobs` and check the most recent job's status via `get_job`. The agent immediately flags that the job failed due to an unrecognized column name.

### Ad-Hoc Market Analysis
A Marketing Analyst needs a quick report on customer acquisition channels. They ask their agent, which uses `execute_query`, to write and run complex SQL joining multiple large tables to calculate the top revenue sources for the month.

### Schema Discovery
A Backend Developer inherits a new database schema. Rather than spending hours clicking through documentation, they ask their agent to use `get_table` on the main user table, instantly providing the column mappings and data types needed for integration.

### Project Mapping
A consultant is onboarding to a new client's data environment. They use `list_datasets` to get an overview of all available logical groupings in the project, quickly mapping out where different domains (finance, marketing, operations) store their records.

## Benefits

- Stop jumping between tabs. Instead of leaving your chat client to validate data constraints or summarize daily logs, you can use the agent's job listing tool to audit workloads right where you are working.
- No more guessing column names. Use `get_table` to pull precise schema details for any table, confirming types and clustering logic before writing a single line of SQL.
- Turn conversations into data. You can ask high-level questions (like 'What were the top 3 signups?') and let the agent translate that into optimized Standard SQL using `execute_query`.
- Audit pipelines easily. If you suspect an overnight cron job failed, use `list_jobs` and then `get_job` to read the root cause trace directly—no need to open the GCP console.
- Understand your data structure instantly. The agent can traverse nested datasets using `list_datasets`, mapping out the entire logical topology of your project.

## How It Works

The bottom line is that you interact with your massive database using natural language prompts instead of complex console commands.

1. Subscribe to this MCP and provide your GCP Project ID along with an active OAuth or Service Account Token.
2. Your AI client authenticates the connection and establishes access across your specified data warehouse environment.
3. You prompt your agent conversationally (e.g., 'Show me the top 3 countries...'), and it handles running the necessary Standard SQL query, presenting only the result.

## Frequently Asked Questions

**Can I query my data from different datasets using Google BigQuery MCP?**
Yes, the agent allows you to reference multiple datasets in a single prompt. As long as you have appropriate permissions and know the names, it can write cross-dataset queries for you.

**What if my SQL query is too complex? Will Google BigQuery MCP handle it?**
The agent handles writing and running Standard SQL. You just need to describe the desired outcome in plain language, and it writes the optimized code for execution.

**How do I check if a specific column exists using Google BigQuery MCP?**
First, use `get_table` on the relevant dataset. This tool provides the full schema map, letting you confirm every column name and its data type.

**Does this MCP help me troubleshoot failed data pipelines?**
Absolutely. Use `list_jobs` to see recent activity, then use `get_job` on the problematic ID to read the error trace directly, pinpointing the syntax or data issue.

**Is this MCP only for reading data?**
No. While it focuses heavily on querying and auditing, its structured nature allows you to confirm data integrity before building out write processes in your application's code.