# Azure Synapse Analytics MCP for AI Agents

> Azure Synapse Analytics MCP gives your AI agent full visibility into complex enterprise data workflows. You can monitor compute pools, trace pipelines, and audit every dataset or linked service within Azure Synapse using simple natural conversation.

## Overview
- **Category:** industry-titans
- **Price:** Free
- **Tags:** data-warehousing, big-data, spark-pools, sql-pools, pipeline-orchestration, data-integration

## Description

You're dealing with massive analytics infrastructure in Azure Synapse: compute pools, dozens of pipelines, and critical connections to external systems. Manually auditing this stuff is a nightmare; you spend hours clicking through dashboards just to find out why an ETL job failed or which datasets are linked elsewhere. This MCP gives your AI client direct access to the Synapse workspace, so you can inspect your entire data integration setup using nothing but plain conversation.

Instead of jumping between the Azure portal and running manual queries, you talk to your agent, and it reports on the pipelines, notebooks, pools, datasets, and linked services in your workspace. Need to check if a Spark pool is provisioned correctly? Ask. Want to map out all the steps in a data movement workflow? It does that. Data Engineers can debug failed pipelines and Cloud Ops teams can inspect compute scaling thresholds without leaving their usual IDE. By connecting this MCP via Vinkius, you bring enterprise-grade Synapse governance straight into your daily coding flow.

## Tools

### list_pipelines
Lists every data integration pipeline defined in Azure Synapse Analytics.

### get_pipeline
Retrieves the full, detailed definition for a specific Azure Synapse pipeline.

### list_notebooks
Shows all available Apache Spark analytic notebooks in the workspace.

### list_spark_pools
Lists pre-provisioned clusters used for Apache Spark analytics computation.

### list_sql_pools
Shows both dedicated and serverless SQL analytical pools in Synapse.

### list_datasets
Lists all explicit dataset targets configured within Azure Synapse.

### list_linked_services
Retrieves a list of every external service dependency linked to the workspace (e.g., Key Vaults).
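
AI clients normally issue these calls for you, but it can help to see what a tool invocation looks like on the wire. The sketch below builds the JSON-RPC 2.0 `tools/call` request that the Model Context Protocol defines for invoking a tool. The tool names come from the list above; the `pipeline_name` argument is a hypothetical name chosen for illustration, so check the tool's published input schema before relying on it.

```python
import json
from itertools import count
from typing import Optional

# JSON-RPC expects each request to carry its own id.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: Optional[dict] = None) -> dict:
    """Build an MCP `tools/call` request body for the given tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }


# Listing tools are assumed to take no arguments in this sketch.
list_request = build_tool_call("list_pipelines")

# `pipeline_name` is a hypothetical argument name, not confirmed by this listing.
get_request = build_tool_call("get_pipeline", {"pipeline_name": "BillingDataSync"})

print(json.dumps(get_request, indent=2))
```

The client sends this body over whatever transport the MCP connection uses and receives the tool result in the matching JSON-RPC response.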

## Prompt Examples

**Prompt:** 
```
Show me the full connection path for 'BillingDataSync'.
```

**Response:** 
```
**🔎 Synapse Data Flow Audit**

I traced the flow for your BillingDataSync job. It involves 3 major steps:

1.  **Source:** Reads from the `Financial_Staging` dataset.
2.  **Process:** Runs through a dedicated SQL Pool cluster (ID: `SQL-PROD-04`).
3.  **Target:** Writes to an external Blob Storage location, linked via the `AzureKeyVaultProd` service. 

The last step requires the token in 'StagingKeyVault' to be updated before it can run.
```

**Prompt:** 
```
What datasets are associated with the Finance department?
```

**Response:** 
```
I found 5 primary datasets associated with finance operations:

*   `Dataset_GL`: Contains general ledger data.
*   `Dataset_AR`: Accounts Receivable records. (Last updated: yesterday)
*   `Dataset_AP`: Accounts Payable manifests.
*   `Dataset_Tax`: Quarterly tax filings.
*   `Dataset_Audit`: Internal audit trail logs.

You can view the full schema details for any of these by asking me to inspect them.
```

**Prompt:** 
```
List all available Spark notebooks and their owners.
```

**Response:** 
```
You have 15 Apache Spark Notebooks saved. Here’s a quick breakdown:

| Notebook Name | Primary Owner | Last Run Date |
| :--- | :--- | :--- |
| MarketingML_v3 | John D. | 2026-06-10 |
| ETL_Core_Sync | Ops Team | 2026-06-12 |
| ReportGen_Q2 | Data Science | 2026-05-28 |

The notebook `ETL_Core_Sync` is flagged as the primary orchestrator for your core data pipelines.
```

## Capabilities

### List all data integration pipelines
View a complete list of every Azure Synapse Analytics data movement pipeline.

### Inspect specific data pipelines
Get the precise definition and parameters for any individual Azure Synapse pipeline you identify.

### List Spark analytics notebooks
Retrieve a list of all Apache Spark analytic notebooks stored within your workspace.

### Check compute pools status
See which dedicated or serverless SQL Analytics pools and active Apache Spark clusters are currently provisioned.

### List explicit dataset targets
Audit all defined storage mappings that shape static or dynamic data structures within Synapse.

### Map external dependencies
Identify and review every linked service, showing which endpoints reference Key Vaults or Blob Storage accounts.

## Use Cases

### Debugging a broken ETL job
A Data Engineer discovers an ETL routine failed overnight. Instead of opening the portal, they ask their agent to list all data integration pipelines and then use `get_pipeline` on the failing one. The agent immediately points out which specific step has mismatched target parameters.
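
As a rough illustration of what inspecting a definition can involve, the sketch below orders a pipeline's activities by their dependencies. It assumes the definition returned by `get_pipeline` follows the usual Synapse pipeline JSON shape, with `properties.activities` entries that carry a `name` and a `dependsOn` list; the sample pipeline and its activity names are hypothetical.

```python
from graphlib import TopologicalSorter


def activity_order(pipeline: dict) -> list[str]:
    """Return activity names so that each one appears after its dependencies."""
    activities = pipeline.get("properties", {}).get("activities", [])
    # Map each activity to the set of activities it depends on.
    graph = {
        act["name"]: {dep["activity"] for dep in act.get("dependsOn", [])}
        for act in activities
    }
    return list(TopologicalSorter(graph).static_order())


# Hypothetical definition, trimmed to the fields this sketch reads.
sample_pipeline = {
    "name": "BillingDataSync",
    "properties": {
        "activities": [
            {
                "name": "WriteToBlob",
                "dependsOn": [{"activity": "TransformInSqlPool"}],
            },
            {"name": "CopyFromStaging", "dependsOn": []},
            {
                "name": "TransformInSqlPool",
                "dependsOn": [{"activity": "CopyFromStaging"}],
            },
        ]
    },
}

print(activity_order(sample_pipeline))
# → ['CopyFromStaging', 'TransformInSqlPool', 'WriteToBlob']
```

Walking the steps in execution order makes it easier to see which activity sits upstream of a failure. `TopologicalSorter` raises `graphlib.CycleError` if the dependencies form a loop.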

### Evaluating new ML model inputs
A Data Scientist needs to know if a new feature set is available for testing. They ask their agent to list all datasets, and the agent provides the full list of explicitly defined storage mappings, allowing the scientist to confirm right away whether the inputs they need are already defined.

### Scaling infrastructure after peak load
Cloud Ops needs to report on current resource usage. They ask their agent to check compute pools, and the MCP runs `list_sql_pools` and `list_spark_pools`, giving them a real-time count of both dedicated and serverless capacity.

### Compliance audit of data connections
A compliance officer needs to verify all external system links. They ask their agent to list linked services, which executes `list_linked_services` and confirms that sensitive endpoints like Key Vaults are correctly referenced across the whole architecture.
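
For an audit like this, grouping the results by service type gives a quick picture of what the workspace depends on. The sketch below assumes each entry returned by `list_linked_services` carries a `name` and a `properties.type` field, as Synapse linked service definitions generally do; the sample entries are hypothetical.

```python
from collections import defaultdict


def group_by_type(linked_services: list[dict]) -> dict[str, list[str]]:
    """Group linked service names by their declared type."""
    groups = defaultdict(list)
    for service in linked_services:
        service_type = service.get("properties", {}).get("type", "Unknown")
        groups[service_type].append(service["name"])
    # Sort types and names so the report is stable between runs.
    return {t: sorted(names) for t, names in sorted(groups.items())}


# Hypothetical linked services, trimmed to the fields this sketch reads.
sample_services = [
    {"name": "StagingKeyVault", "properties": {"type": "AzureKeyVault"}},
    {"name": "BillingBlobStore", "properties": {"type": "AzureBlobStorage"}},
    {"name": "AzureKeyVaultProd", "properties": {"type": "AzureKeyVault"}},
]

report = group_by_type(sample_services)
for service_type, names in report.items():
    print(f"{service_type}: {', '.join(names)}")
```

Entries without a declared type land under `Unknown`, which is itself worth flagging in a compliance review.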

## Benefits

- Trace failed data movements: Use the `get_pipeline` tool to instantly dissect a specific pipeline's definition, showing you exactly which steps broke down.
- Know your compute limits: Listing both dedicated and serverless SQL pools via `list_sql_pools` gives Cloud Ops immediate visibility into resource capacity for billing checks.
- Quickly assess data scope: The `list_datasets` tool lets Data Scientists survey every defined storage mapping, helping them evaluate variables before writing a single line of code.
- Manage compute resources: Running `list_spark_pools` tells you exactly what Spark clusters are provisioned, letting you decide if scaling up or down is necessary for the next big run.
- Audit external connections: By calling `list_linked_services`, you immediately see all critical endpoints, like Key Vaults and Blob Storage accounts, that your system relies on.

## How It Works

In short, you treat your entire Synapse environment (its pools, pipelines, and connections) as a searchable knowledge base right inside your AI client.

1. Subscribe to this MCP on Vinkius and provide your Azure Synapse Workspace URL along with an active Access Token.
2. Connect your preferred AI client (Claude, Cursor, etc.) to the MCP. The agent authenticates against the workspace using your credentials.
3. Start asking complex questions in natural language, like 'Show me all datasets linked to the HR schema.' Your agent then executes the necessary API calls and presents the structured data.
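
The listing does not document which API calls the MCP makes in step 3. As a minimal sketch, assuming it talks to the Synapse data-plane REST API with a bearer token, a listing request might be prepared as below. The resource paths, the `2020-12-01` API version, and the workspace name are assumptions for illustration; the request is only constructed here, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_VERSION = "2020-12-01"  # assumed data-plane API version


def build_list_request(workspace_url: str, access_token: str, resource: str) -> Request:
    """Prepare (but do not send) a GET request that lists a workspace resource."""
    query = urlencode({"api-version": API_VERSION})
    url = f"{workspace_url.rstrip('/')}/{resource}?{query}"
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})


# Hypothetical workspace URL and placeholder token.
request = build_list_request(
    "https://myworkspace.dev.azuresynapse.net/",
    "<access-token>",
    "pipelines",
)

print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the call; swapping `"pipelines"` for another resource path such as `"datasets"` or `"linkedservices"` follows the same pattern under the same assumptions.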

## Frequently Asked Questions

**How does Azure Synapse Analytics MCP help me trace data movement?**
This MCP lets you audit complex data flows by listing and inspecting every single pipeline. You can get the full definition of a workflow, telling you exactly what happens from source to target, which is critical for debugging.

**Can this MCP check my compute resource availability?**
Yes, it gives you visibility into both dedicated and serverless SQL pools, plus your Apache Spark clusters. You can quickly see if the resources you need are provisioned and available before starting a job.

**I'm not sure where my data comes from; what should I check?**
Start by asking to list all linked services. This tool shows every external dependency, like Key Vaults or Blob Storage accounts, that your Synapse environment relies on, giving you a map of its connections.

**Is this MCP useful for data governance and compliance?**
Absolutely. By allowing you to list all datasets and audit linked services, it provides the necessary visibility to prove where sensitive data lives and what external systems reference it for compliance audits.

**Can I use this MCP in my IDE while coding?**
Yes, connecting this via your AI client means you don't have to switch tabs or open the cloud console. You can audit and debug Synapse components right from your familiar coding environment.