# Dagster MCP for AI Agents

> Dagster MCP connects your data orchestration platform to any compatible AI client, letting you inspect complex pipelines and track data assets using natural conversation. Instead of clicking through dashboards or writing code just to check status, you ask your agent about job runs, asset dependencies, or scheduled triggers. It's visibility into your entire data stack, right from the chat window.

## Overview
- **Category:** loved-by-devs
- **Price:** Free
- **Tags:** data-orchestration, data-pipelines, workflow-automation, observability, data-engineering, asset-management

## Description

Your AI client can now talk directly to your Dagster instance, giving you granular visibility into your data workflows without opening a dashboard. Everything from listing all available jobs to checking whether a critical asset is fresh happens through natural language.

For example, you can ask the agent to list every configured job schedule or trace which assets depend on raw customer data. This gives you deep visibility into your entire data platform. The tools are served through Vinkius, giving you a single point of access whether you use Cursor, Claude, or any other compatible AI client.

It’s less about learning a new UI and more about talking to your infrastructure the way you already talk to a teammate. You get immediate status updates on job runs, detailed logs for failures, and a clear view of the assets, schedules, and sensors behind the data that moves through your system.

## Tools

### list_jobs
Lists every data pipeline job defined in your Dagster deployment.

### list_runs
Retrieves a history of recent job executions to give you an overview of system activity.

### get_run
Fetches detailed status and logs for a single run, identified by its run ID.

### list_assets
Lists all software-defined assets so you can map data dependencies within your project.

### list_schedules
Retrieves a list of every scheduled job, showing when they are set to run next.

### list_sensors
Lists active sensors that wait for external events before triggering a pipeline execution.
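
Under the Model Context Protocol, a client invokes any of these tools by sending a JSON-RPC 2.0 `tools/call` request that names the tool and passes its arguments. The sketch below builds such a request for `get_run`. The helper `build_tool_call` is hypothetical, and the argument name `run_id` is an assumption, since this page does not document the tools' input schemas.

```python
import json
from itertools import count
from typing import Optional

# JSON-RPC expects each request to carry its own id.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: Optional[dict] = None) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }


# `run_id` is an assumed argument name; check the schema the server publishes.
request = build_tool_call("get_run", {"run_id": "987"})
print(json.dumps(request, indent=2))
```

In practice your AI client builds and sends these messages for you; the sketch only shows what a tool invocation looks like on the wire.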

## Prompt Examples

**Prompt:** 
```
What were the status and error logs for the 'finance' job yesterday?
```

**Response:** 
```
**Pipeline Run Audit: Finance ETL**

*   **Job:** `daily_financial_report`
*   **Date Checked:** 2026-06-11
*   **Last Run Status:** Failure (ID: 987)
*   **Error Log Snippet:** Connection timeout detected while accessing the 'raw_ledger' source. Check credentials or network access.

> *Action required: Credentials need updating for the ledger connection.*
```

**Prompt:** 
```
Show me all critical data assets and if they are fresh right now.
```

**Response:** 
```
**Asset Lineage Report**

The following 3 key assets were found:

| Asset Name | Last Materialization | Dependencies | Status |
| :--- | :--- | :--- | :--- |
| `cleaned_user_profiles` | 2 hours ago | `raw_auth_logs` | ✅ Fresh |
| `monthly_sales_summary` | 3 days ago | N/A | ⚠️ Stale (Needs Run) |
| `product_inventory_view` | Never found | N/A | ❌ Missing |

Please run the pipeline for `monthly_sales_summary`.
```

**Prompt:** 
```
Which jobs are scheduled to run next week?
```

**Response:** 
```
**Scheduled Jobs Report**

Here are all active schedules:

*   **`daily_etl_job`:** Runs every day at 1:00 AM UTC.
*   **`weekly_ml_retrain`:** Runs every Monday at 6:00 AM UTC. (Checks for external triggers).
*   **`hourly_sync`:** Runs every hour on the hour, unless a sensor detects an event.
```

## Capabilities

### Review all available jobs
List the name of every data pipeline job configured in your Dagster instance.

### Check historical job run status
Fetch a chronological list of recent job runs, allowing you to select specific runs for detailed status or execution logs.

### Map data asset dependencies
Enumerate all software-defined assets in your project to understand what data relies on which source.

### Audit automated triggers
List every configured job schedule and active sensor, verifying exactly how and when pipelines are supposed to run automatically.

## Use Cases

### Investigating a failed ETL job
A data engineer notices the morning sales metrics are missing. They ask their agent, which uses `list_jobs` to find the 'daily-sales' pipeline, `list_runs` to locate its most recent execution, and `get_run` to check that run's logs, confirming the failure was due to an upstream dependency.
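
Because `get_run` works on a specific run ID, an investigation like this is in effect a chain of lookups: find the job, find its newest run, then fetch that run. The sketch below shows one way an agent might sequence the calls. `call_tool` stands in for whatever your MCP client exposes, and the response fields (`name`, `run_id`, `job_name`, `status`, `start_time`) are assumed shapes rather than the documented output.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict], Any]


def investigate_latest_run(call_tool: ToolCaller, job_keyword: str) -> dict:
    """Chain list_jobs -> list_runs -> get_run for the newest matching run."""
    jobs = call_tool("list_jobs", {})
    matches = [job["name"] for job in jobs if job_keyword in job["name"]]
    if not matches:
        raise LookupError(f"no job matching {job_keyword!r}")
    job_name = matches[0]

    runs = [r for r in call_tool("list_runs", {}) if r["job_name"] == job_name]
    if not runs:
        raise LookupError(f"no runs recorded for {job_name!r}")
    latest = max(runs, key=lambda r: r["start_time"])

    return call_tool("get_run", {"run_id": latest["run_id"]})


def fake_call_tool(name: str, arguments: dict) -> Any:
    """Canned responses standing in for a live server (illustrative values)."""
    runs = [
        {"run_id": "986", "job_name": "daily-sales", "status": "SUCCESS", "start_time": 1},
        {"run_id": "987", "job_name": "daily-sales", "status": "FAILURE", "start_time": 2},
    ]
    if name == "list_jobs":
        return [{"name": "daily-sales"}, {"name": "daily-backup"}]
    if name == "list_runs":
        return runs
    if name == "get_run":
        return next(r for r in runs if r["run_id"] == arguments["run_id"])
    raise ValueError(f"unknown tool {name!r}")


result = investigate_latest_run(fake_call_tool, "sales")
print(result["run_id"], result["status"])
```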

### Verifying data freshness for a report
An analytics engineer needs to know if their board dashboard is using current data. They ask about assets, and the agent uses `list_assets` to enumerate all required tables, allowing them to verify that the 'cleaned_customer' asset was recently materialized.
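
A freshness check like this comes down to comparing each asset's last materialization time against a tolerance. The sketch below is a client-side illustration, not how Dagster or this MCP evaluates freshness. The 24-hour threshold, the fixed `now` timestamp, and the input shape are assumptions; the asset names are borrowed from the prompt example earlier on this page.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def freshness_status(
    last_materialized: Optional[datetime],
    now: datetime,
    max_age: timedelta = timedelta(hours=24),  # assumed tolerance
) -> str:
    """Classify an asset as 'missing', 'stale', or 'fresh' by materialization age."""
    if last_materialized is None:
        return "missing"
    return "fresh" if now - last_materialized <= max_age else "stale"


# Illustrative timestamps; a real check would use values returned by the agent.
now = datetime(2026, 6, 12, 9, 0, tzinfo=timezone.utc)
assets = {
    "cleaned_user_profiles": now - timedelta(hours=2),
    "monthly_sales_summary": now - timedelta(days=3),
    "product_inventory_view": None,
}
report = {name: freshness_status(ts, now) for name, ts in assets.items()}
print(report)
```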

### Auditing scheduled maintenance
A platform manager needs to ensure a backup pipeline runs exactly when expected. They ask the agent to list schedules, and it uses `list_schedules` to confirm that the 'daily-backup' job is correctly configured for 2:00 AM UTC.
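
An audit like this can be reduced to comparing the listed schedule against what you expect. In the sketch below, 2:00 AM daily is written as the cron expression `0 2 * * *`. The record fields (`job_name`, `cron_schedule`, `execution_timezone`, `status`) are an assumed shape for the `list_schedules` output, and `audit_schedule` is a hypothetical helper.

```python
from typing import Dict, List


def audit_schedule(
    schedules: List[Dict[str, str]],
    job_name: str,
    expected_cron: str,
    expected_timezone: str = "UTC",
) -> List[str]:
    """Return the mismatches between a job's schedules and the expected setup."""
    matching = [s for s in schedules if s["job_name"] == job_name]
    if not matching:
        return [f"no schedule found for {job_name}"]

    problems = []
    for schedule in matching:
        if schedule["cron_schedule"] != expected_cron:
            problems.append(
                f"cron is {schedule['cron_schedule']!r}, expected {expected_cron!r}"
            )
        if schedule["execution_timezone"] != expected_timezone:
            problems.append(
                f"timezone is {schedule['execution_timezone']!r}, "
                f"expected {expected_timezone!r}"
            )
        if schedule["status"] != "RUNNING":
            problems.append(f"schedule status is {schedule['status']!r}")
    return problems


# Illustrative record; real values would come from the agent's response.
schedules = [
    {
        "job_name": "daily-backup",
        "cron_schedule": "0 2 * * *",
        "execution_timezone": "UTC",
        "status": "RUNNING",
    }
]
problems = audit_schedule(schedules, "daily-backup", "0 2 * * *")
print(problems or "schedule matches expectations")
```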

### Debugging unexpected triggers
An SRE finds a pipeline ran when no one should have triggered it. They ask the agent to list sensors, and it uses `list_sensors` to identify that an external event listener is running too broadly, allowing them to narrow down the scope.

## Benefits

- Instant troubleshooting: Instead of opening the UI to check status, you can use `list_runs` or `get_run` with your agent to pull detailed logs for failed jobs immediately.
- Dependency mapping: The `list_assets` tool allows you to see which data tables rely on others, making it easy to verify data lineage and pinpoint the source of stale metrics.
- Complete automation audit: Use the MCP to list both job schedules (`list_schedules`) and active sensors (`list_sensors`), ensuring every automated trigger is configured correctly across all environments.
- Holistic visibility: You can view all available pipelines by calling `list_jobs` in one chat command, giving you a single overview of every job on your data platform.
- Deep operational insight: By querying the system via natural language, you gain immediate access to run status and asset details without needing specialized CLI commands or dashboard filters.

## How It Works

You work with complex data operations through chat commands instead of navigating multiple dashboards.

1. Subscribe to this MCP on Vinkius and provide your Dagster URL along with a valid User API Token.
2. Your AI client authenticates the connection, granting it read access across your data platform's metadata.
3. You issue a natural language command, such as 'Show me all failed runs for the marketing job', and the agent retrieves and displays the matching results.
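
To make step 3 concrete: a request for failed runs of one job amounts to fetching recent runs and filtering them by job name and status. The sketch below assumes `list_runs` returns records with `run_id`, `job_name`, and `status` fields, which this page does not confirm; `FAILURE` is the status name Dagster uses for failed runs.

```python
from typing import Dict, List


def failed_runs_for_job(
    runs: List[Dict[str, str]], job_name: str
) -> List[Dict[str, str]]:
    """Keep only the runs of `job_name` whose status is FAILURE."""
    return [
        run
        for run in runs
        if run["job_name"] == job_name and run["status"] == "FAILURE"
    ]


# Illustrative records standing in for a `list_runs` response.
runs = [
    {"run_id": "a1", "job_name": "marketing", "status": "SUCCESS"},
    {"run_id": "a2", "job_name": "marketing", "status": "FAILURE"},
    {"run_id": "b1", "job_name": "finance", "status": "FAILURE"},
]
marketing_failures = failed_runs_for_job(runs, "marketing")
for run in marketing_failures:
    print(run["run_id"], run["status"])
```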

## Frequently Asked Questions

**How do I check if my data pipelines are running correctly using Dagster MCP for AI Agents?**
You simply ask your agent about pipeline status. It uses the `list_runs` and `get_run` tools to pull historical execution logs, letting you instantly see if a job succeeded or where it failed.

**Can Dagster MCP help me track data dependencies across different tables?**
Yes. You can ask the agent to list all software-defined assets using `list_assets`. This shows you exactly which pieces of data rely on others, helping you map out your full data lineage.

**What if I need to know when my automated jobs are supposed to run?**
You can audit all automatic triggers. When you ask the agent about schedules and sensors, it lists every configured job schedule and every sensor listening for external events, giving you full visibility into automation.

**Does Dagster MCP only work if I have a complex setup?**
No. The tool is designed to talk to your existing Dagster instance (whether Dagster+ or self-hosted). You just need the URL and API token, and you can start querying your jobs right away.

**Is this MCP better than using a regular dashboard UI?**
For quick checks and troubleshooting, yes. It's faster because you don't have to navigate menus; you just ask the question in plain English and get the specific data result immediately.