# RudderStack MCP

> RudderStack MCP connects your AI agent directly to RudderStack, letting you audit complex marketing data pipelines and customer event tracking. You can list all configured sources, map connections between those sources and destinations, and verify user segment definitions (audiences). It turns your chat interface into a live data engineering console.

## Overview
- **Category:** growth-engine
- **Price:** Free
- **Tags:** customer-data-platform, data-pipeline, event-tracking, segmentation, data-auditing, marketing-analytics

## Description

Treat your data pipeline like it's live on the line: you don't want to find out about a broken link when the metrics are already wrong. This MCP Server connects your AI agent straight to RudderStack, letting you audit every piece of customer event tracking and marketing data right from your chat interface.

It lets your agent act like a dedicated Data Engineer sitting next to you, giving you full visibility into how user data moves through the system—from where it enters until it hits its final storage spot. You'll map out everything, check every connection point, and verify all those tricky segment definitions without ever leaving your chat window.

To start auditing sources, your agent can run `list_sources` to pull a complete roster of every data source connected to the platform, whether it’s active or just gathering dust. If you know exactly which source you're looking at, calling `get_source` lets you drill down deep, pulling specific configuration details and performance metrics for that single point of entry.

When you need to see all the places your data is headed, running `list_destinations` gives you a full catalog of every configured endpoint. You can then narrow that down by using `get_destination`, which pulls granular information about one specific destination—like its connection protocol or schema requirements.

To verify the whole flow, check all the links between your data sources and destinations; running `list_connections` shows you a list of every active link. This is critical because it confirms that Source A actually talks to Destination B, which saves you from chasing ghosts in broken pipelines.
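
The "Source A talks to Destination B" check can be sketched in a few lines of Python. This is a minimal illustration, not the server's documented output: the `connections` sample data, its field names (`source_id`, `destination_id`, `enabled`), and the helper `find_connection` are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical shape of a `list_connections` result; real field names may differ.
connections = [
    {"id": "conn_1", "source_id": "src_web", "destination_id": "dst_snowflake", "enabled": True},
    {"id": "conn_2", "source_id": "src_ios", "destination_id": "dst_snowflake", "enabled": False},
]


def find_connection(connections: list, source_id: str, destination_id: str) -> Optional[dict]:
    """Return the first connection linking the given source and destination, if any."""
    for conn in connections:
        if conn["source_id"] == source_id and conn["destination_id"] == destination_id:
            return conn
    return None


link = find_connection(connections, "src_web", "dst_snowflake")
print(link["id"] if link else "no connection found")
```

A `None` result is the "ghost" case: both ends exist, but nothing links them.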

For governance, your agent can review how your events are structured by executing `list_tracking_plans`, which returns a catalog detailing all the schema rules enforced on your event types. If you need to know what specific user groups exist within the Customer Data Platform (CDP), running `list_audiences` pulls every defined segment or audience list available for querying.

Together, these tools let you audit and validate the whole pipeline: see all sources with `list_sources` and inspect one with `get_source`; map endpoints with `list_destinations` and inspect one with `get_destination`; verify pipeline integrity with `list_connections`; review the schema rules your events must follow with `list_tracking_plans`; and pull every user segment with `list_audiences`.

## Tools

### get_destination
Retrieves specific details for one data endpoint (a destination).

### get_source
Retrieves specific configuration and metrics for one data source.

### list_audiences
Returns a list of every defined user audience segment in the CDP.

### list_connections
Lists all active connections between your data sources and destinations.

### list_destinations
Returns a complete list of all configured data endpoints within RudderStack.

### list_sources
Lists every active and inactive data source connected to the platform.

### list_tracking_plans
Returns a catalog of all tracking plans, detailing schema rules for your events.

## Prompt Examples

**Prompt:** 
```
List all configured sources.
```

**Response:** 
```
I pulled your configured sources by running `list_sources`. Your workspace has exactly 3 sources.
```

**Prompt:** 
```
Check if the connection between our website source and Snowflake destination is active.
```

**Response:** 
```
Verified: The connection (ID: conn_12345) from 'Production Website' (source) to 'Snowflake Data Warehouse' (destination) is currently active and healthy with 0 dropped events in the last hour.
```

**Prompt:** 
```
Show me the tracking plans currently applied to our iOS app source.
```

**Response:** 
```
The 'iOS App' source is currently linked to the 'Mobile E-commerce v2' tracking plan. This plan enforces strict schemas for 15 standard commerce events.
```

## Capabilities

### Map Data Sources
You can list all configured data sources (`list_sources`) or get specific details for one source using `get_source`.

### Audit Destinations
You retrieve a list of all data endpoints (`list_destinations`) or get full metrics on a single destination using `get_destination`.

### Verify Data Flow
You check the integrity of the entire pipeline by listing all connections between sources and destinations (`list_connections`).

### Manage Tracking Schemas
You list and review defined tracking plans (`list_tracking_plans`) to ensure your data adheres to required schemas.

### Review User Audiences
You pull a comprehensive list of all segmented user groups or audiences defined in the CDP (`list_audiences`).

## Use Cases

### Debugging a Broken Funnel
A user notices that 'purchase' events aren't making it to Snowflake. They ask their agent: 'What connection links the website source to Snowflake?' The agent runs `list_connections`, finds the broken link ID (conn_xyz), and reports that the flow mapping is incomplete, saving hours of manual debugging.

### Pre-Launch Schema Audit
Marketing Ops needs to launch a new feature. They use `list_tracking_plans` to verify if the required 'feature_usage' event type schema exists and is enforced across all relevant sources, confirming compliance before writing a single line of code.
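
As a rough sketch of that audit, the check below looks for an event name across tracking plans. The `tracking_plans` sample data, its `name` and `events` fields, and the helper `plans_covering` are assumptions for illustration; the actual `list_tracking_plans` output may be structured differently.

```python
# Hypothetical shape of a `list_tracking_plans` result; real field names may differ.
tracking_plans = [
    {"name": "Mobile E-commerce v2", "events": ["purchase", "add_to_cart"]},
    {"name": "Web Product Analytics", "events": ["page_view", "feature_usage"]},
]


def plans_covering(event_name: str, tracking_plans: list) -> list:
    """Return the names of all tracking plans that define the given event."""
    return [plan["name"] for plan in tracking_plans if event_name in plan["events"]]


covering = plans_covering("feature_usage", tracking_plans)
if covering:
    print("Defined in:", ", ".join(covering))
else:
    print("No tracking plan defines this event yet")
```

An empty list means the event schema still has to be added before launch.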

### Identifying Data Leakage
An admin suspects data might be going to an old, abandoned marketing tool. They run `list_destinations` and compare the output against the current active destination list, spotting an unused endpoint that needs to be decommissioned.
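
One way to spot a forgotten endpoint is to cross-reference destinations against connections, sketched below. The sample data, the field names, and the helper `unconnected_destinations` are hypothetical; treat this as an illustration of the comparison, not the tools' documented output.

```python
# Hypothetical `list_destinations` and `list_connections` results; real fields may differ.
destinations = [
    {"id": "dst_snowflake", "name": "Snowflake Data Warehouse"},
    {"id": "dst_old", "name": "Legacy Email Tool"},
]
connections = [
    {"id": "conn_1", "source_id": "src_web", "destination_id": "dst_snowflake"},
]


def unconnected_destinations(destinations: list, connections: list) -> list:
    """Return destinations that no connection points to."""
    connected_ids = {conn["destination_id"] for conn in connections}
    return [dest for dest in destinations if dest["id"] not in connected_ids]


for dest in unconnected_destinations(destinations, connections):
    print("Candidate for decommissioning:", dest["name"])
```

The reverse check matters for leakage too: a destination that is still connected but should have been retired will only show up when you review the connected set by hand.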

### Auditing Audience Syncs
Before running a major retargeting campaign, the agent runs `list_audiences` to confirm that all necessary segments (e.g., 'iOS App Users') are successfully synced and available in the data warehouse for querying.

## Benefits

- Verify Data Integrity: Use `list_connections` to see if your sources actually link up. This immediately flags broken pipelines before they lose customer event data.
- Schema Control: Run `list_tracking_plans` to check the required schema for a specific event type. You won't send bad data because you checked the plan first.
- Full Visibility: Listing all components (`list_sources`, `list_destinations`) gives your agent a map of every single node in your entire data ecosystem, even the forgotten ones.
- Segment Validation: Use `list_audiences` to confirm that the user segments you built yesterday are still synced and available for today's ad campaign targeting.
- Deep Source Inspection: If one source looks suspicious, `get_source` lets you drill down into its specific metrics without having to jump through three different dashboards.

## How It Works

The bottom line is: it lets you talk to your data infrastructure using plain language commands.

1. Connect your AI client to the RudderStack MCP Server. You'll need an API access token from your RudderStack account settings.
2. You give your agent a command like 'List all sources,' which triggers the `list_sources` tool call against your live CDP environment.
3. The server returns structured data—a list of active sources, their IDs, and current status—which your AI client reads and presents to you.
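
At the protocol level, step 2 is a JSON-RPC 2.0 request using MCP's `tools/call` method. The sketch below only builds that message; your MCP client normally does this for you. The helper name `build_tool_call` and the `source_id` argument are assumptions for illustration, not part of this server's documented interface.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC requests carry a unique id per call


def build_tool_call(tool_name: str, arguments: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }


request = build_tool_call("list_sources")
print(json.dumps(request, indent=2))

# `source_id` is a hypothetical argument name; check the tool's input schema.
detail_request = build_tool_call("get_source", {"source_id": "src_123"})
```

The server's reply comes back as a JSON-RPC response with the same `id`, which the client hands to the model as the tool result.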

## Frequently Asked Questions

**How do I use list_sources to check my data inputs?**
Run `list_sources` first. This gives you a complete catalog of all configured data intake points (e.g., 'Web Analytics,' 'Mobile App'). You then use `get_source` if you need deep metrics on a specific source.

**What is the difference between list_connections and list_sources?**
`list_sources` tells you what data *can* enter the system. `list_connections` tells you which sources are actively mapped to send data out to a destination, confirming the actual flow.

**Does list_audiences tell me if my segment is working?**
It tells you *that* an audience exists and lists its details. To know if it's actively syncing and healthy, check `list_connections` to ensure the source powering that audience is connected.

**Do I need list_tracking_plans for every data update?**
Yes. If you are changing or adding event types, running `list_tracking_plans` lets you validate the required schema first. This prevents bad data from hitting your warehouse.

**When I use `list_sources`, what security protocols govern how my data streams are viewed?**
The connection relies on OAuth or API Key authentication. Your AI client executes the call using credentials you provide, ensuring that only authorized requests can list source details. We never expose raw tokens; your agent interacts with encrypted endpoints.

**If I run `list_connections`, how do I identify a connection that is experiencing data drops or latency?**
The output provides real-time metrics on event throughput and error counts. Look for 'dropped events' or high 'latency averages'; these numbers tell you exactly where the pipeline stalls. Zero drops means a healthy flow.

**Does running `get_destination` limit how many data points I can audit in one request?**
No, the tool handles large datasets efficiently by paginating results automatically. You don't hit a hard cap; however, remember that massive volume queries might slow down your agent's response time.

**When I check `list_tracking_plans`, does it confirm if my source is capturing custom event types?**
It confirms the schema required for tracking. To verify specific custom events, you must cross-reference the listed plan details against your internal data dictionary. It shows *what* should be captured.

**Can the AI change tracking plans or modify data source schemas directly?**
No. This integration is strictly limited to read-only operations: it retrieves and reads data, and it cannot change tracking plans or source schemas.

**Can the AI list audience segments and their sync status?**
Yes. Use `list_audiences` to retrieve all configured audience segments, including their names and associated destination syncs. This is useful for verifying remarketing pipelines.

**Which destination types does the integration support?**
The integration queries any destination configured in your RudderStack workspace — data warehouses, analytics platforms, marketing tools, and cloud storage. Use `list_destinations` to see all active endpoints.