# PagerDuty MCP

> PagerDuty connects your AI agent directly to mission-critical incident management. Use it to list active incidents, check service health configurations, and manage on-call rotations in real time. Trigger alerts, acknowledge outages, or resolve issues without leaving your IDE—all via natural conversation.

## Overview
- **Category:** industry-titans
- **Price:** Free
- **Tags:** incident-management, on-call-scheduling, alerting, service-monitoring, escalation-policy, devops

## Description

This server connects your AI agent straight into PagerDuty. It gives your client hands-on control over incident management, so you don't have to jump between dashboards just to fix something. You can manage services and respond to outages right from where you're coding.

**Incident Management Tools**

The `list_incidents` tool pulls up your incidents and lets you filter the rundown by status: everything triggered, what's acknowledged, or what's already resolved. If you need the deep dive on one specific event, use `get_incident` with its unique ID; it'll hand back the full details: the current status, the affected service, and the severity. Don't wanna wait for an alert? You can proactively log a new outage using `create_incident`, giving it an email address, the service ID, and a title. When an incident needs attention, use `update_incident` to change its state: acknowledge it as a known problem, resolve it once you've fixed the underlying mess, or reassign it to someone else.
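For a feel of what these calls look like on the wire, here is a minimal sketch of the JSON-RPC `tools/call` requests an MCP client might send for the incident tools. The tool names come from this page; the argument keys (`status`, `email`, `service_id`, `title`, `incident_id`) and the placeholder values are assumptions for illustration, so check the server's tool schema for the real names.

```python
import json
from itertools import count

_request_ids = count(1)  # each JSON-RPC request needs its own id


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Wrap a tool invocation in an MCP `tools/call` JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument keys below are hypothetical; only the tool names are from this page.
list_triggered = build_tool_call("list_incidents", {"status": "triggered"})
create = build_tool_call(
    "create_incident",
    {
        "email": "you@example.com",  # placeholder requester email
        "service_id": "PSERVICE1",   # placeholder service ID
        "title": "Checkout latency above threshold",
    },
)
acknowledge = build_tool_call(
    "update_incident",
    {"incident_id": "P8K2LMN", "status": "acknowledged"},
)

print(json.dumps(acknowledge, indent=2))
```

In practice your AI client builds these requests for you; the sketch only shows the shape of what gets sent.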

**Service Monitoring & Discovery**

To see what services are even getting monitored, run `list_services`; it gives you the complete catalog of every single thing PagerDuty tracks. Once you know which service is giving you grief, `get_service` grabs its full configuration and current health status. It shows you exactly how that service is set up, including its integrations and auto-resolve timing.

**On-Call and Team Structure**

To figure out who's supposed to be fixing this, the server has a few tools for people. `list_oncalls` tells you who's on call right now, primary and secondary, across all active schedules. You can also see all available rotation plans using `list_schedules`, which shows every schedule name and the teams tied to it. If you need details on the actual users, run `list_users`; that outputs a list of everyone configured in the PagerDuty account. For a deep dive into one specific person, use `get_user` to pull detailed profile information, including all their contact methods.
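As a rough illustration of what an agent might do with `list_oncalls` output, the sketch below sorts entries by escalation level and formats one line per person. The record shape (`user`, `escalation_level`, `escalation_policy`, `end`) is an assumption, not the server's documented response format.

```python
from typing import TypedDict


class OnCallEntry(TypedDict):
    # Hypothetical fields; the real list_oncalls payload may differ.
    user: str
    escalation_level: int
    escalation_policy: str
    end: str  # end of the coverage window


def summarize_oncalls(entries: list[OnCallEntry]) -> list[str]:
    """Return one line per on-call entry, lowest escalation level first."""
    ordered = sorted(entries, key=lambda e: e["escalation_level"])
    return [
        f"Level {e['escalation_level']}: {e['user']} "
        f"({e['escalation_policy']}, until {e['end']})"
        for e in ordered
    ]


# Sample data standing in for a real list_oncalls result.
sample: list[OnCallEntry] = [
    {"user": "marcus.ops@company.com", "escalation_level": 2,
     "escalation_policy": "Platform Critical", "end": "Apr 14, 9:00 AM UTC"},
    {"user": "sarah.chen@company.com", "escalation_level": 1,
     "escalation_policy": "Platform Critical", "end": "Apr 7, 9:00 AM UTC"},
]

for line in summarize_oncalls(sample):
    print(line)
```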

**Workflow and Escalation Logic**

You can map out how alerts flow through the company's structure using `list_escalation_policies`. This tool displays every defined escalation chain, so you understand the routing rules and the timeout at each level. To match the names in a chain to actual accounts, cross-reference the output of `list_users`.

**Summary Action Flow**

Your agent can take this information and act on it. It'll list the services you need to check, then pull up the escalation rules that apply to them. If things look bad, your agent knows who's on call from `list_oncalls`, and it can log a brand-new incident using `create_incident`. It can track all open incidents by calling `list_incidents` or get granular status updates by fetching details with `get_incident`. When the crisis is over, it uses `update_incident` to mark the incident resolved. It's everything in one place.
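The flow above can be sketched as a plain function that takes a tool-calling callback. Everything here beyond the tool names is an assumption: the argument keys, the `id` field on the `create_incident` result, and the fake caller that stands in for a real MCP client so the sketch runs without a PagerDuty account.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict], Any]  # (tool name, arguments) -> result


def open_and_resolve(call_tool: ToolCaller, service_id: str,
                     email: str, title: str) -> str:
    """Walk the flow: inspect, create an incident, check it, resolve it."""
    call_tool("list_services", {})
    call_tool("list_escalation_policies", {})
    call_tool("list_oncalls", {})
    created = call_tool(
        "create_incident",
        {"email": email, "service_id": service_id, "title": title},
    )
    incident_id = created["id"]  # assumed: the result carries the new ID
    call_tool("get_incident", {"incident_id": incident_id})
    call_tool(
        "update_incident",
        {"incident_id": incident_id, "email": email, "status": "resolved"},
    )
    return incident_id


# Fake tool caller that records the call order instead of hitting PagerDuty.
calls: list[str] = []


def fake_call_tool(name: str, arguments: dict) -> Any:
    calls.append(name)
    return {"id": "PFAKE01"} if name == "create_incident" else {}


resolved_id = open_and_resolve(
    fake_call_tool, "PSERVICE1", "you@example.com", "Test outage"
)
print(resolved_id, calls)
```

Passing the caller in as a parameter keeps the flow testable: swap `fake_call_tool` for whatever function your MCP client exposes for invoking tools.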

## Tools

### create_incident
Logs a new service incident using the provided email, service ID, and title.

### get_incident
Fetches full details about an existing PagerDuty incident via its unique ID.

### get_service
Retrieves the full configuration and status for a specific monitored service.

### get_user
Gets detailed profile information, including contact methods, for a specified user account.

### list_escalation_policies
Lists all defined escalation chains to understand incident routing rules.

### list_incidents
Retrieves a list of incidents, allowing optional filtering by their status (triggered, acknowledged, resolved).

### list_oncalls
Shows who is currently assigned as on-call across all active schedules.

### list_schedules
Lists all available rotation schedules and their associated teams.

### list_services
Retrieves a complete catalog of every service monitored by PagerDuty.

### list_users
Outputs a list of all users configured within the PagerDuty account.

### update_incident
Updates an existing incident: acknowledge it, resolve it, or reassign it programmatically.

## Prompt Examples

**Prompt:** 
```
Show me all triggered incidents right now.
```

**Response:** 
```
There are 3 triggered incidents:

1. 🔴 **P8K2LMN** — Database connection pool exhausted (Production API)
2. 🔴 **P3J9FGH** — SSL certificate expires in 24h (CDN Edge)
3. 🟡 **P1A4BCD** — High memory usage on worker-03 (Background Jobs)

Would you like me to acknowledge any of these?
```

**Prompt:** 
```
Who is on-call for the Platform team right now?
```

**Response:** 
```
Current on-call for Platform:

- **Level 1**: sarah.chen@company.com (until Apr 7, 9:00 AM UTC)
- **Level 2**: marcus.ops@company.com (until Apr 14, 9:00 AM UTC)

Escalation Policy: Platform Critical (30 min timeout per level).
```

**Prompt:** 
```
Acknowledge incident P8K2LMN and show me the service details.
```

**Response:** 
```
✅ Incident **P8K2LMN** acknowledged successfully.

**Service: Production API**
- Status: Active (Critical)
- Escalation: Platform Critical (3 levels)
- Integrations: Datadog, Sentry, Slack
- Auto-resolve: 4 hours

Would you like me to investigate the root cause or resolve this incident?
```

## Capabilities

### List All Monitored Services
Retrieves a list of every service monitored by PagerDuty.

### Check On-Call Personnel
Identifies the current primary and secondary contacts based on active schedules.

### Get Incident Details
Retrieves all information (status, service, severity) for a specific incident ID.

### Create New Outage Record
Automatically logs and creates a new incident on a specified service.

### Update Incident Status
Changes an existing incident's state (e.g., acknowledge, resolve) programmatically.

### Review Escalation Chains
Displays the defined path and timeouts for how an alert routes through multiple teams.

## Use Cases

### The Critical Outage
An alert hits: 'High latency on Auth API.' The agent runs `list_incidents` and finds the ID. It then calls `get_service` on the affected service to see its integrations (Datadog, Sentry). Finally, it uses `update_incident` to acknowledge the issue, so everyone watching the incident can see it's being handled.

### Onboarding a New Team Member
A new engineer needs to know the current on-call rotation. They ask their agent, which runs `list_oncalls`. The response shows exactly who is covering the platform and when they hand off. That spares them a trawl through internal wikis.

### Pre-Mortem Planning
Before a major deployment, an SRE needs to verify service dependencies. They run `list_services` to get the full catalog. Then they use `get_service` on three key components to audit their respective escalation policies and required integrations.

### Investigating User Roles
An incident requires contacting a specific user role, but the agent doesn't know the ID. The engineer runs `list_users`, finds the correct profile name, and then uses `get_user` to get the necessary contact methods.

## Benefits

- **Immediate Status Checks:** Use `list_incidents` to pull a list of currently triggered alerts. You don't have to manually check the dashboard; your agent pulls the data instantly.
- **Know Who To Call:** The `list_oncalls` tool gives you real-time coverage visibility. Instead of emailing around, you see exactly who is on duty and until when.
- **Control Incident State:** When an alert pops up, use `update_incident` to acknowledge it or mark it resolved directly through conversation. This is critical for maintaining the incident timeline.
- **Deep Service Context:** Calling `get_service` provides more than just a name; you get deep configuration data, including integrations and auto-resolve timings, which informs your fix strategy.
- **Map Out Failures:** Reviewing `list_escalation_policies` helps you predict how an incident will flow. You see the full chain—Level 1 to Level 2—before it actually breaks.
- **Fast Triage Cycle:** The combination of `list_users` and `get_user` lets your agent quickly pull contact details for key personnel involved in a specific service or incident.

## How It Works

The bottom line: your AI client calls this server's tools to read, and where needed change, live PagerDuty data.

1. Subscribe to this server, then enter your PagerDuty REST API Key under Configuration > API Access.
2. Your AI client calls the `list_services` tool to get a list of all monitored services and their IDs.
3. You pass an incident ID to `get_incident`, or a service ID to `create_incident`, to start managing the event.
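Here is a sketch of how an agent might bridge steps 2 and 3: look up a service's ID in a `list_services` result, then build the `create_incident` arguments from it. The `id` and `name` keys, the sample IDs, and the argument names are assumptions for illustration only.

```python
def find_service_id(services: list[dict], name: str) -> str:
    """Return the ID of the single service whose name matches, ignoring case."""
    matches = [s["id"] for s in services if s["name"].lower() == name.lower()]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one service named {name!r}, found {len(matches)}"
        )
    return matches[0]


# Stand-in for a list_services result; the `id`/`name` keys are assumed.
services = [
    {"id": "PSVC001", "name": "Production API"},
    {"id": "PSVC002", "name": "CDN Edge"},
]

service_id = find_service_id(services, "production api")
create_args = {
    "email": "you@example.com",  # placeholder
    "service_id": service_id,
    "title": "Database connection pool exhausted",
}
print(create_args)
```

Failing loudly on zero or multiple matches is a deliberate choice here: opening an incident on the wrong service pages the wrong people.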

## Frequently Asked Questions

**How do I check if an incident already exists using the PagerDuty MCP Server?**
You use the `list_incidents` tool. This lets you search across services and filter by status (triggered, acknowledged, or resolved) without guessing IDs or digging through dashboards.

**Can I acknowledge an alert using the PagerDuty MCP Server?**
Yes, use `update_incident`. This tool allows your agent to change the incident's state (acknowledge, resolve) programmatically. It keeps the official record updated automatically.

**What is the best way to find out who is currently on call using the PagerDuty MCP Server?**
Run `list_oncalls`. This tool pulls real-time data showing current coverage levels, including which team member is active and until what time.

**Does the PagerDuty MCP Server let me see service integrations?**
Yes. Calling `get_service` retrieves detailed configuration data for a specific service, including its list of integrated tools (Datadog, Sentry, etc.).

**How do I programmatically create a brand-new incident using the `create_incident` tool in the PagerDuty MCP Server?**
You pass your email, the service ID, and an incident title. The server creates the incident right away, so your agent can trigger one without any manual steps.

**Can I use `get_user` in the PagerDuty MCP Server to check a team member's specific contact details?**
Yes, you get detailed information on any user profile. This includes their notification rules and various contact methods. It’s useful for coordinating follow-up comms outside of incident channels.

**How do I review the entire alert routing logic using `list_escalation_policies` with the PagerDuty MCP Server?**
The tool lists every defined escalation policy. You see exactly how incidents route through teams and what time limits are set for each level of response.

**Does the PagerDuty MCP Server allow me to reassign an existing incident using `update_incident`?**
Yes, you can programmatically change which user owns an active incident. This is critical when shifting ownership during a complex resolution or investigation workflow.

**Can I acknowledge and resolve incidents directly from my AI agent?**
Yes! Use the `update_incident` tool with the incident ID and your PagerDuty email. Set status to `acknowledged` or `resolved` to change the incident state instantly without opening the PagerDuty dashboard.

**How do I find out who is on-call right now?**
Run the `list_oncalls` tool. It returns every user currently on-call across all schedules and escalation levels, showing their name, escalation policy, and coverage window.

**Can I create incidents programmatically for testing or escalation?**
Absolutely. Use `create_incident` with your PagerDuty email, the target service ID, a descriptive title, and optionally set urgency to `high` or `low`. The incident will immediately trigger according to the service's escalation policy.
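
A minimal sketch of building those `create_incident` arguments with the optional urgency checked up front. The `high` and `low` values come from the answer above; the argument keys and the helper name `create_incident_args` are hypothetical, not part of the server.

```python
from typing import Optional

ALLOWED_URGENCY = {"high", "low"}  # the two values named in the answer above


def create_incident_args(email: str, service_id: str, title: str,
                         urgency: Optional[str] = None) -> dict:
    """Build create_incident arguments, rejecting unknown urgency values."""
    if not title.strip():
        raise ValueError("title must not be empty")
    args = {"email": email, "service_id": service_id, "title": title.strip()}
    if urgency is not None:
        if urgency not in ALLOWED_URGENCY:
            raise ValueError(f"urgency must be one of {sorted(ALLOWED_URGENCY)}")
        args["urgency"] = urgency  # only sent when explicitly requested
    return args


print(create_incident_args("you@example.com", "PSVC001", "Load test incident", "low"))
```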