# Better Stack MCP for AI Agents

> Better Stack MCP brings your infrastructure monitoring, incident management, and on-call coordination into a single conversation. You can query your service health data directly from your AI client: check whether an HTTP endpoint is down, list active incidents, or find out who is on call this week.

## Overview
- **Category:** ship-it
- **Price:** Free
- **Tags:** uptime-monitoring, incident-response, sre, alerting, heartbeat-monitoring, infrastructure-monitoring

## Description

Forget clicking through multiple dashboards just to check if your API gateway is up or who should be paged at 3 AM. This MCP connects your Better Stack account, letting you manage complex infrastructure monitoring and incident details using only natural conversation.

It lets you list all your HTTP monitors, verify the status of critical endpoints, and pull historical incident records for post-mortems. Need to know who's responsible for system health this month? You can check current on-call schedules right away. Checking cron jobs through heartbeat monitors or listing your public status pages takes a single request. All these operational checks run through your AI client, giving you real-time context without leaving your workspace. By connecting it through the Vinkius catalog, you get access to this monitoring layer alongside thousands of other tools.

It’s about turning a complex web of dashboards into simple questions.

## Tools

### create_monitor
Set up a brand new uptime monitor for a service endpoint.

### delete_monitor
Delete an existing uptime monitor that you no longer need.

### get_incident
Fetch the details of a single incident by its ID.

### get_monitor
Retrieve the full configuration details of a single monitor.

### list_heartbeats
List your heartbeat monitors and the status of the recurring tasks or cron jobs they track.

### list_incidents
View a summary list of recent outages and service disruptions.

### list_monitors
Pull a full inventory listing of every uptime monitor you run.

### list_on_calls
See the current and upcoming on-call rotation schedule for your team.

### list_status_pages
List all public status pages associated with your services.

### update_monitor
Modify the settings, endpoints, or frequency of an existing monitor.
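
Under the hood, an AI client invokes any of the tools above through the Model Context Protocol's `tools/call` method, which is a JSON-RPC 2.0 request. The sketch below shows roughly what such a message looks like. The tool names come from this page, but the argument name `incident_id` is a hypothetical placeholder, since the actual input schema of each tool is not documented here.

```python
import json
from itertools import count
from typing import Optional

# JSON-RPC request ids should be unique within a session.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: Optional[dict] = None) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }
    return json.dumps(message)


# Listing monitors is assumed to need no arguments.
print(build_tool_call("list_monitors"))

# "incident_id" is a hypothetical argument name, used for illustration only.
print(build_tool_call("get_incident", {"incident_id": "12345"}))
```

In practice your AI client builds and sends these messages for you; the sketch only illustrates what a prompt is translated into.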

## Prompt Examples

**Prompt:** 
```
What services are currently failing and who should I call about them?
```

**Response:** 
```
**🚨 ALERT: Service Outage Detected 🚨**

*   **Primary API Gateway:** Down (Last check: 5 minutes ago).
*   **User Profile Endpoint:** Degraded performance.

I've checked the on-call schedule. For API issues, **Sarah Miller** is currently responsible until Friday. She can be contacted via Slack for immediate details.
```

**Prompt:** 
```
Show me all my monitors and tell me which ones need updating.
```

**Response:** 
```
Here's an inventory of your 15 active monitors:

| Monitor Name | Status | Last Check | Action Needed? |
| :--- | :--- | :--- | :--- |
| Auth Gateway | OK | 2 mins ago | No |
| Legacy API Ping | **DOWN** | 4 hours ago | Yes (Needs update) |
| Billing Cron Job | OK | Just now | No |

Please review the 'Legacy API Ping' monitor; its endpoint appears outdated and needs manual adjustment.
```

**Prompt:** 
```
What was the most recent major incident, and what were the root causes?
```

**Response:** 
```
**Incident Report: Core Service Disruption (2 days ago)**

*   **Status:** Resolved.
*   **Impacted Services:** User Authentication, Payment Processing.
*   **Root Cause:** Database connection pool exhaustion, triggered by an unexpected traffic spike during peak hours.

I recommend pulling the full record with `get_incident` for details on mitigation steps.
```

## Capabilities

### View and manage service uptime monitors
Create, read, update, or delete specific HTTP, Ping, or Keyword monitors to ensure service availability.

### Review current system incidents
Get real-time updates on active outages or pull detailed reports on past incidents for root cause analysis.

### Check team rotation schedules
See who is currently on call, along with the rotation schedule and when each person's coverage ends.

### Verify scheduled job health
List heartbeat monitors to track whether recurring tasks or cron jobs are checking in on schedule.

### Check public service status pages
List and verify the current published status page details without logging into a separate dashboard.

## Use Cases

### Finding out why a core service is down
An engineer asks the agent, 'Show me all HTTP endpoints that are failing.' The agent uses `list_monitors` and identifies two critical services that are down. This saves minutes of clicking through multiple dashboard tabs during an active outage.
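
As a rough illustration of the filtering step in this use case, the sketch below picks out failing monitors from a monitor list. The `name` and `status` fields and the `"down"` value are a simplified, hypothetical shape, not the actual output format of `list_monitors`.

```python
def failing_monitors(monitors: list[dict]) -> list[str]:
    """Return the names of monitors whose status is 'down'."""
    return [m["name"] for m in monitors if m.get("status") == "down"]


# Hypothetical, simplified result of a `list_monitors` call.
sample = [
    {"name": "Auth Gateway", "status": "up"},
    {"name": "Primary API Gateway", "status": "down"},
    {"name": "Legacy API Ping", "status": "down"},
]

print(failing_monitors(sample))  # ['Primary API Gateway', 'Legacy API Ping']
```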

### Determining who to call after hours
A developer asks, 'Who is on call for the database team this week?' The agent uses `list_on_calls` and returns the on-call engineer's name and rotation dates. This prevents unnecessary wake-up calls and makes sure the escalation reaches the right person.

### Investigating a sporadic failure
A manager asks, 'What happened with the primary API last month?' The agent runs `list_incidents` and returns the historical incident records, which help pinpoint the root cause for a management report.

### Verifying scheduled cleanup tasks
An ops engineer needs to confirm whether a nightly backup job ran successfully. They ask about heartbeats, and the agent uses `list_heartbeats` to confirm that the last ping arrived within the expected window.
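
The "expected window" check can be sketched as follows. This is a minimal illustration that assumes a heartbeat is defined by a period and a grace interval; the function and parameter names are hypothetical and do not reflect Better Stack's internal logic.

```python
from datetime import datetime, timedelta, timezone


def heartbeat_on_schedule(last_ping: datetime, period: timedelta,
                          grace: timedelta, now: datetime) -> bool:
    """True if the last ping arrived within one period plus the grace interval."""
    return now - last_ping <= period + grace


now = datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc)
nightly = timedelta(hours=24)
grace = timedelta(minutes=30)

# The backup pinged 15 minutes ago: on schedule.
recent = datetime(2024, 1, 2, 2, 45, tzinfo=timezone.utc)
print(heartbeat_on_schedule(recent, nightly, grace, now))  # True

# The last ping was two nights ago: last night's run was missed.
stale = datetime(2023, 12, 31, 2, 45, tzinfo=timezone.utc)
print(heartbeat_on_schedule(stale, nightly, grace, now))  # False
```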

## Benefits

- Instant Incident Visibility: Get immediate, conversational updates on active incidents or historical reports using the `list_incidents` tool. No more manual dashboard dives.
- Full Monitor Control: Manage your entire monitoring suite via text prompts, from listing all monitors to creating a new one with `create_monitor` or changing settings with `update_monitor`.
- Clear Accountability: The `list_on_calls` tool tells you exactly who is responsible for system health, eliminating confusion during an outage and speeding up response times.
- Background Task Health Check: Use `list_heartbeats` to monitor cron jobs. You can verify if recurring tasks are running on schedule without touching the scheduler UI.
- Simplified Status Checks: The MCP allows you to check public status pages (`list_status_pages`) and system monitors in one go, consolidating your operational visibility.

## How It Works

Setup takes three steps, after which you have conversational control over your infrastructure health data.

1. Subscribe to this MCP in your preferred AI client.
2. Enter your Better Stack API token to authenticate access.
3. Ask your agent specific questions, like 'List all monitors that are currently down,' and it pulls the live data.
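
For context on step 2, the token you enter is presumably what authorizes requests to Better Stack on your behalf. The sketch below builds, without sending, the kind of authenticated request that might sit behind a `list_monitors` call. The endpoint URL and the Bearer header scheme are assumptions based on Better Stack's public Uptime API, not a description of how this MCP is implemented, and `build_monitors_request` is a hypothetical helper.

```python
import urllib.request

# Assumed endpoint: the monitors collection of Better Stack's Uptime API.
MONITORS_URL = "https://uptime.betterstack.com/api/v2/monitors"


def build_monitors_request(api_token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request for the monitor list."""
    if not api_token:
        raise ValueError("A Better Stack API token is required")
    return urllib.request.Request(
        MONITORS_URL,
        headers={"Authorization": f"Bearer {api_token}"},
        method="GET",
    )


# "YOUR_API_TOKEN" is a placeholder, not a real credential.
request = build_monitors_request("YOUR_API_TOKEN")
print(request.get_method(), request.full_url)
```

Treat the token like any other secret: keep it out of prompts and source control, and supply it only through your AI client's configuration.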

## Frequently Asked Questions

**How does Better Stack MCP help me monitor my infrastructure uptime?**
It gives you conversational access to all your monitors. You can ask about specific endpoints, see a full list of services, or check if any are currently down without ever logging into the monitoring dashboard itself.

**Can Better Stack MCP tell me who is on call right now?**
Yes, it uses `list_on_calls` to show you current rotations and schedules. This means you never have to guess who owns an issue after hours; the agent tells you immediately.

**What kind of incidents can Better Stack MCP help me track?**
You can pull both live alerts on active outages and detailed historical reports for past disruptions. This is useful when reporting to management or doing a post-mortem analysis.

**Can Better Stack MCP check the status of my cron jobs?**
Yes. The MCP can list heartbeats, letting you confirm that your recurring background tasks are running on schedule and that their pings are coming through as expected.

**How do I use the Better Stack MCP in my daily DevOps workflow?**
You start by asking questions about 'status' or 'schedules.' Instead of opening five tabs, you ask your agent to list monitors, check incidents, and verify on-call schedules all in one prompt.

**Is Better Stack MCP better than looking at my dashboard manually?**
For quick checks, it is usually faster. The MCP gives you a consolidated summary of the data you need right now, saving you the minutes spent navigating multiple views just to get an answer.