# Gatus Health Dashboard MCP

> Gatus Health Dashboard MCP lets you monitor your service infrastructure through natural conversation. List every monitored endpoint, check its real-time health status, and pull performance metrics, all without leaving your AI agent. It turns dashboard checking into simple questions.

## Overview
- **Category:** cloud-infrastructure
- **Price:** Free
- **Tags:** service-health, uptime-monitoring, infrastructure-alerts, real-time-status, dashboard

## Description

You don't need to open a dozen tabs or run manual scripts just to see if your services are up. By connecting this MCP, your AI client becomes an always-available Site Reliability Engineering assistant. You can ask it to list all monitored endpoints and get their current health status across the whole system. Need to know why a service is slow? Ask for performance statistics on that specific endpoint, or pull raw metrics data suitable for deep analysis. This means you stop reacting to alerts and start asking questions. Vinkius keeps setup simple: connect once from your preferred AI client and get instant visibility into system health.

It's about getting immediate answers: What is the status? How fast was it last time? Did it fail five minutes ago? Your agent handles the complexity, giving you direct insight into service availability when you need it most.

## Tools

### get_endpoint_health
Retrieves the current health status and recent results for one specific service endpoint.

### get_endpoint_stats
Pulls key performance statistics, like average latency, for a single designated endpoint.

### list_endpoints
Provides a comprehensive list of all monitored services and their current operational status across the entire system.

### get_metrics
Fetches raw, technical performance metrics data from Gatus for custom reporting and deep analysis.

## Prompt Examples

**Prompt:** 
```
List all monitored endpoints and their current status.
```

**Response:** 
```
I've retrieved the status for 8 endpoints. 7 are 'Healthy' and 1 ('payment-gateway') is currently 'Unhealthy'. Would you like me to investigate the 'payment-gateway' results?
```

**Prompt:** 
```
What is the health status of the 'core-api' endpoint?
```

**Response:** 
```
The 'core-api' endpoint is currently Healthy. The last check was successful with a 200 OK response and a latency of 45ms. Recent history shows 100% uptime over the last 24 hours.
```

**Prompt:** 
```
Show me the performance statistics for 'database-service'.
```

**Response:** 
```
Fetching stats for 'database-service'... Average latency is 12ms, with a 99th percentile of 85ms. There have been 0 failures in the last 500 requests.
```

## Capabilities

### Check overall system health
List every monitored endpoint and instantly see its current status across your entire infrastructure.
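
As a rough illustration of the kind of summary an agent could build from a `list_endpoints` result, here is a minimal sketch. The record shape (`name`, `healthy`) is an assumption made for this example; the real tool payload may use different field names.

```python
from collections import Counter

# Hypothetical records; the real list_endpoints payload may look different.
endpoints = [
    {"name": "core-api", "healthy": True},
    {"name": "database-service", "healthy": True},
    {"name": "payment-gateway", "healthy": False},
]


def summarize(records):
    """Count healthy and unhealthy endpoints and collect the unhealthy names."""
    counts = Counter("healthy" if r["healthy"] else "unhealthy" for r in records)
    failing = sorted(r["name"] for r in records if not r["healthy"])
    return {
        "healthy": counts["healthy"],
        "unhealthy": counts["unhealthy"],
        "failing": failing,
    }


summary = summarize(endpoints)
print(summary)
```

The `failing` list is what lets the agent follow up with a question such as whether to investigate a specific endpoint.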

### Inspect specific endpoint history
Drill down into a service to view recent results, historical statuses, and check for patterns of failure or success using its unique key.
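
If you need to work out an endpoint's key yourself, the sketch below may help. It assumes the `<group>_<name>` convention described in the Gatus documentation, where certain special characters are replaced with `-`. The exact character set used here is an assumption, and `endpoint_key` is a hypothetical helper, so verify both against your Gatus version.

```python
def endpoint_key(group: str, name: str, special: str = " /_,.#") -> str:
    """Build '<group>_<name>' after replacing special characters with '-'.

    The default `special` set is an assumption, not a confirmed list.
    """
    table = str.maketrans({ch: "-" for ch in special})
    return f"{group.translate(table)}_{name.translate(table)}"


print(endpoint_key("core", "payment gateway"))
```

In practice you can also ask the agent to list endpoints first and reuse whatever key it reports.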

### Analyze performance metrics
Get detailed statistics on an endpoint's speed and reliability, identifying potential latency issues before they become outages.
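
To make "average latency" and "99th percentile" concrete, here is a small sketch that computes both from a list of response times. The sample values are made up, and the nearest-rank percentile method is one common choice, not necessarily what Gatus or this MCP uses.

```python
import math
from statistics import mean


def latency_stats(durations_ms):
    """Return the mean and nearest-rank 99th percentile of latency samples."""
    if not durations_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(durations_ms)
    rank = math.ceil(99 * len(ordered) / 100)  # nearest-rank method, 1-based
    return {"avg_ms": mean(ordered), "p99_ms": ordered[rank - 1]}


samples = [10, 12, 11, 13, 85, 9, 12, 10, 11, 12]  # made-up latencies in ms
print(latency_stats(samples))
```

A single slow request barely moves the average but dominates the 99th percentile, which is why both numbers are worth asking for.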

### Retrieve raw data streams
Access Prometheus-compatible metrics for highly customized reporting or deep technical analysis outside the standard dashboard view.
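
If you want to post-process the raw output yourself, a simplified parser for the Prometheus text format could look like the sketch below. The payload, metric name, and label names are illustrative assumptions, and the parser only handles plain sample lines; for production use, a maintained Prometheus client library is a safer choice.

```python
import re

# Matches simple sample lines such as: name{label="value"} 12 [timestamp]
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical payload; metric and label names are illustrative only.
payload = '''
# TYPE gatus_results_total counter
gatus_results_total{key="core_payment-gateway",success="false"} 3
gatus_results_total{key="core_payment-gateway",success="true"} 497
'''

failures = sum(
    value
    for name, labels, value in parse_metrics(payload)
    if name == "gatus_results_total" and labels.get("success") == "false"
)
print(failures)
```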

## Use Cases

### Investigating an intermittent API outage
The Ops Engineer sees a ticket about fluctuating API availability. Instead of checking three different monitoring panels, they prompt their agent: 'What are the status and performance stats for the user-profile service?' The agent uses `get_endpoint_health` and then `get_endpoint_stats`, providing both current failure data and historical latency metrics in one response.

### Pre-launch system readiness check
The QA lead needs to confirm every single dependency is green before deployment. They ask the agent to 'List all endpoints.' The agent uses `list_endpoints`, giving a definitive, real-time status report on the entire connected infrastructure.

### Debugging slow payment processing
The Product Owner notices payment transactions are slowing down. They ask the agent to check performance for 'payment-gateway.' The agent uses `get_endpoint_stats`, confirming whether the average latency is spiking and showing which statistic changed.

### Building custom compliance reports
The Security team needs raw data on all service availability over a period for auditing. They ask the agent to 'Export Prometheus metrics for core services.' The agent executes `get_metrics`, delivering the structured, technical payload required for external analysis.

## Benefits

- Instant Visibility: Instead of manually navigating a dashboard to see if Service X is up, ask your agent. It uses the `list_endpoints` tool to give you an immediate summary of all monitored services.
- Pinpoint Failures: If something isn't working, don't guess. Use `get_endpoint_health` to check a specific service and see its most recent failure details instantly.
- Diagnose Slowness: When latency spikes, use `get_endpoint_stats`. This tool shows you the average speed and 99th percentile data so you know exactly where performance is dropping.
- Deep Dive Analysis: Need to write a report or build custom tooling? Use `get_metrics` to pull raw, Prometheus-compatible data that goes far beyond simple status checks.
- Operational Efficiency: Your AI agent acts as the ultimate SRE assistant, eliminating the need for multiple context switches between dashboards and terminals.

## How It Works

You get system health reports without opening a web dashboard or running terminal commands.

1. Subscribe to this MCP and provide your specific Gatus instance URL.
2. Connect your AI agent to Vinkius. The connection authenticates your access to the monitoring service.
3. Ask your agent a question, like 'What is the status of the payment gateway?' Your agent calls the necessary tools and returns the structured data.
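
Behind step 3, an MCP client sends tool invocations as JSON-RPC 2.0 messages with the method `tools/call`. The sketch below builds such a request by hand to show roughly what travels over the wire. The `key` argument name is an assumption made for illustration; the real input schema for `get_endpoint_health` may differ.

```python
import json
from itertools import count

_ids = count(1)  # simple source of JSON-RPC request ids


def tool_call(name, arguments=None):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# "key" is a hypothetical argument name; check the tool's published schema.
request = tool_call("get_endpoint_health", {"key": "core_payment-gateway"})
print(json.dumps(request, indent=2))
```

You normally never write this yourself, since your AI client constructs the request from your question and the tool's schema.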

## Frequently Asked Questions

**How do I list all monitored endpoints using Gatus Health Dashboard MCP?**
You use the `list_endpoints` tool. This instantly gives you a comprehensive, real-time roster of every service connected to your monitoring system and its current status.

**Can I check performance statistics with get_endpoint_stats?**
Yes, using `get_endpoint_stats`, you can retrieve detailed metrics for a single endpoint. This shows more than just 'up' or 'down,' giving you average latency and reliability data.

**What is the difference between get_endpoint_health and get_metrics?**
Use `get_endpoint_health` for the current status and recent results of one specific endpoint. Use `get_metrics` when you need raw, structured Prometheus-compatible data for deep technical analysis or custom reporting.

**Does Gatus Health Dashboard MCP work with multiple services?**
Absolutely. The entire MCP is designed to monitor your full infrastructure stack. You can ask about a mix of services, and the agent will use multiple tools to gather all the necessary data.

**Can I check an endpoint that failed last week using get_endpoint_health?**
`get_endpoint_health` focuses on recent results, so an older failure may fall outside what it returns. For historical trends or deeper analysis of past failures, use `get_endpoint_stats`, which reports performance statistics over time.