# Nomad MCP

> HashiCorp Nomad MCP connects your AI client directly to your cluster. This lets you manage complex workloads, check node health, and track deployments using natural conversation. You stop clicking through dashboards; you just ask your agent what's going on with your infrastructure.

## Overview
- **Category:** loved-by-devs
- **Price:** Free
- **Tags:** workload-orchestration, container-management, cluster-monitoring, deployment, infrastructure-as-code

## Description

Your production cluster is a constantly moving target, and you need visibility into it. Instead of juggling the Nomad UI to find out whether an allocation succeeded or why a node is reporting degraded health, this MCP lets you talk to your orchestration layer. Your AI client interprets your request and runs the necessary checks against your live environment. You can list all running jobs and see their configurations, monitor resource usage across client nodes, or manage rollbacks by failing an underperforming deployment.

It's about operational control without context switching. By connecting to this MCP through Vinkius, you give your agent a single pane of glass for infrastructure management, whether it's from Cursor in your IDE or Claude on your desktop. You get direct access to the current state of every job and every service.

## Tools

### fail_deployment
Marks a deployment as failed. If the job is configured to auto-revert, this also triggers an automated rollback to the last stable version.

### get_allocation
Retrieves specific operational details for a single task allocation instance.

### get_deployment
Fetches detailed information about a particular deployment cycle.

### get_job
Gets specific configuration and status details for a registered job type.

### get_node
Retrieves detailed resource usage and operational status for a single cluster node.

### list_allocations
Generates a list of all currently running task allocations across the cluster.

### list_deployments
Provides an overview and history of recent deployment activities.

### list_jobs
Lists every registered job type within your Nomad cluster.

### list_nodes
Lists all client nodes connected to the cluster, showing their current status.

### promote_deployment
Promotes the canaries in a deployment so the rolling update continues to the remaining allocations.
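
The ten tools above line up closely with routes in Nomad's HTTP API. The sketch below shows one plausible mapping. The routes are Nomad's documented `/v1` endpoints, but the pairing with each tool, the `build_request` helper, and the `resource_id` parameter are assumptions for illustration, not a description of how this MCP is implemented.

```python
from urllib.parse import quote

# Hypothetical mapping from tool names to Nomad HTTP API routes.
# The routes exist in Nomad; the tool-to-route pairing is an assumption.
NOMAD_ROUTES = {
    "list_jobs": ("GET", "/v1/jobs"),
    "get_job": ("GET", "/v1/job/{id}"),
    "list_nodes": ("GET", "/v1/nodes"),
    "get_node": ("GET", "/v1/node/{id}"),
    "list_allocations": ("GET", "/v1/allocations"),
    "get_allocation": ("GET", "/v1/allocation/{id}"),
    "list_deployments": ("GET", "/v1/deployments"),
    "get_deployment": ("GET", "/v1/deployment/{id}"),
    "promote_deployment": ("POST", "/v1/deployment/promote/{id}"),
    "fail_deployment": ("POST", "/v1/deployment/fail/{id}"),
}


def build_request(tool: str, address: str, resource_id: str = "") -> tuple[str, str]:
    """Return the (HTTP method, URL) a tool would plausibly resolve to."""
    method, template = NOMAD_ROUTES[tool]
    if "{id}" in template and not resource_id:
        raise ValueError(f"{tool} needs a resource ID")
    # IDs are URL-encoded because some job IDs contain slashes.
    path = template.format(id=quote(resource_id, safe=""))
    return method, address.rstrip("/") + path


print(build_request("get_node", "http://127.0.0.1:4646", "node-a"))
```

Read-only tools resolve to `GET` requests, while `promote_deployment` and `fail_deployment` resolve to `POST` requests that change cluster state, which is a useful distinction when deciding what an ACL token should be allowed to do.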

## Prompt Examples

**Prompt:** 
```
List all active jobs in the 'production' namespace.
```

**Response:** 
```
Retrieving jobs... In the 'production' namespace, I found 5 active jobs: 'API-Gateway' (Service), 'Redis-Cluster' (Service), and 3 batch processing jobs. Would you like the detailed status for the API Gateway?
```

**Prompt:** 
```
Check the status of all client nodes in the cluster.
```

**Response:** 
```
Fetching cluster nodes... You have 12 registered client nodes. 10 are 'ready' and healthy, while 2 are currently 'down' or unreachable. Shall I retrieve the resource usage for the healthy nodes?
```

**Prompt:** 
```
Promote the deployment with ID 'dep-98765'.
```

**Response:** 
```
Promotion initiated... The deployment 'dep-98765' has been successfully promoted. The rolling update will now continue to the remaining allocations. I'll monitor the status for you.
```
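
Behind a prompt like the last one, an MCP client sends a JSON-RPC 2.0 `tools/call` request naming the tool and its arguments. The sketch below builds such a message. The `tool_call` helper and the `deployment_id` argument name are hypothetical; the tool's real input schema may use a different name.

```python
import json


def tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# "deployment_id" is an assumed argument name, used here for illustration.
message = tool_call(1, "promote_deployment", {"deployment_id": "dep-98765"})
print(message)
```

The agent, not you, composes this message. Seeing its shape mainly helps when you are debugging why a prompt did or did not trigger a given tool.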

## Capabilities

### Check Cluster Status
List all registered client nodes and retrieve current resource usage metrics.

### Inspect Workloads
Retrieve detailed information about running task allocations, including specific task details.

### Review Job Definitions
Fetch the complete configuration and current status for any registered job type.

### Monitor Deployments
Track progress on rolling updates or get specific details about past deployments.

### Control Rollbacks
Manually promote a successful deployment or fail an underperforming one to trigger rollbacks.

## Use Cases

### Investigating a failing service during peak hours
A developer notices high error rates. They ask their agent to check the cluster status, and it uses `list_nodes` to flag two nodes as unreachable. The agent then runs `get_node` on those specific IDs and identifies a resource saturation issue, pointing the developer to where the fix is needed.
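
The first step of this investigation amounts to filtering the node list by status. A minimal sketch, assuming `list_nodes` returns node entries shaped like those in Nomad's HTTP API (with `ID`, `Name`, and `Status` fields); the sample data and the `unreachable_nodes` helper are made up for illustration.

```python
# Nomad node statuses that indicate the servers cannot reach the client.
UNREACHABLE = {"down", "disconnected"}


def unreachable_nodes(nodes: list[dict]) -> list[dict]:
    """Return the nodes whose Status marks them as down or disconnected."""
    return [n for n in nodes if n.get("Status") in UNREACHABLE]


# Invented sample data standing in for a `list_nodes` result.
sample = [
    {"ID": "node-a", "Name": "client-1", "Status": "ready"},
    {"ID": "node-b", "Name": "client-2", "Status": "down"},
    {"ID": "node-c", "Name": "client-3", "Status": "disconnected"},
]

for node in unreachable_nodes(sample):
    print(node["ID"], node["Status"])
```

The IDs that come out of this filter are what the agent would pass to `get_node` next.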

### Auditing compliance for infrastructure changes
An infra manager needs to report on which services were deployed last week. They ask their agent to run `list_deployments` and then pull detailed records by passing each deployment ID to `get_deployment`, compiling a ready-made audit trail without a manual data export.

### Mid-rollout correction
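A deployment is stuck at 50% completion. The engineer tells the agent to check its status with `list_deployments` and `get_deployment` to see which task group has stalled. If the canaries are healthy and the rollout is only waiting for manual promotion, the agent uses `promote_deployment` on that deployment ID to move the update forward; if allocations are unhealthy, it uses `fail_deployment` instead.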
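
One way to choose between `promote_deployment` and `fail_deployment` for a stalled rollout is to look at the per-task-group counters on the deployment. The sketch below assumes `get_deployment` returns an object shaped like Nomad's HTTP API deployment (`Status`, and `TaskGroups` with `DesiredCanaries`, `HealthyAllocs`, `UnhealthyAllocs`, `Promoted`). The `next_action` helper and its policy (fail on any unhealthy allocation) are a simplified illustration, not Nomad's own promotion logic.

```python
def next_action(deployment: dict) -> str:
    """Suggest a tool to run for a deployment, or 'wait' / 'none'."""
    if deployment.get("Status") != "running":
        return "none"  # finished, failed, or paused: nothing to promote

    groups = list((deployment.get("TaskGroups") or {}).values())

    # Simplified policy: any unhealthy allocation means the rollout is bad.
    if any(g.get("UnhealthyAllocs", 0) > 0 for g in groups):
        return "fail_deployment"

    # Task groups that use canaries and have not been promoted yet.
    pending = [
        g for g in groups
        if g.get("DesiredCanaries", 0) > 0 and not g.get("Promoted", False)
    ]
    if pending and all(
        g.get("HealthyAllocs", 0) >= g["DesiredCanaries"] for g in pending
    ):
        return "promote_deployment"

    return "wait"


# Invented example: one canary requested, one healthy, not yet promoted.
stalled = {
    "Status": "running",
    "TaskGroups": {
        "web": {
            "DesiredCanaries": 1,
            "HealthyAllocs": 1,
            "UnhealthyAllocs": 0,
            "Promoted": False,
        }
    },
}
print(next_action(stalled))
```

A real policy would likely also consider health-check deadlines and how long the deployment has been waiting before deciding to fail it.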

### Troubleshooting a single bad task
A service is intermittently failing. Instead of looking at the general job status, the agent runs `list_allocations`, finds the specific allocation instance, and uses `get_allocation` to read the state and recent events for that one task.
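
To illustrate what the agent looks for in a single allocation, the sketch below pulls the event messages for every failed task. It assumes `get_allocation` returns an object shaped like Nomad's HTTP API allocation, where `TaskStates` maps each task name to its `State`, `Failed` flag, and `Events`. The `failed_task_events` helper and the sample allocation are invented for illustration.

```python
def failed_task_events(allocation: dict) -> dict[str, list[str]]:
    """Map each failed task name to its event messages, oldest first."""
    report: dict[str, list[str]] = {}
    for task, state in (allocation.get("TaskStates") or {}).items():
        if state.get("Failed"):
            report[task] = [
                # Fall back to the event type if no display message is set.
                event.get("DisplayMessage") or event.get("Type", "")
                for event in state.get("Events") or []
            ]
    return report


# Invented sample standing in for a `get_allocation` result.
allocation = {
    "ID": "alloc-1",
    "TaskStates": {
        "api": {
            "State": "dead",
            "Failed": True,
            "Events": [
                {"Type": "Started", "DisplayMessage": "Task started by client"},
                {"Type": "Terminated", "DisplayMessage": "Exit Code: 137"},
            ],
        },
        "sidecar": {"State": "running", "Failed": False, "Events": []},
    },
}
print(failed_task_events(allocation))
```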

## Benefits

- Stop opening the Nomad UI just to check status. You can ask your agent for a list of jobs or node details and get an immediate, structured answer.
- Need to know what's running? Use `list_allocations` to see every active task instance without navigating through multiple dashboards.
- Handling bad deployments is simple. Instead of manually clicking a rollback button, you can tell your agent to use `fail_deployment` and trigger the necessary recovery.
- Get deep insights by using `get_job` or `get_node` with unique IDs. This gives you metadata for specific components that general listing tools miss.
- The process of deployment management is now conversational. You can follow progress via `list_deployments` and then manually move things forward using `promote_deployment` when ready.

## How It Works

Your AI client turns complex infrastructure API calls into simple conversational prompts.

1. Subscribe to this MCP and provide your Nomad Address and optional ACL Token.
2. Connect your preferred AI client (like Cursor or Claude) to the Vinkius catalog.
3. Tell your agent what you need (for example, 'Show me all nodes that are down') and it executes the query directly against your cluster.
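
Step 1 asks for a Nomad Address and an optional ACL Token. How Vinkius stores these values is not described here, so the sketch below only shows the equivalent settings as Nomad's own tooling uses them: the `NOMAD_ADDR` and `NOMAD_TOKEN` environment variables, and the `X-Nomad-Token` request header. The `nomad_connection` helper is hypothetical.

```python
import os


def nomad_connection(env=None) -> tuple[str, dict]:
    """Return (base address, request headers) from Nomad's usual env vars."""
    env = os.environ if env is None else env
    # Nomad's default HTTP address when NOMAD_ADDR is not set.
    address = env.get("NOMAD_ADDR", "http://127.0.0.1:4646").rstrip("/")
    headers = {}
    token = env.get("NOMAD_TOKEN")
    if token:
        # The token is optional; clusters without ACLs do not need it.
        headers["X-Nomad-Token"] = token
    return address, headers


address, headers = nomad_connection({"NOMAD_ADDR": "https://nomad.example.com:4646/"})
print(address, headers)
```

Because some tools change cluster state, it is worth giving the token only the permissions the agent actually needs.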

## Frequently Asked Questions

**How do I list all running services with HashiCorp Nomad MCP?**
Use `list_jobs`. It gives you an overview of every job registered in your cluster, so you know which workloads exist. To see the task instances that are actually running, follow up with `list_allocations`.

**Can I check the health of a specific node using HashiCorp Nomad MCP?**
Yes. Use `get_node` and provide the unique node ID. It fetches detailed resource usage and status for that node, so you can pinpoint why the machine might be struggling.

**What does promoting a deployment with HashiCorp Nomad MCP actually do?**
The `promote_deployment` tool promotes a deployment's canaries so the rolling update continues to the remaining allocations. Use it when a canary release has proven stable and you want to roll the new version out to the whole service.

**Is this MCP used for pure coding tasks or infrastructure?**
This is purely for infrastructure control and monitoring. Use it to check job status, list nodes, and manage deployments; don't use it to write application logic.

**Which tool should I use if I only want task details?**
Use `get_allocation`. It provides the most granular data point: the operational status of a single task allocation.