Nomad MCP. Manage your entire workload lifecycle from chat.
Works with every AI agent you already use
…and any MCP-compatible client
Just plug in your AI agents and start using Vinkius.
HashiCorp Nomad. Manage your entire workload orchestration lifecycle directly through your AI agent. Check cluster health, list all jobs, track deployments, and manually promote or fail services without touching the Nomad UI.
This server gives your AI client operational control over job status, node resources, and allocation details.
What your AI agents can do
Fail deployment
Marks a specified deployment as failed.
Get allocation
Retrieves detailed information for a single task allocation instance.
Get deployment
Gets specific details about a deployment instance.
Lists all active client nodes and provides resource usage details for the cluster.
Lists all registered jobs and fetches their full configuration and current operational status.
Retrieves a list of recent deployments and tracks the progress of updates.
Lists all currently running task instances and lets you inspect the details of specific tasks.
Fetches detailed metadata for a single node, job, deployment, or allocation using its unique ID.
Allows manual promotion or failing of a deployment to trigger automated rollbacks.
HashiCorp Nomad MCP Server: 10 Tools for Workload Control
Use these 10 tools to query, monitor, and manually control every aspect of your Nomad cluster state through natural conversation.
fail_deployment
Marks a specified deployment as failed.
get_allocation
Retrieves detailed information for a single task allocation instance.
get_deployment
Gets specific details about a deployment instance.
get_job
Fetches detailed information for a specific job.
get_node
Retrieves detailed information for a specific client node.
list_allocations
Lists all currently running task allocations across the cluster.
list_deployments
Lists recent deployment records for tracking history.
list_jobs
Lists all registered jobs within the Nomad cluster.
list_nodes
Lists all active client nodes in the cluster.
promote_deployment
Promotes a deployment to advance its status.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on every call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with HashiCorp Nomad, then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 4,700+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
What you can do with this MCP connector
You connect your AI agent to your Nomad cluster, and you get operational control over your entire workload management stack. You can check cluster health, list all jobs, track deployments, and even manually promote or fail services without ever touching the Nomad UI. This server gives your agent direct operational control over job status, node resources, and allocation details.
- View Cluster Status: Your agent lists all active client nodes and gives you the resource usage details for the whole cluster.
- View Jobs: It lists every registered job and pulls the full configuration and current operational status for each one.
- Check Deployments: You can pull a list of recent deployments to track update progress.
- Inspect Allocations: Your agent lists all currently running task instances, letting you dig into the details of specific tasks.
- Get Specific Resources: You fetch detailed metadata for a single node, job, deployment, or allocation just by giving it a unique ID.
- Manage Deployment Lifecycle: You can manually promote or fail a deployment to trigger an automated rollback.
To use it, you'll let your AI client connect and authenticate against the cluster endpoint. You just tell your agent what you need in plain language. It runs the necessary tools and gives you the answer.
When you use list_nodes and list_jobs, you see all the active client nodes and every registered job. You use get_node to get deep info on one node, and get_job to get deep info on one job. You'll use list_allocations to see all running task instances, and you can check specific task details using get_allocation.
You can track what's happening with list_deployments and get details on a specific update with get_deployment. For the big moves, you can use promote_deployment to advance a status, or fail_deployment to mark a deployment as failed.
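For readers curious about what a tool invocation looks like on the wire, here is a minimal sketch of the JSON-RPC message an MCP client sends for a `tools/call` request. The tool names come from this page; the `deployment_id` argument name is a hypothetical placeholder, since the page does not document each tool's input schema.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def tool_call(name: str, arguments: Optional[Dict[str, Any]] = None) -> str:
    """Serialize an MCP `tools/call` request for the given tool name."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(request)


# `deployment_id` is an assumed argument name, used for illustration only.
print(tool_call("list_deployments"))
print(tool_call("get_deployment", {"deployment_id": "<deployment-id>"}))
```

In practice your AI client builds and sends these messages for you; the sketch only shows the shape of the request.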
How Nomad MCP Works
1. Subscribe to the server and provide your Nomad Address. If your cluster requires it, enter your ACL Token.
2. Your AI client connects and uses natural language to ask for status updates (e.g., 'List all jobs').
3. The agent executes the specific tool (like `list_jobs` or `get_node`) and presents the structured data back to you.
The bottom line is, you manage your cluster state through conversation, not through UI clicks.
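To make the role of the Nomad Address and ACL Token concrete, the sketch below builds (without sending) the kind of request the Nomad HTTP API expects: a path under `/v1/` and the secret passed in the `X-Nomad-Token` header. Whether this server issues its requests in exactly this way is an assumption; the helper name `nomad_request` is hypothetical.

```python
from typing import Optional
from urllib.request import Request


def nomad_request(address: str, path: str, acl_token: Optional[str] = None) -> Request:
    """Build (but do not send) a GET request against the Nomad HTTP API."""
    url = address.rstrip("/") + "/v1/" + path.lstrip("/")
    req = Request(url, method="GET")
    if acl_token:
        # Nomad reads the ACL secret ID from the X-Nomad-Token header.
        req.add_header("X-Nomad-Token", acl_token)
    return req


req = nomad_request("http://nomad.example.com:4646", "jobs", acl_token="<secret-id>")
print(req.full_url)
```

When no token is supplied, the request carries no ACL header, which matches a cluster where ACLs are disabled.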
Who Is Nomad MCP For?
Nomad MCP is for DevOps, SRE, and infrastructure engineers who are tired of logging into the Nomad UI just to answer a status question. You need to monitor job health, track service rollouts, and manually intervene when things go sideways, all without opening a browser tab. It's for the person who needs immediate, authoritative cluster state.
Checks cluster health and status immediately. Uses list_nodes or list_jobs to verify if a service is running where it should be.
Monitors deployment progress. Uses list_deployments and promote_deployment to manage canary releases or trigger rollbacks.
Gathers specific data for reports. Uses get_node or get_job to pull metadata on specific resources for external monitoring tools.
What Changes When You Connect
- See the full cluster picture by running `list_nodes`. You get a list of every client node, and you can immediately check its health status and resource usage.
- Track service changes without opening a dashboard. Use `list_deployments` to get a history of recent rollouts and see exactly where a deployment stands.
- Pinpoint failing tasks instantly. By calling `get_allocation`, you pull up the full metadata for a single task instance, telling you exactly why it's down.
- Control rollbacks with explicit commands. If a service fails, use `fail_deployment` to mark it as failed, which kicks off the necessary rollback process.
- Deep dive into any resource. Need to know everything about a specific job? Call `get_job` to pull its complete configuration and status.
- Manage the lifecycle of services. You can use `promote_deployment` to manually move a deployment forward, accelerating canary stages when ready.
Real-World Use Cases
Checking Post-Incident Status
The incident response team needs to know if the core API gateway is stable. They ask their agent to run list_jobs. The agent reports the API-Gateway job status, confirming that the required 10/10 instances are running and healthy, allowing them to move to the next fix.
Stuck Deployment Rollout
A deployment for the payment service gets stuck halfway through an update. The engineer asks the agent to run list_deployments to see the timeline. They then use get_deployment on the specific ID to check the current status, find that a node is blocking the rollout, and intervene manually.
Investigating Resource Drain
The monitoring team notices high resource usage on one node. They ask the agent to list_nodes. The agent shows the node list and the resource usage details. The team then uses get_node with the problematic node's ID to confirm the CPU spike and isolate the issue.
Scaling Up a Service
A critical new feature needs to go live to 100% of users. The developer asks the agent to run get_job for the feature service. After verifying the job config, they use promote_deployment to move the service from the canary stage to full production, then confirm that the rollout completed.
The Tradeoffs
Reading Logs Only
Relying on the Nomad UI logs to figure out why a job failed. You spend 15 minutes clicking through different tabs, searching for the failure code, and cross-referencing resource IDs.
→
Ask your agent to run get_allocation with the specific allocation ID. It pulls the full failure metadata immediately, telling you the exact task failure reason and why it couldn't restart.
Manual State Chaining
Calling list_jobs to get a job ID, then manually taking that ID to run get_job, and then taking a resource ID from that output to run get_node. This is fragile and takes multiple steps.
→
Ask your agent to 'Show me the status of the Payments job and the nodes it runs on.' The agent handles the necessary sequence of get_job and list_nodes calls to give you a single, cohesive answer.
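The chaining the agent performs can be sketched as a small function. This is one possible sequence, not necessarily the one the agent picks: it resolves a job name through `list_jobs`, then collects node IDs from `list_allocations`. The `call_tool` callable and the canned responses are stand-ins for a real MCP client, and the field names (`ID`, `Name`, `JobID`, `NodeID`) assume the tools return records shaped like the Nomad HTTP API.

```python
from typing import Any, Callable, Dict, List

ToolCaller = Callable[[str, Dict[str, Any]], Any]


def nodes_for_job(call_tool: ToolCaller, job_name: str) -> List[str]:
    """Resolve a job name to the sorted IDs of the nodes running it."""
    jobs = call_tool("list_jobs", {})
    job = next((j for j in jobs if j["Name"] == job_name), None)
    if job is None:
        raise LookupError(f"no job named {job_name!r}")
    allocs = call_tool("list_allocations", {})
    return sorted({a["NodeID"] for a in allocs if a["JobID"] == job["ID"]})


# Hypothetical canned responses standing in for a live cluster.
FAKE = {
    "list_jobs": [{"ID": "payments", "Name": "payments", "Status": "running"}],
    "list_allocations": [
        {"ID": "a1", "JobID": "payments", "NodeID": "node-2"},
        {"ID": "a2", "JobID": "payments", "NodeID": "node-1"},
        {"ID": "a3", "JobID": "search", "NodeID": "node-3"},
    ],
}


def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Any:
    return FAKE[name]


print(nodes_for_job(fake_call_tool, "payments"))  # ['node-1', 'node-2']
```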
Assuming Full Visibility
Assuming that a job is fully healthy just because it is 'running'. The job might be allocated, but the deployment might be stalled, or the node might be struggling.
→
Always check the deployment status first. Use get_deployment to see the full rollout progress and check the list_allocations tool to confirm if the instances are actually healthy and not just 'running'.
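As a rough illustration of the difference between 'running' and 'healthy', the sketch below filters allocation records for ones that are running but have no positive health verdict. It assumes the `list_allocations` output carries the Nomad API fields `ClientStatus` and `DeploymentStatus.Healthy`; the actual tool output may be shaped differently.

```python
from typing import Any, Dict, List


def running_but_unhealthy(allocs: List[Dict[str, Any]]) -> List[str]:
    """Return IDs of allocations that are running but not confirmed healthy."""
    flagged = []
    for alloc in allocs:
        if alloc.get("ClientStatus") != "running":
            continue
        # A missing DeploymentStatus means no health verdict yet, or that the
        # allocation is not part of a deployment; both are flagged for review.
        status = alloc.get("DeploymentStatus") or {}
        if status.get("Healthy") is not True:
            flagged.append(alloc["ID"])
    return flagged


sample = [
    {"ID": "a1", "ClientStatus": "running", "DeploymentStatus": {"Healthy": True}},
    {"ID": "a2", "ClientStatus": "running", "DeploymentStatus": {"Healthy": False}},
    {"ID": "a3", "ClientStatus": "running", "DeploymentStatus": None},
    {"ID": "a4", "ClientStatus": "failed", "DeploymentStatus": None},
]
print(running_but_unhealthy(sample))  # ['a2', 'a3']
```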
When It Fits, When It Doesn't
Use this server if your job is monitoring, debugging, or controlling service state. You need to know why a job failed or where a resource is struggling. This is for the operational phase of your workload lifecycle.
Don't use this if you are simply writing a new service manifest or defining the initial desired state. For that, you'll use dedicated infrastructure-as-code tools. You're using this to manage the runtime state, not the definition.
If you need to know the cluster state, use list_nodes or list_jobs. If you need to fix the state, use fail_deployment or promote_deployment after checking the status with get_deployment. Never try to guess the state; always run a list command first.
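The "check first, then act" rule can be sketched as a small decision helper. This is a simplified heuristic, not the server's logic: it assumes `get_deployment` returns a record with the Nomad API fields `Status` and `TaskGroups` (each with `DesiredCanaries`, `HealthyAllocs`, `UnhealthyAllocs`, and `Promoted`), and it only suggests an action for a human to confirm.

```python
from typing import Any, Dict


def suggest_action(deployment: Dict[str, Any]) -> str:
    """Suggest 'promote', 'fail', 'wait', or 'none' for a deployment record."""
    if deployment.get("Status") != "running":
        return "none"  # finished, failed, cancelled, or paused: nothing to do here
    groups = list((deployment.get("TaskGroups") or {}).values())
    if any(g.get("UnhealthyAllocs", 0) > 0 for g in groups):
        return "fail"
    # Task groups that use canaries and have not been promoted yet.
    pending = [
        g for g in groups
        if g.get("DesiredCanaries", 0) > 0 and not g.get("Promoted", False)
    ]
    if pending and all(g.get("HealthyAllocs", 0) >= g["DesiredCanaries"] for g in pending):
        return "promote"
    return "wait"


record = {
    "Status": "running",
    "TaskGroups": {
        "api": {"DesiredCanaries": 1, "HealthyAllocs": 1, "UnhealthyAllocs": 0, "Promoted": False}
    },
}
print(suggest_action(record))  # promote
```

A result of 'promote' would map to `promote_deployment` and 'fail' to `fail_deployment`; 'wait' and 'none' mean no tool call is warranted yet.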
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Nomad. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on every call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 10 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Available Capabilities
Debugging cluster failures shouldn't require 15 tabs and 4 different dashboards.
Today, if the API gateway deployment fails, you open the Nomad UI. You jump to the 'Jobs' tab to check the job ID. Then you switch to 'Nodes' to see resource utilization. You have to manually cross-reference the failure message in the logs against the node's available capacity. It’s a painful cycle of clicking, copying IDs, and context-switching.
With this MCP server, you tell your agent, "The API gateway failed. Check the status." It runs `list_jobs` and `get_deployment` in sequence. It finds the failure, tells you which node is struggling, and gives you the actionable data in a single chat response. No tabs required.
HashiCorp Nomad MCP Server: Get operational control over deployments.
You no longer need to manually trigger rollbacks or promote a service by navigating to the deployment record and clicking the 'Promote' button. Your agent handles this. You just tell it, "Promote the staging environment deployment," and it executes `promote_deployment` and monitors the outcome.
The difference is control. You move from waiting for a UI workflow to executing a direct, conversational command that manages the actual state of your production services. It's instant.
Common Questions About Nomad MCP
How do I check the health of all nodes using the HashiCorp Nomad MCP Server?
Run list_nodes to see every client node. This tool reports the overall status and resource utilization for all nodes in your cluster at a glance.
What is the difference between `get_job` and `get_deployment`?
get_job provides the high-level, static configuration for a job. get_deployment tracks the active, dynamic process of rolling out or updating that job over time.
Can I manually roll back a service using the HashiCorp Nomad MCP Server?
Yes. You use fail_deployment to mark the current deployment as failed, which triggers the necessary rollback mechanism defined in your Nomad configuration.
Does the HashiCorp Nomad MCP Server handle resource quotas?
The server provides tools like get_node and list_allocations that expose resource usage and allocation details, allowing you to monitor quota consumption.
How do I see all running services?
Run list_allocations. This tool lists every running task instance across the cluster, giving you a real-time view of all active workloads.
How do I use the `list_allocations` tool to check running task instances?
The list_allocations tool provides a list of all currently running task allocations. You can inspect specific task details for each allocation to see which services are active and what resources they're using.
What is the best way to check the status of a specific deployment using `get_deployment`?
You pass the deployment ID to get_deployment. This tool returns the full status, including the progress of rolling updates and any associated error messages. It's useful for tracking ongoing changes.
Does the HashiCorp Nomad MCP Server require an ACL token for all operations?
An ACL token is needed only when ACLs are enabled on your cluster. When you provide one, the server uses it to authorize each operation, so only actions the token permits are run against your cluster.
How do I get my Nomad Address?
Your Nomad Address is the URL where your Nomad server is reachable, usually including the port (e.g., http://nomad.example.com:4646).
Is an ACL Token required?
Only if ACLs are enabled on your Nomad cluster. If enabled, you must provide a Secret ID with sufficient permissions to query the requested resources.
Can I use namespaces with this server?
Yes! You can specify a target namespace in the authentication settings. The server will then filter all job and allocation queries to that specific namespace.
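For context on what namespace filtering means at the API level, the Nomad HTTP API accepts a `namespace` query parameter on job and allocation endpoints. The sketch below builds such a URL; the helper name is hypothetical, and whether this server applies the namespace this way internally is an assumption.

```python
from urllib.parse import urlencode


def namespaced_url(address: str, path: str, namespace: str = "default") -> str:
    """Build a Nomad API URL scoped to one namespace via the query string."""
    query = urlencode({"namespace": namespace})
    return f"{address.rstrip('/')}/v1/{path.lstrip('/')}?{query}"


print(namespaced_url("http://nomad.example.com:4646", "jobs", "payments"))
```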
Use it with your favorite AI tools
Connect this server to Cursor, Claude, VS Code, and more.