New Relic AI (LLM Observability) MCP. Get total cost and performance metrics via conversation.
New Relic AI (LLM Observability) lets you pull performance data, token costs, and user feedback directly from your LLMs using natural conversation. Instead of logging into dashboards to check p95 latency or calculating total USD spend, you ask your agent for the metrics immediately. Track every chat completion, audit model behavior, and verify infrastructure health—all in one place.
Give Claude and any AI agent real-world access
Get average response times and the 95th percentile latency data to ensure your models remain fast.
Calculate precise USD costs for all token usage across your entire AI infrastructure.
Retrieve detailed chat completion messages and original prompts to audit model behavior in real-time.
Fetch chronological user feedback and 1-5 rating scores provided by human supervisors.
Run advanced, read-only queries using the New Relic Query Language (NRQL) against your AI datasets.
Examine active APM apps, dashboards, and alert policies to check overall system integrity.
What AI agents can do with New Relic AI (LLM Observability): 10 Tools
Use these tools to manage everything from calculating precise LLM token costs and checking system latency to auditing user feedback ratings.
Make your AI actually useful.
Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.
Start using New Relic AI (LLM Observability) MCP
List Alert Policies
Lists all existing automated alert policies configured for the system.
List Apm Apps
Retrieves a list of currently running APM applications to validate service status.
Custom Nrql
Runs sophisticated, read-only queries using the New Relic Query Language (NRQL) for...
List Dashboards
Lists all active operational dashboards available to the authenticated account.
Query Llm Errors
Identifies and lists specific error logs related to LLM processing.
Query Llm Costs
Calculates the precise monetary cost of tokens used by your agents over a specified period.
Query Llm Events
Retrieves bounded records tracking general activity within the New Relic platform.
Query Llm Feedback
Gathers human-submitted feedback and rating scores associated with LLM outputs.
Query Llm Latency
Measures the speed of your LLMs by retrieving p95 latency matrices and average...
Post Custom Event
Sends custom telemetry rows to track unique internal states or behaviors within your...
Security and governance baked right in.
Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, delivering out-of-the-box support for HTTPS Streamable, SSE, and OAuth2—zero messy routing required.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on each call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with New Relic AI (LLM Observability), then connect any of our 5,200+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 5,200+ others, all in one place
- Add new capabilities to your AI anytime you want
- Connections are secured and governed automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog weekly
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by New Relic AI. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS CLOUD
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on each call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
The Visibility Gap: Where AI Costs and Performance Go Missing
Right now, understanding your LLM stack is a nightmare. To figure out why costs spiked or why responses slowed down, you have to jump between New Relic's billing dashboards, the APM console, and raw chat logs. You spend time clicking through tabs, copying metrics, and trying to stitch together one single story: 'It cost X dollars because it was slow.'
With this MCP, that manual process vanishes. You simply ask your agent a question like, 'What's the token usage trend over the last week?' The tool runs `query_llm_costs` and provides the answer immediately in conversation, connecting performance metrics to actual dollars spent.
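To make the "performance metrics to actual dollars" link concrete, here is a minimal sketch of the arithmetic behind a token cost figure. It is an illustration only: the model names, the per-1K-token prices, and the `Usage` record are hypothetical, and the source does not describe how `query_llm_costs` computes its totals.

```python
from dataclasses import dataclass

# Hypothetical USD prices per 1,000 tokens; real prices depend on your provider.
PRICE_PER_1K = {
    "model-a": {"prompt": 0.0005, "completion": 0.0015},
    "model-b": {"prompt": 0.01, "completion": 0.03},
}


@dataclass
class Usage:
    """One chat completion's token counts (hypothetical record shape)."""
    model: str
    prompt_tokens: int
    completion_tokens: int


def cost_usd(usage: Usage) -> float:
    """Cost of a single completion: tokens / 1000 * price, per token type."""
    price = PRICE_PER_1K[usage.model]
    prompt_cost = usage.prompt_tokens / 1000 * price["prompt"]
    completion_cost = usage.completion_tokens / 1000 * price["completion"]
    return prompt_cost + completion_cost


def total_by_model(rows: list[Usage]) -> dict[str, float]:
    """Sum the per-completion cost for each model."""
    totals: dict[str, float] = {}
    for row in rows:
        totals[row.model] = totals.get(row.model, 0.0) + cost_usd(row)
    return totals


rows = [
    Usage("model-a", prompt_tokens=4000, completion_tokens=2000),
    Usage("model-b", prompt_tokens=2000, completion_tokens=1000),
]
totals = total_by_model(rows)
```

Grouping by model is what lets a question like "which integration caused the spike?" be answered from the same data.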
Get LLM Observability with New Relic AI (LLM Observability)
Manual monitoring means checking multiple places: logging into the query interface for the latency numbers that `query_llm_latency` returns, opening a separate dashboard view for what `list_dashboards` lists, and then calculating costs by hand in an external spreadsheet. It's slow, and it's incomplete.
Now, your agent handles all of that complexity. You get instant access to performance data, error logs (`query_llm_errors`), and resource usage checks—all through a single chat interface. Your focus shifts from dashboard maintenance to making the AI better.
What New Relic AI (LLM Observability) MCP does for your AI
You run complex AI agents that use Large Language Models (LLMs). Things break, costs spike unexpectedly, or performance dips when nobody is looking. This MCP connects New Relic AI to your existing agent workflow, giving you full visibility into everything happening under the hood. You can ask for total token usage across all models in dollars and cents.
Need to know why responses slow down? Check the p95 latency metrics instantly. Want to audit model behavior? Review raw chat completion messages to understand exactly what the LLM saw or generated. This access means you don't have to jump between cost dashboards, performance monitoring tools, and logs just to get a complete picture.
By connecting this MCP via Vinkius, your agent becomes an operational detective for your AI stack.
How to set up New Relic AI (LLM Observability) MCP
In short, you talk to your agent as you would to a teammate, and it handles the monitoring data retrieval for you.
Subscribe to this MCP and enter your New Relic API Key and Account ID.
Connect your preferred AI client—Claude, Cursor, or any compatible agent—to Vinkius.
Ask a natural language question about your LLM activity. Your agent executes the necessary queries and reports back with performance metrics or cost breakdowns.
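Behind the last step, an MCP client sends the server a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for `query_llm_costs`. The tool name comes from this page, but the `since` argument is a hypothetical parameter used for illustration; the real input schema is whatever the server advertises for each tool.

```python
import json
from itertools import count

# Simple incrementing request ids, as JSON-RPC expects a unique id per request.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# "since" is an assumed argument name, not a documented parameter.
payload = build_tool_call("query_llm_costs", {"since": "1 week ago"})
```

In practice your AI client builds and sends this message for you; the sketch only shows what "your agent executes the necessary queries" looks like on the wire.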
Who uses New Relic AI (LLM Observability) MCP
This MCP is built for anyone who owns AI infrastructure but hates manual dashboard navigation. It's for the Observability Lead who needs global token cost visibility, or the AI Engineer who gets frustrated having to check ten different logs just to debug a slow prompt.
You use this MCP to verify model accuracy and check prompt performance by pulling raw chat completion messages directly into your conversation flow.
You track global token costs and latency benchmarks in real-time, allowing you to optimize infrastructure spending without leaving your primary workspace.
You audit the structural health of your AI environment by listing active APM apps, dashboards, and checking alert policy configurations across multiple services.
Benefits of connecting New Relic AI (LLM Observability) MCP
Stop guessing about spending. Use query_llm_costs to get the exact dollar amount of your token usage, giving you tight control over infrastructure spend.
Debug slowness fast. Running query_llm_latency provides p95 latency matrices and average response times so you know exactly when your LLM generation is dipping below acceptable speed.
Audit model behavior instantly. Instead of digging through raw logs, use the agent to retrieve detailed chat completion messages, allowing you to verify what the LLM saw or generated.
Measure quality with real data. query_llm_feedback pulls in human supervisor ratings and feedback messages, letting you spot quality regressions immediately after deployment.
Stay ahead of system decay. Running list_apm_apps and list_dashboards lets DevOps check the structural health of your entire environment without leaving the chat window.
New Relic AI (LLM Observability) MCP use cases
Debugging an unexpected cost spike
An AI Engineer notices their LLM costs are higher than normal. They ask the agent, 'What was my total token spend last week?' The agent executes query_llm_costs and reports that a specific integration caused a massive spike in usage, allowing the engineer to immediately pinpoint the source.
Checking user acceptance of new prompts
An Observability Lead wants to know if recent prompt changes affected quality. They ask the agent for query_llm_feedback. The agent pulls up a list of ratings, showing that user satisfaction dropped sharply after the change was deployed.
Validating system readiness before launch
A DevOps team member needs to ensure all monitoring is active. They instruct the agent to run list_apm_apps and check list_alert_policies. The agent confirms that all necessary applications are running and alert triggers are correctly configured.
Analyzing slow agent responses
An AI Engineer reports that sometimes the chat feels sluggish. They ask the agent to run query_llm_latency, which returns a matrix showing that the average response time exceeds 2 seconds during peak usage hours.
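If you are wondering why the latency tool reports both an average and a p95, the sketch below shows how the two can diverge on the same data. It uses the nearest-rank definition of a percentile on made-up latency samples; New Relic may compute its percentiles differently, so treat this as an illustration of the concept rather than of the tool's internals.

```python
import math


def percentile_nearest_rank(values: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up response times in seconds: mostly fast, with a slow tail.
latencies = [0.4, 0.5, 0.5, 0.6, 0.6, 0.7, 0.7, 0.8, 0.9, 1.0,
             1.0, 1.1, 1.2, 1.2, 1.3, 1.4, 1.5, 1.8, 2.4, 6.0]

average = sum(latencies) / len(latencies)
p95 = percentile_nearest_rank(latencies, 95)
```

Here the average stays under 2 seconds while the p95 does not, which is why a chat can "feel sluggish" to some users even when the average looks healthy.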
New Relic AI (LLM Observability) MCP tradeoffs
What to watch out for, and the recommended way to handle each one.
Over-relying on raw logs
A developer manually filters through thousands of log entries trying to find a specific token cost or latency metric for one single transaction. This takes hours and is prone to human error.
Instead, ask the agent to run query_llm_costs or query_llm_latency. The tool aggregates this data automatically and presents the precise metrics in plain language.
Assuming system health
The team assumes everything is fine because no alerts have triggered, without checking for underlying architectural decay. This leads to unexpected outages.
Run list_apm_apps and check list_alert_policies. This validates the operational status of every core component in your AI environment.
Ignoring user sentiment
The team focuses only on technical performance (latency) but misses that users are finding the output inaccurate or confusing, leading to poor adoption.
Use query_llm_feedback. This retrieves direct human ratings and comments, providing a critical layer of quality monitoring beyond just technical metrics.
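Once the ratings are in hand, spotting a quality regression is a small comparison. The sketch below splits 1-5 ratings at a deployment time and compares the means. The `(timestamp, rating)` tuple shape and the sample values are assumptions for illustration; the source does not specify the format `query_llm_feedback` returns.

```python
from datetime import datetime
from statistics import mean


def mean_rating_before_after(feedback, deployed_at):
    """Return (mean before, mean after) for ratings split at deployed_at.

    `feedback` is an iterable of (timestamp, rating) pairs; either side is
    None when it has no ratings.
    """
    before = [rating for ts, rating in feedback if ts < deployed_at]
    after = [rating for ts, rating in feedback if ts >= deployed_at]
    return (
        mean(before) if before else None,
        mean(after) if after else None,
    )


deployed_at = datetime(2024, 5, 10, 12, 0)
feedback = [
    (datetime(2024, 5, 8, 9, 0), 5),
    (datetime(2024, 5, 9, 14, 30), 4),
    (datetime(2024, 5, 10, 8, 15), 3),
    (datetime(2024, 5, 10, 16, 0), 2),
    (datetime(2024, 5, 11, 10, 45), 3),
    (datetime(2024, 5, 12, 11, 20), 1),
]
before_mean, after_mean = mean_rating_before_after(feedback, deployed_at)
```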
When to use New Relic AI (LLM Observability) MCP
Use this MCP if your primary pain point is understanding the cost, performance, or user reception of your LLM agents without navigating multiple dashboards. It's essential for observability leads who need global visibility into token consumption and latency benchmarks. Reach for it when you need to answer questions like 'How much did that run cost?' or 'Why was this response slow?'
Don't use this if you just need simple, single-point data retrieval (like checking a status code). For those limited checks, an API integration might suffice. However, because of its ability to consolidate metrics—from query_llm_costs to list_apm_apps—it’s the superior choice for comprehensive auditing.
Frequently asked questions about New Relic AI (LLM Observability) MCP
How does New Relic AI (LLM Observability) track token costs?
This MCP uses query_llm_costs to calculate your total LLM token spend. It gives you the exact USD consumption across different models and services, so you never lose track of what your usage costs.
Can I check my LLM performance latency with this MCP?
Yes, use query_llm_latency. It pulls p95 latency matrices and average response times, helping you pinpoint exactly when your agent's responses slow down.
What kind of data can I audit with New Relic AI (LLM Observability)?
You can audit chat completion messages for model behavior and human supervisor feedback using query_llm_feedback. To make internal agent states auditable later, record them as custom telemetry with post_custom_event.
Is New Relic AI (LLM Observability) read-only?
Mostly. The query tools, including custom_nrql, run strictly read-only queries, so you can pull insights without risking any changes to your live infrastructure. The one exception is post_custom_event, which sends custom telemetry rows to New Relic.
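For a sense of what a read-only query passed to custom_nrql could look like, here is a small sketch that assembles an NRQL string for average and p95 latency per model. `SELECT`, `FROM`, `FACET`, `SINCE`, `average()`, and `percentile()` are standard NRQL; the event type and the `duration` and `response.model` attribute names are assumptions about how your AI data is stored and may differ in your account.

```python
def latency_nrql(
    event_type: str = "LlmChatCompletionSummary",  # assumed event type
    duration_attr: str = "duration",               # assumed attribute name
    facet: str = "response.model",                 # assumed attribute name
    since: str = "1 week ago",
) -> str:
    """Build a read-only NRQL query for average and p95 latency per facet."""
    return (
        f"SELECT average({duration_attr}), percentile({duration_attr}, 95) "
        f"FROM {event_type} FACET {facet} SINCE {since}"
    )


query = latency_nrql()
```

The query only selects and aggregates existing events, which is what keeps this kind of request read-only.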
Does this MCP help with general system health checks?
It does. You can use tools like list_apm_apps and list_alert_policies to check the operational status of your entire environment, not just the LLM component.