New Relic AI (LLM Observability) MCP. Get total cost and performance metrics via conversation.

New Relic AI (LLM Observability) lets you pull performance data, token costs, and user feedback directly from your LLMs using natural conversation. Instead of logging into dashboards to check p95 latency or calculating total USD spend, you ask your agent for the metrics immediately. Track every chat completion, audit model behavior, and verify infrastructure health—all in one place.

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

Give Claude and any AI agent real-world access

Audit LLM Performance Metrics

Get average response times and the 95th percentile latency data to ensure your models remain fast.

Track Token Expenditure

Calculate precise USD costs for all token usage across your entire AI infrastructure.

Review Model Interactions

Retrieve detailed chat completion messages and original prompts to audit model behavior in real-time.

Measure User Satisfaction

Fetch chronological user feedback and 1-5 rating scores provided by human supervisors.

Execute Custom Queries

Run advanced, read-only queries using the New Relic Query Language (NRQL) against your AI datasets.

Monitor Infrastructure Health

Examine active APM apps, dashboards, and alert policies to check overall system integrity.

What AI agents can do with New Relic AI (LLM Observability): 10 Tools

Use these tools to manage everything from calculating precise LLM token costs and checking system latency to auditing user feedback ratings.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using New Relic AI (LLM Observability) MCP

List Alert Policies

Checks all existing alert policies configured for the system.

List Apm Apps

Retrieves a list of currently running APM applications to validate service status.

Custom Nrql

Runs sophisticated, read-only queries using the New Relic Query Language (NRQL) for...

List Dashboards

Finds all active operational dashboards tied to native Gateway authentication.

Query Llm Errors

Identifies and lists specific error logs related to LLM processing.

Query Llm Costs

Calculates the precise monetary cost of tokens used by your agents over a specified period.

Query Llm Events

Retrieves bounded records tracking general activity within the New Relic platform.

Query Llm Feedback

Gathers human-submitted feedback and rating scores associated with LLM outputs.

Query Llm Latency

Measures the speed of your LLMs by retrieving p95 latency matrices and average...

Post Custom Event

Sends custom telemetry rows to track unique internal states or behaviors within your...

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so no custom routing is required.

New Relic AI (LLM Observability) MCP is compatible with Claude

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The New Relic AI (LLM Observability) integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "new-relic-ai-llm-observability": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the New Relic AI (LLM Observability) tools with full Vinkius guardrails applied.

New Relic AI (LLM Observability) MCP is compatible with VS Code

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"new-relic-ai-llm-observability": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

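Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius over streamable HTTP (get_tools is a coroutine)
client = MultiServerMCPClient({"new-relic-ai-llm-observability": {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable_http",
}})
tools = asyncio.run(client.get_tools())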
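
To keep the token out of source code, the endpoint URL can be assembled at runtime. This is a minimal sketch, not part of the Vinkius tooling: the helper names and the VINKIUS_TOKEN environment variable are assumptions made for illustration, and only the URL pattern comes from the setup steps above.

```python
import os

ENDPOINT_TEMPLATE = "https://edge.vinkius.com/{token}/mcp"


def vinkius_endpoint(token: str) -> str:
    """Return the MCP endpoint URL for the given Vinkius token."""
    token = token.strip()
    # Reject an empty value or the unedited placeholder from the docs.
    if not token or token == "[YOUR_TOKEN_HERE]":
        raise ValueError("set a real Vinkius token before connecting")
    return ENDPOINT_TEMPLATE.format(token=token)


def endpoint_from_env(var: str = "VINKIUS_TOKEN") -> str:
    """Build the endpoint from an environment variable (hypothetical name)."""
    return vinkius_endpoint(os.environ.get(var, ""))
```

Pass the returned URL to whichever MCP client you use in place of the hard-coded string.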

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

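from crewai import Agent
from crewai_tools import MCPServerAdapter  # pip install "crewai-tools[mcp]"

# Connect to Vinkius over streamable HTTP
server_params = {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable-http",
}

with MCPServerAdapter(server_params) as vinkius_tools:
    # Assign the discovered tools to an agent (goal and backstory are required)
    researcher = Agent(
        role="Data Researcher",
        goal="Report LLM cost and latency metrics from New Relic",
        backstory="An analyst who monitors LLM workloads.",
        tools=vinkius_tools,
    )
    # Build and run your Crew inside this block so the connection stays open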

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on each call
Real time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with New Relic AI (LLM Observability), then connect any of our 5,200+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,200+ others, all in one place
Add new capabilities to your AI anytime you want
Connections are secured and governed automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog weekly

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by New Relic AI. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS CLOUD

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on each call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Your data is protected. See how we built it.

The Visibility Gap: Where AI Costs and Performance Go Missing

Right now, understanding your LLM stack is a nightmare. To figure out why costs spiked or why responses slowed down, you have to jump between New Relic's billing dashboards, the APM console, and raw chat logs. You spend time clicking through tabs, copying metrics, and trying to stitch together one single story: 'It cost X dollars because it was slow.'

With this MCP, that manual process vanishes. You simply ask your agent a question like, 'What's the token usage trend over the last week?' The tool runs `query_llm_costs` and provides the answer immediately in conversation, connecting performance metrics to actual dollars spent.

Get LLM Observability with New Relic AI (LLM Observability)

Manual monitoring means checking several places: the query interface for the latency figures that `query_llm_latency` returns, a separate dashboard view for what `list_dashboards` lists, and an external spreadsheet to calculate costs by hand. It's slow, and it's incomplete.

Now, your agent handles all of that complexity. You get instant access to performance data, error logs (`query_llm_errors`), and resource usage checks—all through a single chat interface. Your focus shifts from dashboard maintenance to making the AI better.

Support: 24/7, support@vinkius.com

Security: Vinkius Trust Center

SLA: Service Level Agreement

llm-monitoring

token-cost-tracking

performance-analytics

ai-observability

latency-tracking

What New Relic AI (LLM Observability) MCP does for your AI

You run complex AI agents that use Large Language Models (LLMs). Things break, costs spike unexpectedly, or performance dips when nobody is looking. This MCP connects New Relic AI to your existing agent workflow, giving you full visibility into everything happening under the hood. You can ask for total token usage across all models in dollars and cents.

Need to know why responses slow down? Check the p95 latency metrics instantly. Want to audit model behavior? Review raw chat completion messages to understand exactly what the LLM saw or generated. This access means you don't have to jump between cost dashboards, performance monitoring tools, and logs just to get a complete picture.

By connecting this MCP via Vinkius, your agent becomes an operational detective for your AI stack.

Built · Hosted · Managed by Vinkius

New Relic AI (LLM Observability) - Track Costs & Latency

Server ID 019d75dc-e7ba-70bb-8f02-309d5f2787c7

Vinkius Inspector: Compliance Grade A+, Score 100/100

Benefits of connecting New Relic AI (LLM Observability) MCP

Stop guessing about spending. Use query_llm_costs to get the exact dollar amount of your token usage, giving you tight control over infrastructure spend.

Debug slowness fast. Running query_llm_latency provides p95 latency matrices and average response times so you know exactly when your LLM generation is dipping below acceptable speed.

Audit model behavior instantly. Instead of digging through raw logs, use the agent to retrieve detailed chat completion messages, allowing you to verify what the LLM saw or generated.

Measure quality with real data. query_llm_feedback pulls in human supervisor ratings and feedback messages, letting you spot quality regressions immediately after deployment.

Stay ahead of system decay. Running list_apm_apps and list_dashboards lets DevOps check the structural health of your entire environment without leaving the chat window.

New Relic AI (LLM Observability) MCP use cases

01

Debugging an unexpected cost spike

An AI Engineer notices their LLM costs are higher than normal. They ask the agent, 'What was my total token spend last week?' The agent executes query_llm_costs and reports that a specific integration caused a massive spike in usage, allowing the engineer to immediately pinpoint the source.
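
The arithmetic behind that kind of breakdown can be sketched in a few lines. This is an illustration under stated assumptions, not the query_llm_costs implementation: the row shape, the integration names, and the per-1K-token prices are all hypothetical.

```python
from collections import defaultdict

# Hypothetical per-1K-token prices in USD; real prices depend on your provider.
PRICE_PER_1K = {"model-a": 0.50, "model-b": 0.25}


def cost_by_integration(rows):
    """Sum USD cost per integration from (integration, model, tokens) rows."""
    totals = defaultdict(float)
    for integration, model, tokens in rows:
        totals[integration] += tokens / 1000 * PRICE_PER_1K[model]
    return dict(totals)


# Made-up usage rows for one week.
rows = [
    ("support-bot", "model-a", 120_000),
    ("support-bot", "model-b", 30_000),
    ("search", "model-b", 50_000),
]
totals = cost_by_integration(rows)
top = max(totals, key=totals.get)  # the integration with the highest spend
```

With these sample rows, support-bot accounts for 67.50 USD against 12.50 USD for search, which is the kind of gap that points to the source of a spike.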

02

Checking user acceptance of new prompts

An Observability Lead wants to know if recent prompt changes affected quality. They ask the agent to run query_llm_feedback. The agent pulls up a list of ratings, showing that user satisfaction dropped sharply after the change was deployed.
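
A before-and-after comparison like this one comes down to splitting ratings at the deployment date and comparing means. The sketch below assumes feedback arrives as (date, rating) pairs on the 1-5 scale; that shape and the sample values are hypothetical, not the tool's documented output.

```python
from datetime import date
from statistics import mean


def rating_shift(feedback, deployed_on):
    """Return (mean before, mean after) for ratings split at a deploy date.

    Raises statistics.StatisticsError if either side has no ratings.
    """
    before = [rating for day, rating in feedback if day < deployed_on]
    after = [rating for day, rating in feedback if day >= deployed_on]
    return mean(before), mean(after)


# Made-up feedback around a prompt change deployed on 3 January.
feedback = [
    (date(2025, 1, 1), 5), (date(2025, 1, 2), 4),
    (date(2025, 1, 4), 2), (date(2025, 1, 5), 3),
]
before, after = rating_shift(feedback, deployed_on=date(2025, 1, 3))
```

Here the mean rating falls from 4.5 to 2.5 across the deployment.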

03

Validating system readiness before launch

A DevOps team member needs to ensure all monitoring is active. They instruct the agent to run list_apm_apps and check list_alert_policies. The agent confirms that all necessary applications are running and alert triggers are correctly configured.
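
The same check can be expressed as a set difference between what you expect and what the agent reports. The application names below are hypothetical, and the sketch assumes you have already extracted a plain list of names from the list_apm_apps response.

```python
def missing_items(required, reported):
    """Return required names absent from what the agent reported, sorted."""
    return sorted(set(required) - set(reported))


# Hypothetical launch checklist versus the apps the agent listed.
required_apps = ["checkout-api", "chat-gateway"]
reported_apps = ["chat-gateway", "billing-worker"]
gaps = missing_items(required_apps, reported_apps)
```

An empty result means every required application showed up; here checkout-api is missing.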

04

Analyzing slow agent responses

An AI Engineer reports that sometimes the chat feels sluggish. They ask the agent to run query_llm_latency, which returns a matrix showing that the average response time exceeds 2 seconds during peak usage hours.
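
It helps to see why the average and the p95 are reported together. The sketch below computes both from a made-up list of response times in seconds, using the nearest-rank method for the percentile; New Relic may calculate percentiles differently, so treat this as an illustration of the concept only.

```python
from statistics import mean


def p95_nearest_rank(samples):
    """Return the 95th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer form of ceil(0.95 * n), which avoids float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical response times in seconds, with two slow outliers.
latencies = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.6, 4.0]
average = mean(latencies)
p95 = p95_nearest_rank(latencies)
```

With this sample the average is about 1.58 seconds while the p95 is 4.0 seconds, so a chat can feel sluggish to some users even when the average looks acceptable.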

New Relic AI (LLM Observability) MCP tradeoffs

What to watch out for, and the recommended way to handle each one.

Over-relying on raw logs

Avoid

A developer manually filters through thousands of log entries trying to find a specific token cost or latency metric for one single transaction. This takes hours and is prone to human error.

Instead

Ask the agent to run query_llm_costs or query_llm_latency. The tool aggregates this data automatically and presents the precise metrics in plain language.

Assuming system health

Avoid

The team assumes everything is fine because no alerts have triggered, without checking for underlying architectural decay. This leads to unexpected outages.

Instead

Run list_apm_apps and check list_alert_policies. This validates the operational status of every core component in your AI environment.

Ignoring user sentiment

Avoid

The team focuses only on technical performance (latency) but misses that users are finding the output inaccurate or confusing, leading to poor adoption.

Instead

Use query_llm_feedback. This retrieves direct human ratings and comments, providing a critical layer of quality monitoring beyond just technical metrics.

When to use New Relic AI (LLM Observability) MCP

Use this MCP if your primary pain point is understanding the cost, performance, or user reception of your LLM agents without navigating multiple dashboards. It suits observability leads who need global visibility into token consumption and latency benchmarks, and it fits best when you need to answer questions like 'How much did that run cost?' or 'Why was this response slow?'

Don't use this if you just need simple, single-point data retrieval (like checking a status code). For those limited checks, an API integration might suffice. However, because of its ability to consolidate metrics—from query_llm_costs to list_apm_apps—it’s the superior choice for comprehensive auditing.

Frequently asked questions about New Relic AI (LLM Observability) MCP

How does New Relic AI (LLM Observability) track token costs? +

This MCP uses query_llm_costs to calculate your total LLM token spend. It reports USD spend across different models and services, so you don't lose track of what your usage costs.

Can I check my LLM performance latency with this MCP? +

Yes, use query_llm_latency. It pulls p95 latency matrices and average response times, helping you pinpoint exactly when your agent's responses slow down.

What kind of data can I audit with New Relic AI (LLM Observability)? +

You can audit chat completion messages for model behavior and human supervisor feedback using query_llm_feedback. You can also record internal agent states as custom telemetry with post_custom_event.

Is New Relic AI (LLM Observability) read-only? +

Mostly. Query tools such as custom_nrql are strictly read-only, meaning you can pull insights without risking any changes to your live infrastructure. The exception is post_custom_event, which writes custom telemetry rows to New Relic.

Does this MCP help with general system health checks? +

It does. You can use tools like list_apm_apps and list_alert_policies to check the operational status of your entire environment, not just the LLM component.
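
For questions the dedicated tools don't cover, custom_nrql accepts a query you write yourself. The sketch below builds one such query string. The event type LlmChatCompletionSummary and the attributes duration and response.model are assumptions about how your AI monitoring data is named, so check them against your own account before relying on the query.

```python
def latency_nrql(days: int = 7, event_type: str = "LlmChatCompletionSummary") -> str:
    """Build a read-only NRQL query for average and p95 latency per model.

    The event type and attribute names are assumptions for illustration.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    return (
        "SELECT average(duration), percentile(duration, 95) "
        f"FROM {event_type} "
        f"FACET response.model SINCE {days} days ago"
    )


query = latency_nrql(days=7)
```

You can paste the resulting string into a request to your agent, or ask in plain language and let it compose the query.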
