Vinkius
Helicone Observability

Helicone Observability MCP for AI. Track LLM Costs, Latency, and Usage in Conversation

Claude
ChatGPT
Cursor
Gemini
Windsurf
VS Code
JetBrains
Vercel

Works with every AI agent you already use

…and any MCP-compatible client

Helicone (LLM Observability) MCP on Cursor AI Code Editor
Helicone (LLM Observability) MCP on Claude Desktop App
Helicone (LLM Observability) MCP on OpenAI Agents SDK
Helicone (LLM Observability) MCP on Visual Studio Code
Helicone (LLM Observability) MCP on GitHub Copilot AI Agent
Helicone (LLM Observability) MCP on Google Gemini AI
Helicone (LLM Observability) MCP on Lovable AI Development
Helicone (LLM Observability) MCP on Mistral AI Agents
Helicone (LLM Observability) MCP on Amazon AWS Bedrock

Connect to your AI in seconds.

Helicone provides deep observability into your LLM usage by connecting directly to any AI client. It lets you track every request, analyze costs broken down by user or feature, measure real-time latency spikes, and manage prompt versions without logging into a separate dashboard.

You get full visibility across all your upstream LLM calls, all from a conversation with your agent.

What your AI can do

Query costs

Calculates total spending by analyzing properties that drive account charges.

Query feedback

Inspects stored user feedback data to see what users liked or disliked about the output.

Query latency

Retrieves performance metrics, showing how fast requests were processed in real time.

+ 7 more capabilities included
Analyze Spending

Break down total LLM spending by specific models or user groups to understand your exact operational burn rate.

Measure Performance

Identify the slowest parts of a call, measuring Time To First Token (TTFT) and pinpointing latency issues across different AI providers.

Inspect Prompts

View detailed proxy logs to see the exact instructions or data your agent sent to the LLM API.

Review Conversations

Isolate and analyze entire multi-turn conversation histories to debug complex, chained agentic processes.

Track Users and Feedback

Identify your most active human users or log specific user critiques (like thumbs up/down) to improve the core model grounding.

Included with Plan

Helicone (LLM Observability) with 10 Tools

These tools give your agent the raw data it needs to analyze costs, track performance metrics, inspect prompts, and monitor all LLM activity in detail.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Helicone (LLM Observability) on Vinkius

Query Costs

Calculates total spending by analyzing properties that drive account charges.

Query Feedback

Inspects stored user feedback data to see what users liked or disliked about the output.

Query Latency

Retrieves performance metrics, showing how fast requests were processed in real time.

Log Feedback

Logs user critiques or feedback directly into the system for model improvement.

Query Prompts

Pulls detailed logs of prompts and the rate limits associated with them.

List Properties

Lists the custom properties recorded on requests that pass through the gateway, which you can use to segment costs and usage.

Query Requests

Lists the logged requests that passed through the platform gateway, including their prompts and outputs.

Query Sessions

Lists sessions, which group related requests so you can trace multi-step workflows.

Query Users

Checks request history to show which users are interacting with the platform.

Get Prompt Versions

Retrieves historical versions of a prompt, allowing you to compare changes over time.

Security and governance baked right in.

Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so no custom routing is required.

Claude AI

1. Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

2. Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

3. Start a conversation

Open a new chat. The Helicone Observability integration is available immediately — no restart needed.
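
The endpoint URL from step 2 is simply your token placed between a fixed host and the /mcp path. A minimal Python sketch of assembling it is shown here; the helper name `build_endpoint` is hypothetical and not part of any Vinkius SDK, and the check only guards against pasting the placeholder unchanged.

```python
PLACEHOLDER = "[YOUR_TOKEN_HERE]"
BASE_URL = "https://edge.vinkius.com"


def build_endpoint(token: str) -> str:
    """Return the connector URL for a token (hypothetical helper, not a Vinkius API)."""
    token = token.strip()
    # Refuse an empty value or the literal placeholder from the setup instructions.
    if not token or token == PLACEHOLDER:
        raise ValueError("replace the placeholder with your token from cloud.vinkius.com")
    return f"{BASE_URL}/{token}/mcp"


if __name__ == "__main__":
    # "example-token" is a made-up value for illustration only.
    print(build_endpoint("example-token"))
```

Keeping the token out of source control and passing it in at runtime is a sensible default, since the URL itself carries the credential.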

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

  • Import from OpenAPI, Swagger, or YAML specs
  • Create Agent Skills with progressive disclosure
  • Deploy to edge with MCPFusion framework
  • Built in DLP, auth, and compliance on every call
  • Real time usage dashboard and cost metering
  • Publish to catalog or keep private
Start building

Make Your AI Do More

Start with Helicone (LLM Observability), then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

  • Use this MCP plus 5,100+ others, all in one place
  • Add new capabilities to your AI anytime you want
  • Every connection is secured and compliant automatically
  • Track usage and costs across all your servers
  • Works with Claude, ChatGPT, Cursor, and more
  • New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Helicone. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra

V8 Isolated: Sandboxed per request

Zero-Trust Proxy: No stored credentials

DLP Enforced: Policy on every call

GDPR Compliant: EU data residency

Token Compression: ~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 10 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.
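
Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The Python sketch below shows the general shape of such a message for `query_costs`. The `time_range` argument is an assumption made for illustration; the real argument schema for each tool is whatever the server reports via `tools/list`.

```python
import json
from itertools import count

_ids = count(1)  # each JSON-RPC request in a session needs its own id


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # `time_range` is a hypothetical argument name, not a documented one.
    print(json.dumps(tool_call("query_costs", {"time_range": "7d"}), indent=2))
```

In practice your AI client builds and sends these messages for you; the sketch is only meant to show what "calling a tool" means at the protocol level.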

Sifting through logs and spreadsheets for every AI metric is exhausting.

Right now, if your agent acts weird or the bill arrives higher than expected, you're stuck. You have to jump into a dashboard, pull up the log service, cross-reference timestamps with billing reports, and maybe check an outdated Git branch for the prompt version. It takes hours of clicking and copy-pasting just to answer: 'What went wrong?'

With this MCP, you talk to your agent like it's a helpful teammate. Instead of navigating multiple services, you ask natural questions—like 'Where did we spend most on Claude last week?'—and the agent instantly aggregates all that data for you.

Better control over prompt versions using `get_prompt_versions`

Before this, if a prompt change broke something, you were manually tracing through commit history and hoping the old version was still backed up somewhere. You had no easy way to compare exactly what instructions were active last month versus today's rules.

Now, when things break or you want to prove performance improvements, you simply ask your agent to run `get_prompt_versions`. It shows you every recorded change and the exact text of past versions, letting you roll back logic without touching code.
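
Once you have the text of two versions, comparing them is a plain text diff. Here is a minimal Python sketch using the standard library's `difflib`; the two prompt strings are invented for illustration, and the shape of what `get_prompt_versions` actually returns is not documented here, so the sketch assumes you have already extracted each version as a string.

```python
import difflib


def diff_versions(old: str, new: str, old_label: str = "previous", new_label: str = "current") -> str:
    """Return a unified diff between two prompt texts (empty string if identical)."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    )
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative prompt texts, not real Helicone data.
    v1 = "You are a support agent.\nAnswer in one paragraph."
    v2 = "You are a support agent.\nAnswer in three bullet points."
    print(diff_versions(v1, v2, "v1", "v2"))
```

Lines prefixed with `-` exist only in the older version and lines prefixed with `+` only in the newer one, which is usually enough to spot the instruction that changed behavior.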

What your AI can actually do with this

Running an AI application means managing complexity, especially around cost and performance. This MCP gives you total control over that mess. Instead of hopping between billing portals and log viewers, you just ask your agent questions about its own activity. You can find out exactly how much money the system burned yesterday, or pinpoint which LLM provider is causing a latency spike during peak hours.

It even lets you trace complex multi-step workflows to see exactly where an agent failed or slowed down. If you're already using Vinkius for other services, adding this MCP means all your AI infrastructure data lives in one place—right inside your conversation.

Built · Hosted · Managed by Vinkius
Helicone Observability MCP - Track LLM Costs & Latency
Server ID 019d75af-3782-7271-8c2e-071c1a2f6ce4
Vinkius Inspector
Compliance Grade A+
Score 100/100

Questions you might have

How do I check my spending using query_costs?

You ask your agent to run query_costs. It returns a breakdown of your current LLM expenditures, letting you see exactly which models and features are driving the most charges.

Can I use query_latency to find performance issues?

Yes. Running query_latency measures Time To First Token (TTFT) and average speed across all calls, helping you pinpoint exactly which upstream LLM provider is slowing things down.
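
Time To First Token is the gap between sending a request and receiving the first token of the response. As a rough Python sketch of the comparison described above, the function below averages TTFT per provider and returns the slowest one. The record fields (`provider`, `sent_ms`, `first_token_ms`) are hypothetical and do not reflect the actual output format of query_latency.

```python
from collections import defaultdict
from statistics import mean


def slowest_provider(records: list[dict]) -> tuple[str, float]:
    """Return (provider, mean TTFT in ms) for the provider with the highest mean TTFT."""
    by_provider: dict[str, list[float]] = defaultdict(list)
    for rec in records:
        # TTFT = time the first token arrived minus time the request was sent.
        by_provider[rec["provider"]].append(rec["first_token_ms"] - rec["sent_ms"])
    averages = {provider: mean(values) for provider, values in by_provider.items()}
    worst = max(averages, key=averages.get)  # raises ValueError on empty input
    return worst, averages[worst]


if __name__ == "__main__":
    # Made-up sample records for illustration.
    sample = [
        {"provider": "provider-a", "sent_ms": 0, "first_token_ms": 200},
        {"provider": "provider-a", "sent_ms": 100, "first_token_ms": 500},
        {"provider": "provider-b", "sent_ms": 0, "first_token_ms": 900},
    ]
    print(slowest_provider(sample))
```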

What does query_sessions do for debugging?

query_sessions lets the agent list sessions, which group the related requests of a single workflow. That makes it useful for tracing multi-step workflows and seeing how an agent progressed through its tasks.

How do I check which users are active with query_users?

You ask the agent to run query_users. It returns the users who have interacted with your system, so you can confirm that you're tracking usage from all sources.

How do I use get_prompt_versions to audit a prompt's instruction text?

It fetches the exact historical versions of your prompts. You can compare changes, see when grounding rules were updated, and pinpoint exactly what instructions the model received at any given time.

What does query_prompts retrieve about the API inputs?

It retrieves detailed logs of every prompt sent to your LLM APIs. You can inspect these explicit prompts and outputs directly from your agent, which is key for debugging complex workflows.

How do I use log_feedback to gather user critique data?

Using log_feedback captures user ratings like thumbs up or down. This logged data is crucial for offline Human-in-the-Loop evaluation and improving model grounding over time.

What information does query_requests provide about my API usage?

This tool returns a record of every request made through your gateway. It gives a comprehensive view of activity, letting you monitor the total volume and context of all interactions.

Can I see the exact prompt that caused a specific error?

Yes. Use the query_requests tool to fetch direct prompts and outputs from the proxy logs. You can filter by status or custom tags to find the exact interaction that needs debugging.
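
Filtering by status or custom tags amounts to keeping only the records where every condition matches. A small Python sketch of that logic is given here, applied client-side to records you already have. The record layout (`status`, `properties`) is an assumption for illustration and may differ from what query_requests returns.

```python
def find_requests(records, status=None, **properties):
    """Keep records matching a status code and every given custom property."""
    matches = []
    for rec in records:
        if status is not None and rec.get("status") != status:
            continue
        tags = rec.get("properties", {})
        # A record qualifies only if all requested tags are present with equal values.
        if all(tags.get(key) == value for key, value in properties.items()):
            matches.append(rec)
    return matches


if __name__ == "__main__":
    # Made-up request records for illustration.
    logs = [
        {"id": "r1", "status": 200, "properties": {"feature": "search"}},
        {"id": "r2", "status": 500, "properties": {"feature": "search"}},
        {"id": "r3", "status": 500, "properties": {"feature": "chat"}},
    ]
    for rec in find_requests(logs, status=500, feature="search"):
        print(rec["id"])
```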

How do I track costs for a specific customer ID?

Ask your agent to query_costs and include your customer identity in the filter. Helicone maps costs per model and user, allowing you to see exactly how much each client is burning in LLM tokens.
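
Mapping costs per model and user is a grouped sum. The Python sketch below shows the idea for one customer, assuming hypothetical row fields (`user_id`, `model`, `cost_usd`) rather than the real query_costs output.

```python
from collections import defaultdict


def cost_by_model(rows, user_id):
    """Sum cost per model for a single user."""
    totals = defaultdict(float)
    for row in rows:
        if row["user_id"] == user_id:
            totals[row["model"]] += row["cost_usd"]
    return dict(totals)


if __name__ == "__main__":
    # Made-up cost rows for illustration.
    rows = [
        {"user_id": "cust-1", "model": "model-a", "cost_usd": 0.25},
        {"user_id": "cust-1", "model": "model-a", "cost_usd": 0.50},
        {"user_id": "cust-1", "model": "model-b", "cost_usd": 1.00},
        {"user_id": "cust-2", "model": "model-a", "cost_usd": 4.00},
    ]
    print(cost_by_model(rows, "cust-1"))
```

For real billing figures, a decimal type is a safer choice than floats; floats are used here only to keep the sketch short.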

Can my agent log human feedback into Helicone?

Absolutely. Use the log_feedback tool to inject offline Human-in-the-Loop verdicts or text critiques directly into Helicone's database, helping you refine your model's grounding over time.

Built & Managed by Vinkius · 30s setup · 10 tools

We've already built the connector for Helicone Observability. Just plug in your AI agents and start using Vinkius.

No hosting. No infrastructure. No complex setup.
All 10 tools are live and waiting. You're up and running in seconds.

Vinkius gives your AI agents access to the full catalog of app connectors, all fully managed, secure, and enterprise-ready. One subscription, every tool you need.

Zero hosting required · Full MCP catalog included · Enterprise-grade security · Auto-updated by Vinkius

Built, hosted, and secured by Vinkius. You just connect and go.