LangSmith MCP for AI. See exactly how your AI agent runs.
Works with every AI agent you already use
…and any MCP-compatible client








Connect to your AI in seconds.
LangSmith gives you full visibility into your LLM applications. Use this MCP to track performance, debug agent runs, and see exactly where your AI workflows break down.
It gathers aggregate metrics for projects and lets you deep-dive into every step of a single run—essential for any engineer building complex AI systems.
What your AI can do
langsmith_get_run
Retrieves full execution details and inputs/outputs for a single, specific run ID.
langsmith_list_projects
Lists all your tracing projects with key metrics like total runs, median latency, and feedback counts.
langsmith_list_runs
Shows a list of recent traces across a project, detailing status, type (LLM/chain/tool), and token usage.
Get aggregate data about a group of related traces, including total runs and average latency.
Browse all the latest completed or failed agent actions, showing status and token usage for quick checks.
Retrieve every input, output, and timing detail for one specific run to pinpoint the failure point.
LangSmith: 3 Tools for Observability
These three tools let you query project metrics, list recent activity, or get the full execution trace of any specific AI workflow run.
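If you're curious what your AI client actually sends, here is a minimal sketch. MCP clients invoke a tool with a JSON-RPC 2.0 `tools/call` request. The tool names come from this page, but the argument names (`project_name`, `limit`, `run_id`) are assumptions made up for illustration. The real input schema is whatever the server reports in its `tools/list` response.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names below are hypothetical; check the server's tool schemas.
requests = [
    tool_call("langsmith_list_projects", {}),
    tool_call("langsmith_list_runs", {"project_name": "my-project", "limit": 20}),
    tool_call("langsmith_get_run", {"run_id": "<run-id>"}),
]

for request in requests:
    print(json.dumps(request))
```

In practice you never write these by hand. Your client builds them when the model decides to use a tool.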
Make your AI actually useful.
Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.
Start using LangSmith on Vinkius
Security and governance baked right in.
Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, delivering out-of-the-box support for Streamable HTTP, SSE, and OAuth2, with zero messy routing required.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built-in DLP, auth, and compliance on every call
- Real-time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with LangSmith, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 5,100+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by LangSmith. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on every call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This connection provides 3 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.
Debugging complex LLM pipelines feels like detective work.
Right now, when an agent fails in production, your team has to jump through hoops. You're copying run IDs from one dashboard, checking them against logs in a second system, and hoping you can manually stitch together what happened between the tool calls. It takes hours just to figure out if it was bad data input or a faulty model call.
With this MCP, your agent handles the detective work for you. You don't copy-paste IDs anymore; you simply query project metrics using langsmith_list_projects and then drill down with langsmith_get_run. It gives you one centralized view of what happened.
LangSmith MCP provides granular insights into every step.
The biggest time sink disappears when your agent uses langsmith_list_runs to give you a chronological list. You instantly see which steps succeeded, which failed, and how many tokens each part consumed—all without touching a single log file or dashboard tab.
This means you spend zero time gathering evidence and 100% of your time actually fixing the problem.
What your AI can actually do with this
When you're running LLMs or multi-step agents, the execution path can feel like a black box. LangSmith changes that. This connector gives your agent the ability to monitor and debug those tricky workflows in real time. You can look at an entire project's health, checking metrics like median latency or total runs across dozens of models.
If something goes wrong, you don't have to guess; you can get a full trace for any specific run, seeing every input and output. This whole system—listing projects, viewing recent activity, and getting detailed run reports—is all available through Vinkius, letting your AI client manage the complexity for you.
Here's how it actually works
The bottom line is: your agent can now inspect complex AI code paths without needing manual API calls.
Subscribe to this MCP and enter your LangSmith API key.
Your agent uses the connection to monitor LLM calls and agent actions as they happen in production.
The AI client then provides you with structured data, letting you query project metrics or specific run details.
Who is this actually for?
AI Engineers and ML Ops teams use this MCP when they need to move beyond simple logging. If you're tired of debugging production failures by manually checking logs across three different services, this is for you.
When a model starts behaving weirdly in staging, they use this to compare current outputs against historical runs and pinpoint the exact regression.
They watch error rates and latency in the project metrics to catch a problem before it ever hits production. They keep an eye on cost anomalies too.
When a new feature is deployed, they check the median latency across projects to make sure performance hasn't degraded for users.
What Changes When You Connect
Pinpoint failures fast. Instead of just knowing a run failed, you can use langsmith_get_run to see the full execution trace, including which step failed and the inputs and outputs around it.
Track performance over time. Use langsmith_list_projects to compare aggregate metrics like median latency across different versions or environments.
Stay ahead of regressions. Quickly list recent activity with langsmith_list_runs to spot a sudden spike in token usage or an unexpected increase in error statuses.
Isolate model issues. By viewing project and run types, you can easily determine if the slowdown is due to a specific LLM call versus a complex agent action (chain).
Improve reliability. The ability to see associated feedback metrics helps your team prioritize which parts of the workflow need debugging first.
See it in action
The user reports inconsistent answers.
An AI Engineer suspects a specific tool is failing intermittently. They use langsmith_list_runs to filter for failed runs, then use langsmith_get_run on those IDs. This immediately reveals the exact input data that caused the timeout, allowing them to fix the upstream logic.
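The triage in this scenario can be sketched in a few lines. This assumes a langsmith_list_runs result has already been parsed into dictionaries. The field names (`id`, `name`, `status`), the `"error"` status value, and the `run_id` argument are assumptions for illustration, not the connector's confirmed schema.

```python
from typing import TypedDict


class RunSummary(TypedDict):
    id: str
    name: str
    status: str


def failed_run_ids(runs: list[RunSummary]) -> list[str]:
    """Return the ids of runs whose status marks a failure."""
    return [run["id"] for run in runs if run["status"] == "error"]


# Hypothetical rows, shaped like a langsmith_list_runs result.
runs: list[RunSummary] = [
    {"id": "run-a", "name": "search_tool", "status": "success"},
    {"id": "run-b", "name": "search_tool", "status": "error"},
    {"id": "run-c", "name": "summarize", "status": "error"},
]

# Each failed id becomes the argument of one langsmith_get_run follow-up.
follow_ups = [
    {"name": "langsmith_get_run", "arguments": {"run_id": run_id}}
    for run_id in failed_run_ids(runs)
]
print(follow_ups)
```

Your agent does the same thing conversationally: list, filter, then fetch the full trace for each failure.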
The team needs to compare two model versions.
An ML team wants to know if Model B is faster than Model A. With each version traced to its own project, they use langsmith_list_projects to compare median latency side by side, making a data-driven decision on which one to deploy.
A new feature adds unexpected cost.
DevOps notices an unexplained spike in monthly costs. They check langsmith_list_projects for the 'total runs' metric and use langsmith_list_runs to trace back to the specific agent workflow that is running too frequently.
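Tracing a cost spike back to one workflow is mostly counting. Here is a rough sketch, assuming each row from langsmith_list_runs carries a workflow name and a token total. The `name` and `total_tokens` fields and the sample values are hypothetical.

```python
from collections import Counter

# Hypothetical rows, shaped like a langsmith_list_runs result.
runs = [
    {"name": "nightly_digest", "total_tokens": 1200},
    {"name": "support_agent", "total_tokens": 800},
    {"name": "nightly_digest", "total_tokens": 1100},
    {"name": "nightly_digest", "total_tokens": 1300},
]

# How often each workflow ran, and how many tokens it used in total.
run_counts = Counter(run["name"] for run in runs)
token_totals = Counter()
for run in runs:
    token_totals[run["name"]] += run["total_tokens"]

# The workflow with the most runs is the first suspect for a cost spike.
busiest, n_runs = run_counts.most_common(1)[0]
print(busiest, n_runs, token_totals[busiest])
```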
The overall AI pipeline slows down post-deployment.
An engineer notices performance dipping. They check the project metrics using langsmith_list_projects, which shows a sudden rise in latency compared to yesterday's baseline, directing them straight to the bottleneck.
The honest tradeoffs
Treating logs as the source of truth
Assuming that simple application logging (e.g., 'Success') tells you enough about why an agent failed, which is often not the case.
You need structured tracing data. Use langsmith_list_runs to check the status and then use langsmith_get_run to get the full execution trace that explains why it failed.
Only checking the final output
When an agent gives a bad answer, just re-running the whole process without knowing where it went wrong.
Use langsmith_get_run to inspect the inputs and outputs of intermediate steps. This lets you see if the error happened in the LLM call or a specific tool.
Manually aggregating performance data
Exporting hundreds of individual run logs into a spreadsheet just to calculate median latency, which is tedious and prone to errors.
Let langsmith_list_projects handle the heavy lifting. It provides aggregate metrics like total runs and median latency directly in your client.
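For reference, this is roughly the spreadsheet math you skip. The latencies are made-up sample values, not real LangSmith data. The sketch also shows why the median is the more useful number here: one slow outlier drags the mean up but barely moves the median.

```python
from statistics import mean, median

# Hypothetical per-run latencies in seconds, as you might export them by hand.
latencies_s = [0.8, 1.2, 0.9, 4.5, 1.1]

median_latency = median(latencies_s)  # middle value of the sorted list
mean_latency = mean(latencies_s)      # pulled upward by the 4.5 s outlier

print(f"runs={len(latencies_s)} median={median_latency:.2f}s mean={mean_latency:.2f}s")
```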
When It Fits, When It Doesn't
Use this MCP if you need to debug complex, multi-step agent flows. If you just want to see a simple 'Did it work?' status, standard logging is fine. But when the process fails—if an LLM call times out or a tool returns unexpected data—you need deep visibility. LangSmith shines here because it separates listing projects (for high-level metrics), listing runs (for recent history), and getting detailed run information (for root cause analysis). Don't use this if you only care about the final text output; use it when you need to know how that text was generated.
Questions you might have
How do I check overall performance with langsmith_list_projects?
You use langsmith_list_projects to get a summary table. It shows aggregate metrics like median latency and total runs across your entire project group, letting you gauge overall health instantly.
What is the difference between langsmith_list_runs and langsmith_get_run?
langsmith_list_runs gives you a list of recent attempts (the 'what'). langsmith_get_run requires a specific ID to give you the full, deep-dive trace that shows every single input and output from that run.
Can I use LangSmith MCP for simple logging?
No. This MCP is built for tracing complex flows. If your task is just sending a message or updating one record, you don't need this; it handles the complexity of multi-step AI execution.
How do I track performance across my whole app?
You start with langsmith_list_projects. This tool groups all your related traces and provides those aggregate metrics that let you compare project health at a glance.
How do I analyze the full details of a specific failed run using langsmith_get_run?
Pass the run's ID to langsmith_get_run. It returns a complete, deep dive into that single execution. You'll see the entire trace flow, including all inputs and outputs, which lets you pinpoint exactly where the agent ran into an error or unexpected behavior.
Can I use langsmith_list_runs to filter traces by specific types, like only 'tool' calls?
Yes. The tool lists runs and allows filtering by type (LLM, chain, or tool). This is useful because you can isolate the performance data for just one component of your larger agent workflow.
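To make the idea concrete, here is a small sketch of isolating one component once the runs are in hand. The `run_type` and `total_tokens` field names and the sample rows are assumptions for illustration. The actual filter parameter the tool accepts may be named differently.

```python
from collections import defaultdict

# Hypothetical rows; field names are assumptions, not the confirmed schema.
runs = [
    {"run_type": "llm", "total_tokens": 950},
    {"run_type": "tool", "total_tokens": 0},
    {"run_type": "chain", "total_tokens": 1400},
    {"run_type": "llm", "total_tokens": 450},
]


def only_type(rows: list[dict], run_type: str) -> list[dict]:
    """Keep the rows of a single run type (llm, chain, or tool)."""
    return [row for row in rows if row["run_type"] == run_type]


llm_runs = only_type(runs, "llm")

# Token usage summed per run type, to see which component dominates.
tokens_by_type: dict[str, int] = defaultdict(int)
for row in runs:
    tokens_by_type[row["run_type"]] += row["total_tokens"]

print(len(llm_runs), dict(tokens_by_type))
```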
What does langsmith_list_projects show regarding project setup and scope?
It gives a high-level dashboard view of all tracing projects in your account. You immediately get aggregate metrics like total runs, median latency, and feedback counts across entire groups of related traces.
How can I track token usage or specific performance timings using this MCP?
Every run recorded in LangSmith carries these core metrics, and the MCP surfaces them. You see both token counts and precise timing data for every step, whether it's an LLM call or a complex chain execution.
What is LangSmith and why do I need it?
LangSmith is the 'Datadog for LLM applications'. Without observability, AI agents in production are black boxes — you can't see what they're doing, why they fail, or how much they cost. LangSmith traces every LLM call, chain execution, and tool use, giving you complete visibility into inputs, outputs, latency, token usage, and error rates.
Does LangSmith work only with LangChain?
No! While LangSmith is built by the LangChain team and has native LangChain/LangGraph integration, it works with any LLM application. You can trace OpenAI, Anthropic, or any LLM provider directly using the REST API. It also integrates with CrewAI, AutoGen, and other frameworks.
How much does LangSmith cost?
LangSmith offers a generous free tier with 5,000 traces per month — no credit card required. The Developer plan is $39/month with 50,000 traces. Enterprise plans include SSO, RBAC, dedicated support, and unlimited traces with volume discounts.
We've already built the connector for LangSmith. Just plug in your AI agents and start using Vinkius.
No hosting. No infrastructure. No complex setup.
All 3 tools are live and waiting.
You're up and running in seconds.
Vinkius gives your AI agents access to the full catalog of app connectors, all fully managed, secure, and enterprise-ready. One subscription, every tool you need.
Built, hosted, and secured by Vinkius. You just connect and go.