Comet ML MCP for AI. Audit model metrics and track every experiment detail.
Works with every AI agent you already use, and any MCP-compatible client.
Connect to your AI in seconds.
Comet ML connects your agent directly to your machine learning research data. You can audit model performance, check specific run parameters, and navigate complex project structures—all by talking to your AI client.
Stop leaving the chat window; keep your entire MLOps workflow running right where you are.
What your AI can do
List workspaces
Finds the top-level workspaces in your Comet ML account, each of which groups related projects.
List projects
Identifies the primary organizational buckets where your ML research lives inside Comet.
List experiments
Discovers an array of all logged experiments within a specified workspace or project.
Pull high-precision numerical metrics—like accuracy or loss—that were generated during the training cycle.
Extract explicit ML properties, such as batch size and learning rates, used for a specific model run.
Navigate the entire organizational structure by listing available projects and workspaces within Comet ML.
List and review details about specific model runs, including performance tags and status updates.
Comet ML: 6 Tools for MLOps Auditing
These tools give your agent the power to map out projects, list runs, and pull deep-dive metrics and parameters from your entire Comet ML account.
Make your AI actually useful.
Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.
Start using Comet ML on Vinkius
List Workspaces
Finds the top-level workspaces in your Comet ML account, each of which groups related projects.
List Projects
Identifies the primary organizational buckets where your ML research lives inside...
List Experiments
Discovers an array of all logged experiments within a specified workspace or project.
Get Experiment
Retrieves detailed information about a specific model run using its unique ID.
Get Experiment Metrics
Calculates and returns time-series data for defined numeric metrics, like loss or...
Get Experiment Params
Inspects the specific hyperparameters—like learning rates—that were used to train a model.
Security and governance baked right in.
Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so no custom routing is required.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on every call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with Comet ML, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 5,100+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Comet ML. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
- Cloud Hosted: managed infra
- V8 Isolated: sandboxed per request
- Zero-Trust Proxy: no stored credentials
- DLP Enforced: policy on every call
- GDPR Compliant: EU data residency
- Token Compression: ~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This connection provides 6 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.
The manual process of tracking model progress sucks time.
Right now, checking on an experiment's health means copy-pasting data. You check the dashboard for accuracy, then open a separate tab to see if the learning rate was correct, and finally, you paste those two numbers into a spreadsheet to compare against another model's results. It’s slow, it introduces friction, and frankly, it takes too much context switching.
With this MCP, all that happens in one conversation. You talk to your agent, asking specific questions like, 'What was the loss on Model X when its batch size was 32?' The agent runs the necessary calls behind the scenes using tools like `get_experiment_params` and `get_experiment_metrics`, then hands you a clean answer immediately.
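As a rough illustration of those behind-the-scenes calls, here is a minimal sketch that builds two requests in the standard MCP `tools/call` JSON-RPC shape. The tool names come from this page; the argument key `experiment_id` and the value `exp_123` are hypothetical placeholders, and the real tool schema may use different names.

```python
import json
from itertools import count

# Simple counter so every request gets a distinct JSON-RPC id.
_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Assumption: "experiment_id" and "exp_123" are illustrative, not the real schema.
params_request = tool_call("get_experiment_params", {"experiment_id": "exp_123"})
metrics_request = tool_call("get_experiment_metrics", {"experiment_id": "exp_123"})

print(json.dumps(params_request, indent=2))
print(json.dumps(metrics_request, indent=2))
```

Your AI client normally builds and sends these messages for you; the sketch only shows what one question can translate into.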
Comet ML MCP: Get full visibility into your model lifecycle.
The biggest time drain is manually navigating the project structure. You spend minutes clicking through organizational names, trying to remember if that research lives in 'Q3/A' or 'Research/Team Alpha'.
Now, you just ask. The agent uses tools like `list_projects` and `list_workspaces` to map out your entire ML portfolio instantly. You gain immediate context on where everything is stored. It’s a massive time saver.
What your AI can actually do with this
Managing an ML experiment used to mean jumping between a dashboard, a terminal, and a spreadsheet just to track one metric. This MCP lets you take full control of that lifecycle conversationally. You can ask your AI client for performance data across different runs or pull out specific hyperparameters that were used during training without ever leaving the chat window.
It's designed for deep analysis: listing every project in an organization, finding all associated workspaces, and then pulling detailed metrics for any single run you need to audit. When you connect it via Vinkius Marketplace, your agent gains instant access to this whole catalog of ML data tools, making complex audits as simple as asking a question.
Here's how it actually works
The bottom line is that it turns complex, multi-step data retrieval into a single conversation.
Subscribe to this MCP on Vinkius and enter your Comet ML API Key (you'll find this in the platform’s Account Settings).
Your AI client uses the connection to read your Comet ML hierarchy, so you can scope queries to a specific workspace or project.
You ask a question—for instance, 'What were the metrics for experiment X?'—and your agent executes the necessary calls and returns clean, structured answers.
Who is this actually for?
This MCP is built for engineers who spend too much time switching between tabs. It's for the ML Engineer tired of manually copying metrics from one dashboard to another, and the Data Scientist who needs instant context on why a model failed without leaving their chat client.
Verifies training configurations by checking parameters using get_experiment_params or auditing performance metrics with get_experiment_metrics.
Compares model metrics across multiple trials and navigates different workspaces to keep track of research progress.
Monitors the completion status of large-scale, active evaluations by listing all relevant experiments via list_experiments.
What Changes When You Connect
You don't need to open the web UI. By using this MCP, you can list all projects with list_projects and immediately scope your audit within your chat client.
Debugging a failed run is faster than ever. Instead of guessing what went wrong, ask for parameters, and use get_experiment_params to instantly check the exact learning rates used.
Comparing model performance across multiple runs? Use list_experiments first to see all trials, then call get_experiment_metrics on each one to get clean data points for comparison.
Navigating massive ML research portfolios is simple. Call list_workspaces to see the top-level spaces, then list_projects to narrow the focus to a single workspace's projects.
Real-time monitoring becomes conversational. When you need to know if a long-running job is done, just ask about its status, and the MCP handles the heavy lifting.
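The multi-run comparison described above could look something like the following sketch once the metric data is in hand. The data shape (a list of `step`/`value` points per run) and the run names are assumptions made for illustration; the actual output of get_experiment_metrics may be structured differently.

```python
from typing import Dict, List, Tuple

# Hypothetical shape: one {"step": ..., "value": ...} point per logged metric value.
MetricPoint = Dict[str, float]


def final_value(points: List[MetricPoint]) -> float:
    """Return the value logged at the highest step."""
    return max(points, key=lambda p: p["step"])["value"]


def rank_runs(loss_by_run: Dict[str, List[MetricPoint]]) -> List[Tuple[str, float]]:
    """Sort runs by their final loss, lowest first."""
    finals = {run: final_value(points) for run, points in loss_by_run.items()}
    return sorted(finals.items(), key=lambda item: item[1])


# Hand-written sample data standing in for two get_experiment_metrics results.
sample = {
    "run_a": [{"step": 1, "value": 0.9}, {"step": 2, "value": 0.4}],
    "run_b": [{"step": 1, "value": 0.8}, {"step": 2, "value": 0.3}],
}

print(rank_runs(sample))
```

In practice the agent does this comparison for you; the sketch just makes the "list, then fetch each, then compare" pattern concrete.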
See it in action
Identifying the source of model drift
A data scientist notices their production model performance dropped last week. They use the agent to call list_experiments for that time window, narrowing down the failing run ID. Then they call get_experiment_metrics on that specific ID to pull loss curves and pinpoint exactly when the performance started degrading.
Verifying a competitor's claimed baseline
An ML Engineer needs to replicate a reported benchmark. They use the MCP to call list_projects to find the correct research area, then check specific configuration details using get_experiment_params to ensure they are matching the exact batch size and optimizer used.
Organizing massive project data
An MLOps team is onboarding a new researcher. They ask the agent, 'Show me all research areas for the Q3 rollout.' The MCP first calls list_workspaces and then uses list_projects to provide a complete map of where all related experiments are stored.
Debugging unexpected run failures
A researcher runs an experiment that times out. They use the agent's capability to get the full experiment details via get_experiment, reviewing the logs and structural configurations to understand why the job failed before rewriting the code.
The honest tradeoffs
Asking for 'all data'
A user asks, 'Give me everything about my model.' The agent fails because it doesn't know if you mean metrics, parameters, or just the project name.
Break it down. Start by calling list_workspaces to define scope. Then call list_projects to narrow that down. Finally, use specific tools like get_experiment_metrics for the data you actually need.
Assuming a single command works
A user tries to get all metrics and parameters in one go: 'Show me everything.' This is vague and returns nothing useful.
Use distinct tools for different data types. To check configs, run get_experiment_params. To see performance over time, use get_experiment_metrics.
Missing the project context
A user asks for metrics without specifying which group of work they are in. The agent fails because it can't find the target.
Always start by defining scope: call list_workspaces and list_projects to get the organizational view before attempting any deep dives.
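The tradeoffs above mostly come down to asking for one kind of data per tool. As a hedged sketch, the mapping below turns a list of specific data types into one planned call each and rejects vague requests such as "everything". The data-type labels, the `plan_calls` helper, and the `experiment_id` argument are hypothetical; only the tool names are taken from this page.

```python
from typing import Dict, List

# Tool names are from this page; the data-type labels on the left are illustrative.
TOOL_FOR_DATA_TYPE: Dict[str, str] = {
    "params": "get_experiment_params",
    "metrics": "get_experiment_metrics",
    "details": "get_experiment",
}


def plan_calls(experiment_id: str, wanted: List[str]) -> List[dict]:
    """Map each requested data type to exactly one tool call."""
    unknown = [w for w in wanted if w not in TOOL_FOR_DATA_TYPE]
    if unknown:
        # A vague request has no matching tool, so fail early with a hint.
        raise ValueError(f"Ask for specific data types, not {unknown}")
    return [
        # Assumption: "experiment_id" is a placeholder argument name.
        {"name": TOOL_FOR_DATA_TYPE[w], "arguments": {"experiment_id": experiment_id}}
        for w in wanted
    ]


plan = plan_calls("exp_123", ["params", "metrics"])
print([call["name"] for call in plan])
```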
When It Fits, When It Doesn't
Use this MCP if your primary pain point is retrieving, comparing, or auditing structured ML data (metrics, parameters) from multiple runs and projects. You need a conversational way to access complex MLOps metadata. Don't use it if you are trying to run code, manage credentials, or interact with external systems like databases—for that, look for dedicated database or scripting MCPs. If your goal is simply organizational charting without performance data, a basic resource inventory tool might suffice; but if metrics and parameters are key, this is your best bet.
Questions you might have
How do I find all the metrics for an experiment using get_experiment_metrics?
You must specify the exact experiment ID you want to audit. Then, ask your agent to execute get_experiment_metrics on that ID, and it will return the performance data over time.
Do I need to call list_projects before list_workspaces?
No, it is the other way around. The hierarchy works top-down: workspaces contain projects, and projects contain experiments. Call list_workspaces first to find the top-level space, then call list_projects within that workspace's scope.
Can I check what hyperparameters were used for a model?
Absolutely. Just ask your agent to use get_experiment_params. It will pull the explicit ML properties, like the learning rate and optimizer, that defined that specific run.
What is the difference between list_experiments and get_experiment?
list_experiments shows you an array of many runs in a workspace. get_experiment lets you drill down to pull all the detailed data from one single, specific run.
How do I confirm my API key is active using list_workspaces?
You run list_workspaces. The tool validates your credentials by returning a structured array of top-level organizational spaces. This confirms the connection works before you query specific projects or experiments.
What happens if I use an invalid ID with get_experiment?
The call returns a precise API error message stating that the experiment ID does not exist. Your agent passes this failure response directly to your client, letting you know exactly which experiment needs fixing.
Can I limit the results when running list_experiments?
Yes, you pass specific filtering parameters to list_experiments. You can specify criteria like date ranges or status codes, so your agent only returns the exact experiment IDs relevant to your task.
Does get_experiment provide access to raw log traces?
Yes, this tool retrieves detailed cloud logging traces associated with a specific experiment ID. This lets your agent analyze low-level system events that aren't summarized in the standard metrics.
Can my agent retrieve real-time metrics from an active ML run?
Yes. Use the 'get_experiment_metrics' tool with the experiment key. The agent will pull the latest logged numeric values, allowing you to monitor loss, accuracy, and other custom metrics as they are generated.
How do I audit the parameters used in a specific experiment?
Provide the experiment key to your agent. The 'get_experiment_params' tool extracts all logged ML properties, helping you verify hyperparameters like learning rates, batch sizes, and model architectures.
Can I see a list of all experiments within a specific project?
Absolutely. Use the 'list_experiments' tool with the project ID. Your agent will surface all ML runs within that project, including their status and metadata, so you can quickly identify the results you need.
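For the invalid-ID question above, this is a minimal sketch of how a client might surface a failed tool call. It assumes the standard MCP tool-result shape (a `content` list of text parts plus an `isError` flag); the error text in the sample is hand-written and is not Comet ML's actual message.

```python
def summarize_result(result: dict) -> str:
    """Join the text parts of an MCP tool result and flag failures."""
    text = " ".join(
        part.get("text", "")
        for part in result.get("content", [])
        if part.get("type") == "text"
    )
    if result.get("isError", False):
        return f"Tool call failed: {text}"
    return text


# Hand-written sample results; real responses will carry different text.
failed = {
    "content": [{"type": "text", "text": "Experiment not found"}],
    "isError": True,
}
succeeded = {"content": [{"type": "text", "text": "status: finished"}]}

print(summarize_result(failed))
print(summarize_result(succeeded))
```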
We've already built the connector for Comet ML. Just plug in your AI agents and start using Vinkius.
No hosting. No infrastructure. No complex setup.
All 6 tools are live and waiting.
You're up and running in seconds.
Vinkius gives your AI agents access to the full catalog of app connectors, all fully managed, secure, and enterprise-ready. One subscription, every tool you need.
Built, hosted, and secured by Vinkius. You just connect and go.