Comet ML MCP for AI. Audit model metrics and track every experiment detail.

Works with every AI agent you already use

Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client

Connect to your AI in seconds.

Comet ML connects your agent directly to your machine learning research data. You can audit model performance, check specific run parameters, and navigate complex project structures—all by talking to your AI client.

Stop leaving the chat window; keep your entire MLOps workflow running right where you are.

What your AI can do

List workspaces

Lists the top-level workspaces that hold your Comet projects and experiments.

List projects

Identifies the primary organizational buckets where your ML research lives inside Comet.

List experiments

Discovers an array of all logged experiments within a specified workspace or project.

+ 3 more capabilities included

Audit Model Run Performance

Pull high-precision numerical metrics—like accuracy or loss—that were generated during the training cycle.

Inspect Training Configurations

Extract explicit ML properties, such as batch size and learning rates, used for a specific model run.

Map Project Hierarchy

Navigate the entire organizational structure by listing available projects and workspaces within Comet ML.

Review Experiment Metadata

List and review details about specific model runs, including performance tags and status updates.

Comet ML: 6 Tools for MLOps Auditing

These tools give your agent the power to map out projects, list runs, and pull deep-dive metrics and parameters from your entire Comet ML account.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Comet ML on Vinkius

List Workspaces

Lists the top-level workspaces that hold your Comet projects and experiments.

List Projects

Identifies the primary organizational buckets where your ML research lives inside...

List Experiments

Discovers an array of all logged experiments within a specified workspace or project.

Get Experiment

Retrieves detailed information about a specific model run using its unique ID.

Get Experiment Metrics

Calculates and returns time-series data for defined numeric metrics, like loss or...

Get Experiment Params

Inspects the specific hyperparameters—like learning rates—that were used to train a model.

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so there is no routing to configure.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The Comet ML integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "comet-ml": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.
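As a small sketch, the same entry could be generated from a token kept in an environment variable, so the token stays out of files you might commit. The variable name VINKIUS_TOKEN and the helper build_mcp_config are assumptions for illustration; only the URL pattern and the mcp_servers / serverUrl keys come from the snippet above.

```python
import json
import os

ENDPOINT_TEMPLATE = "https://edge.vinkius.com/{token}/mcp"


def build_mcp_config(token: str) -> dict:
    """Return the mcp_servers entry shown above with the token filled in."""
    if not token or "[" in token:
        raise ValueError("provide a real token, not the placeholder")
    return {
        "mcp_servers": {
            "comet-ml": {"serverUrl": ENDPOINT_TEMPLATE.format(token=token)}
        }
    }


if __name__ == "__main__":
    # VINKIUS_TOKEN is a hypothetical variable name; use whatever your setup provides.
    token = os.environ.get("VINKIUS_TOKEN", "example-token")
    print(json.dumps(build_mcp_config(token), indent=2))
```

The placeholder check is only a guard against pasting the template text unchanged.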

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Comet ML tools with full Vinkius guardrails applied.

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"comet-ml": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

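Use MultiServerMCPClient from the adapters package to connect to the Vinkius managed endpoint:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius (transport is an assumption; the adapter also accepts "sse")
url = "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
client = MultiServerMCPClient({"comet-ml": {"url": url, "transport": "streamable_http"}})
tools = asyncio.run(client.get_tools())  # get_tools() is a coroutine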

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

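from crewai import Agent
from crewai_tools import MCPServerAdapter  # pip install "crewai-tools[mcp]"

# Connect securely to Vinkius (transport value is an assumption based on the /mcp endpoint)
server_params = {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable-http",
}

# Build and run your Crew inside this block so the connection stays open
with MCPServerAdapter(server_params) as vinkius_tools:
    # Assign to Agent; goal and backstory are required, the text here is placeholder
    researcher = Agent(
        role='Data Researcher',
        goal='Audit Comet ML experiment metrics and parameters',
        backstory='An analyst who reviews ML training runs.',
        tools=vinkius_tools,
    )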

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Comet ML, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Comet ML. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on every call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 6 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

Manually tracking model progress eats up time.

Right now, checking on an experiment's health means copy-pasting data. You check the dashboard for accuracy, open a separate tab to confirm the learning rate, and then paste both numbers into a spreadsheet to compare against another model's results. It's slow, it adds friction, and it forces constant context switching.

With this MCP, all that happens in one conversation. You talk to your agent, asking specific questions like, 'What was the loss on Model X when its batch size was 32?' The agent runs the necessary calls behind the scenes using tools like `get_experiment_params` and `get_experiment_metrics`, then hands you a clean answer immediately.
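To make that concrete, here is a rough sketch of how the two results could be combined to answer the batch-size question. The data shapes are assumptions: a flat dict for the output of `get_experiment_params` and a list of name/step/value records for `get_experiment_metrics`. The real tool payloads may look different.

```python
from typing import Optional


def final_loss_if_batch_size(params: dict, metrics: list, batch_size: int) -> Optional[float]:
    """Return the last logged loss if the run used the given batch size, else None."""
    # Compare as strings in case the parameter was logged as text.
    if str(params.get("batch_size")) != str(batch_size):
        return None
    losses = [m for m in metrics if m["name"] == "loss"]
    if not losses:
        return None
    # The point with the highest step is treated as the final value.
    return max(losses, key=lambda m: m["step"])["value"]


# Hypothetical sample data standing in for the two tool results.
params = {"batch_size": "32", "learning_rate": "0.001"}
metrics = [
    {"name": "loss", "step": 1, "value": 0.9},
    {"name": "loss", "step": 2, "value": 0.4},
    {"name": "accuracy", "step": 2, "value": 0.8},
]
print(final_loss_if_batch_size(params, metrics, 32))  # 0.4
```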

Comet ML MCP: Get full visibility into your model lifecycle.

The biggest time drain is manually navigating the project structure. You spend minutes clicking through organizational names, trying to remember if that research lives in 'Q3/A' or 'Research/Team Alpha'.

Now, you just ask. The agent uses tools like `list_projects` and `list_workspaces` to map out your entire ML portfolio instantly. You gain immediate context on where everything is stored. It’s a massive time saver.

Support: 24/7, support@vinkius.com

Security: Vinkius Trust Center

SLA: Service Level Agreement

Report Listing: Send Report

What your AI can actually do with this

Managing an ML experiment used to mean jumping between a dashboard, a terminal, and a spreadsheet just to track one metric. This MCP lets you take full control of that lifecycle conversationally. You can ask your AI client for performance data across different runs or pull out specific hyperparameters that were used during training without ever leaving the chat window.

It's designed for deep analysis: listing every workspace in an organization, finding the projects inside each one, and then pulling detailed metrics for any single run you need to audit. When you connect it via Vinkius Marketplace, your agent gains instant access to this whole catalog of ML data tools, making complex audits as simple as asking a question.

Built · Hosted · Managed by Vinkius Comet ML MCP - Audit & Track Model Metrics

Server ID 019d7578-3214-737e-bdc0-d8ba581285b6

Vinkius Inspector

Compliance Grade A+

Score 98.33/100

Who is this actually for?

This MCP is built for engineers who spend too much time switching between tabs. It's for the ML Engineer tired of manually copying metrics from one dashboard to another, and the Data Scientist who needs instant context on why a model failed without leaving their chat client.

Machine Learning Engineer

Verifies training configurations by checking parameters using get_experiment_params or auditing performance metrics with get_experiment_metrics.

Data Scientist

Compares model metrics across multiple trials and navigates different workspaces to keep track of research progress.

MLOps Specialist

Monitors the completion status of large-scale, active evaluations by listing all relevant experiments via list_experiments.

What Changes When You Connect

You don't need to open the web UI. By using this MCP, you can list all projects with list_projects and immediately scope your audit within your chat client.

Debugging a failed run is faster. Instead of guessing what went wrong, ask for the run's parameters and the agent calls get_experiment_params to check the exact learning rate used.

Comparing model performance across multiple runs? Use list_experiments first to see all trials, then call get_experiment_metrics on each one to get clean data points for comparison.
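The list-then-fetch pattern can be sketched as a loop. Everything here is illustrative: call_tool stands in for however your client invokes an MCP tool, and the argument names (project_id, experiment_id) and record fields are assumptions rather than the server's documented schema. A fake backend keeps the example runnable offline.

```python
from typing import Callable, Optional


def best_run_by_final_metric(
    call_tool: Callable[[str, dict], list], project_id: str, metric: str
) -> Optional[tuple]:
    """Return (experiment_id, final_value) for the run with the lowest final metric."""
    finals = {}
    for run in call_tool("list_experiments", {"project_id": project_id}):
        series = call_tool("get_experiment_metrics", {"experiment_id": run["id"]})
        ordered = sorted(series, key=lambda p: p["step"])
        values = [p["value"] for p in ordered if p["name"] == metric]
        if values:
            finals[run["id"]] = values[-1]
    if not finals:
        return None
    return min(finals.items(), key=lambda kv: kv[1])


# Hypothetical data served by a stand-in for the real MCP connection.
FAKE_DATA = {
    "run-a": [{"name": "loss", "step": 1, "value": 0.8}, {"name": "loss", "step": 2, "value": 0.5}],
    "run-b": [{"name": "loss", "step": 1, "value": 0.7}, {"name": "loss", "step": 2, "value": 0.3}],
}


def fake_call_tool(name: str, arguments: dict) -> list:
    if name == "list_experiments":
        return [{"id": run_id} for run_id in FAKE_DATA]
    if name == "get_experiment_metrics":
        return FAKE_DATA[arguments["experiment_id"]]
    raise ValueError(f"unknown tool: {name}")


print(best_run_by_final_metric(fake_call_tool, "demo-project", "loss"))  # ('run-b', 0.3)
```

Lowest-is-best suits loss; for accuracy you would swap min for max.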

Navigating massive ML research portfolios is simple. Call list_workspaces to see the top-level spaces, then list_projects to narrow the focus to the projects inside one workspace.

Real-time monitoring becomes conversational. When you need to know if a long-running job is done, just ask about its status, and the agent looks it up with list_experiments or get_experiment.

See it in action

01

Identifying the source of model drift

A data scientist notices their production model performance dropped last week. They use the agent to call list_experiments for that time window, narrowing down the failing run ID. Then they call get_experiment_metrics on that specific ID to pull loss curves and pinpoint exactly when the performance started degrading.
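One simple way to pinpoint the start of the degradation, once the loss curve is in hand, is to look for the first step where loss climbs noticeably above the best value seen so far. This is a minimal sketch, not the method the tool itself uses. The step/value record shape and the tolerance default are assumptions.

```python
from typing import Optional


def first_degradation_step(points: list, tolerance: float = 0.05) -> Optional[int]:
    """Return the first step where loss exceeds the best loss so far by more than tolerance."""
    best = float("inf")
    for point in sorted(points, key=lambda p: p["step"]):
        if point["value"] > best + tolerance:
            return point["step"]
        best = min(best, point["value"])
    return None


# Hypothetical loss curve: improves until step 4, then jumps at step 5.
loss_curve = [
    {"step": 1, "value": 0.90},
    {"step": 2, "value": 0.60},
    {"step": 3, "value": 0.50},
    {"step": 4, "value": 0.52},
    {"step": 5, "value": 0.70},
]
print(first_degradation_step(loss_curve))  # 5
```

The small rise at step 4 stays inside the tolerance, so only the jump at step 5 is flagged.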

02

Verifying a competitor's claimed baseline

An ML Engineer needs to replicate a reported benchmark. They use the MCP to call list_projects to find the correct research area, then check specific configuration details using get_experiment_params to ensure they are matching the exact batch size and optimizer used.
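Checking a logged configuration against a reported one amounts to a dictionary comparison. The sketch below assumes the output of get_experiment_params can be flattened into a plain dict. The key names and values are made up for illustration.

```python
def config_mismatches(reported: dict, logged: dict) -> dict:
    """Map each reported key to (reported, logged) where the logged value is missing or differs."""
    return {
        key: (expected, logged.get(key))
        for key, expected in reported.items()
        # Compare as strings so 32 and "32" count as a match.
        if str(logged.get(key)) != str(expected)
    }


# Hypothetical values: the benchmark's stated settings vs. what the run logged.
reported = {"batch_size": 32, "optimizer": "adam"}
logged = {"batch_size": "32", "optimizer": "sgd", "learning_rate": "0.001"}
print(config_mismatches(reported, logged))  # {'optimizer': ('adam', 'sgd')}
```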

03

Organizing massive project data

An MLOps team is onboarding a new researcher. They ask the agent, 'Show me all research areas for the Q3 rollout.' The MCP first calls list_workspaces and then uses list_projects to provide a complete map of where all related experiments are stored.

04

Debugging unexpected run failures

A researcher runs an experiment that times out. They use the agent's capability to get the full experiment details via get_experiment, reviewing the logs and structural configurations to understand why the job failed before rewriting the code.

The honest tradeoffs

Asking for 'all data'

Anti-pattern

A user asks, 'Give me everything about my model.' The agent fails because it doesn't know if you mean metrics, parameters, or just the project name.

The Fix

Break it down. Start by calling list_workspaces to define scope. Then call list_projects to narrow that down. Finally, use specific tools like get_experiment_metrics for the data you actually need.

Assuming a single command works

Anti-pattern

A user tries to get all metrics and parameters in one go: 'Show me everything.' This is vague and returns nothing useful.

The Fix

Use distinct tools for different data types. To check configs, run get_experiment_params. To see performance over time, use get_experiment_metrics.

Missing the project context

Anti-pattern

A user asks for metrics without specifying which group of work they are in. The agent fails because it can't find the target.

The Fix

Always start by defining scope with list_workspaces and list_projects to get the overall organizational view before attempting any deep dives.

Questions you might have

How do I find all the metrics for an experiment using get_experiment_metrics?

You must specify the exact experiment ID you want to audit. Then, ask your agent to execute get_experiment_metrics on that ID, and it will return the performance data over time.
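Under the hood, an MCP client sends a tool invocation as a JSON-RPC 2.0 request with the method tools/call. The sketch below builds such a message by hand to show what "execute get_experiment_metrics on that ID" turns into. The argument name experiment_id and the sample ID are assumptions, since the tool's real input schema is not shown on this page.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call_request(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": tool_name, "arguments": arguments},
        }
    )


# "experiment_id" and its value are hypothetical placeholders.
print(tool_call_request("get_experiment_metrics", {"experiment_id": "abc123"}))
```

In practice your AI client builds this message for you; the sketch is only meant to show which pieces of information the call needs.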

Do I need to call list_workspaces before list_projects?

Yes. The hierarchy works top-down. You call list_workspaces first to find the top-level workspace, and then you can call list_projects within that workspace's scope.

Can I check what hyperparameters were used for a model?

Absolutely. Just ask your agent to use get_experiment_params. It will pull the explicit ML properties, like the learning rate and optimizer, that defined that specific run.

What is the difference between list_experiments and get_experiment?

list_experiments shows you an array of many runs in a workspace. get_experiment lets you drill down to pull all the detailed data from one single, specific run.

How do I confirm my API key is active using list_workspaces?

You run list_workspaces. The tool validates your credentials by returning a structured array of top-level organizational spaces. This confirms the connection works before you query specific projects or experiments.

What happens if I use an invalid ID with get_experiment?

The call returns an API error message stating that the experiment ID does not exist. Your agent passes this failure response directly to your client, so you know exactly which ID needs fixing.
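In MCP, a failed tool call comes back as a normal result with isError set to true and the message carried in its content blocks. A client-side helper for surfacing that message might look like the sketch below. The error text in the sample is invented, as the exact wording this server returns is not documented here.

```python
def tool_result_text(result: dict) -> str:
    """Return the text of a tool result, raising if the server flagged an error."""
    text = "\n".join(
        block.get("text", "")
        for block in result.get("content", [])
        if block.get("type") == "text"
    )
    if result.get("isError"):
        raise RuntimeError(text or "tool call failed")
    return text


# Hypothetical failure payload for an unknown experiment ID.
failed = {
    "isError": True,
    "content": [{"type": "text", "text": "Experiment 'bad-id' not found"}],
}
try:
    tool_result_text(failed)
except RuntimeError as err:
    print(f"get_experiment failed: {err}")
```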

Can I limit the results when running list_experiments?

Yes, you pass specific filtering parameters to list_experiments. You can specify criteria like date ranges or status codes, so your agent only returns the exact experiment IDs relevant to your task.
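The page does not list the tool's filter parameter names, so as a fallback illustration here is the same narrowing done on the client side over an already-fetched list. The record fields (id, status, started_at as an ISO timestamp) are assumptions about the result shape.

```python
from datetime import datetime
from typing import Optional


def filter_runs(
    runs: list,
    status: Optional[str] = None,
    started_after: Optional[datetime] = None,
    started_before: Optional[datetime] = None,
) -> list:
    """Return the IDs of runs matching the status and the [after, before) start window."""
    selected = []
    for run in runs:
        started = datetime.fromisoformat(run["started_at"])
        if status is not None and run["status"] != status:
            continue
        if started_after is not None and started < started_after:
            continue
        if started_before is not None and started >= started_before:
            continue
        selected.append(run["id"])
    return selected


# Hypothetical records standing in for a list_experiments result.
runs = [
    {"id": "run-1", "status": "finished", "started_at": "2024-05-01T10:00:00"},
    {"id": "run-2", "status": "running", "started_at": "2024-05-03T09:30:00"},
    {"id": "run-3", "status": "finished", "started_at": "2024-05-08T14:00:00"},
]
print(filter_runs(runs, status="finished", started_after=datetime(2024, 5, 2)))  # ['run-3']
```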

Does get_experiment provide access to raw log traces?

Yes, this tool retrieves the detailed log traces associated with a specific experiment ID. This lets your agent analyze low-level system events that aren't summarized in the standard metrics.

Can my agent retrieve real-time metrics from an active ML run?

Yes. Use the 'get_experiment_metrics' tool with the experiment key. The agent will pull the latest logged numeric values, allowing you to monitor loss, accuracy, and other custom metrics as they are generated.

How do I audit the parameters used in a specific experiment?

Provide the experiment key to your agent. The 'get_experiment_params' tool extracts all logged ML properties, helping you verify hyperparameters like learning rates, batch sizes, and model architectures.

Can I see a list of all experiments within a specific project?

Absolutely. Use the 'list_experiments' tool with the project ID. Your agent will surface all ML runs within that project, including their status and metadata, so you can quickly identify the results you need.
