Weights & Biases MCP for AI. Track model metrics and artifacts via chat.

Q: How do I check my project list using the Weights & Biases MCP?

You call list_wandb_projects. This gives you a clean, simple rundown of every single project folder within your account. It's the best place to start when you don't know where to look.

Q: Can I use list_project_artifacts to see my datasets?

Yes, list_project_artifacts shows all versioned items in a project. It's how you track data lineage: knowing exactly which dataset version trained your model.

Q: How can I compare different training runs with this MCP?

Start by using list_project_runs to get all run IDs, then use the get_run_details tool on each ID you want to compare. The agent summarizes these details for you.

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

Works with every AI agent you already use

…and any MCP-compatible client

Connect to your AI in seconds.

Weights & Biases lets you manage your entire machine learning lifecycle through chat. Track model experiments, monitor real-time training runs, and version control artifacts like datasets and trained models—all without leaving your AI client.

What your AI can do

List all projects

See every project folder within your WandB account to start browsing experiments.

Track specific runs

Retrieve a list of individual experiment attempts, showing their status and basic details.

Get run metrics

Fetch the full summary, including final accuracy, loss values, and hyperparameters for one specific training run.

Find project artifacts

List all versioned assets—like datasets or model checkpoints—associated with a given project.

Monitor hyperparameter sweeps

View the progress and results of automated searches that test different combinations of settings.

Access analysis reports

Retrieve a list of saved, collaborative documents and dashboards for project review.

Weights & Biases: 6 Tools for Experiment Tracking

Use these tools to list projects, track specific run metrics, monitor hyperparameter sweeps, and manage model artifacts.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Weights & Biases on Vinkius

Get Run Details

Retrieves the full metrics and configuration for one particular run ID.

List Project Artifacts

Lists all datasets, models, or files versioned within a project.

List Wandb Projects

Lists every single project folder associated with your account.

List Project Reports

Fetches a list of saved, collaborative analysis documents for review.

List Project Runs

Gets a list of all individual training attempts within a specific project.

List Project Sweeps

Shows the progress and results of automated hyperparameter search tests.

Security and governance baked right in.

Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so there is no routing for you to configure.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.
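For scripted setups, the endpoint pattern above can be assembled in code rather than pasted by hand. The sketch below is illustrative only: the helper name `vinkius_endpoint` and the `VINKIUS_TOKEN` environment variable are assumptions made for this example, and only the URL pattern itself comes from this page.

```python
import os
from urllib.parse import quote

ENDPOINT_TEMPLATE = "https://edge.vinkius.com/{token}/mcp"  # pattern shown above


def vinkius_endpoint(token: str) -> str:
    """Build the MCP endpoint URL for a token (hypothetical helper)."""
    token = token.strip()
    if not token or token == "[YOUR_TOKEN_HERE]":
        raise ValueError("replace the placeholder with your real token")
    # Percent-encode the token so it stays a single path segment.
    return ENDPOINT_TEMPLATE.format(token=quote(token, safe=""))


if __name__ == "__main__":
    # VINKIUS_TOKEN is an assumed variable name; use whatever your setup provides.
    print(vinkius_endpoint(os.environ.get("VINKIUS_TOKEN", "demo-token")))
```

Rejecting the unedited placeholder early gives a clearer error than a failed connection later.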

Start a conversation

Open a new chat. The Weights & Biases integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "weights-biases": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.
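If the configuration file already lists other servers, the entry can be merged in programmatically. This is a minimal sketch that assumes the `mcp_servers` and `serverUrl` keys shown above; the function name `add_vinkius_server` is hypothetical.

```python
import json


def add_vinkius_server(config: dict, token: str, name: str = "weights-biases") -> dict:
    """Return a copy of config with the Vinkius entry added under mcp_servers."""
    servers = dict(config.get("mcp_servers", {}))  # copy so the input is untouched
    servers[name] = {"serverUrl": f"https://edge.vinkius.com/{token}/mcp"}
    return {**config, "mcp_servers": servers}


if __name__ == "__main__":
    existing = {"mcp_servers": {"other": {"serverUrl": "https://example.com/mcp"}}}
    print(json.dumps(add_vinkius_server(existing, "demo-token"), indent=2))
```

Returning a new dictionary instead of mutating the input keeps any existing server entries intact if the result is discarded.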

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Weights & Biases tools with full Vinkius guardrails applied.

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"weights-biases": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

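Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius over Streamable HTTP
client = MultiServerMCPClient({"weights-biases": {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable_http",
}})
tools = asyncio.run(client.get_tools())  # get_tools() is a coroutine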

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

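from crewai import Agent
from crewai_tools import MCPServerAdapter  # pip install "crewai-tools[mcp]"

# Connect securely to Vinkius over Streamable HTTP
server_params = {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable-http",
}

# Keep agent and crew work inside this block so the connection stays open
with MCPServerAdapter(server_params) as vinkius_tools:
    # Assign to Agent (role, goal, and backstory are all required; values are examples)
    researcher = Agent(
        role="Data Researcher",
        goal="Summarize Weights & Biases runs and artifacts",
        backstory="An ML analyst who reviews experiment results.",
        tools=vinkius_tools,
    )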

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on every call
Real time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Weights & Biases, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Weights & Biases. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 6 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

The painful way of checking ML performance history

Today, diagnosing a poor run requires an archaeological dig. You open the dashboard, click on Project A, find Run 42, and copy its hyperparameters into a spreadsheet. Then you have to manually jump over to the Artifacts tab to see which version of the dataset was used for that specific attempt. If you're comparing two runs, you do this entire process twice, copying six different sets of IDs just to confirm lineage.

With this MCP, all that manual clicking and copy-pasting disappears. You ask your agent a single question—for example, 'Compare the metrics between the last successful run and the one before it.' The answer is compiled instantly, providing both the performance data from `get_run_details` and confirming the related artifacts via `list_project_artifacts`. It's just conversation.

Get Model Metrics with get_run_details

Before, you had to navigate deep into a run's dedicated page, find the performance chart, and then scroll through the config panel just to grab the learning rate. It was slow work.

Now, tell your agent: 'Get the final accuracy and config for run ID X.' You get that specific data point delivered immediately in plain text. No clicking required; you just ask.

Support: 24/7, support@vinkius.com

Security: Vinkius Trust Center

SLA: Service Level Agreement

What your AI can actually do with this

You're running complex ML pipelines. You need to know if the latest change in hyperparameters actually hurt performance or if it was just a random fluctuation. This MCP connects directly to your Weights & Biases account, turning deep dashboard diving into simple conversation. Instead of manually filtering through dozens of runs and checking version numbers across separate tabs, you talk to your agent.

It finds the specific metrics—like final accuracy or loss curves—you need for any given run. You can also pull down all related artifacts, like the dataset version used or the model weights created, ensuring data lineage is always clear. The whole process stays secure; Vinkius ensures that every tool call generates a cryptographically signed audit trail, so you always know exactly what metrics flowed through and how your budget was spent.

It’s about getting actionable answers instantly, making your AI agent an actual ML research assistant.

Built · Hosted · Managed by Vinkius Weights & Biases MCP - Track ML Experiments

Server ID 019d761e-f403-7114-a2eb-cbfdb39ba9eb

Vinkius Inspector

Compliance Grade A+

Score 100/100

What Changes When You Connect

Need to compare runs? You can use list_project_runs to get a list of all attempts, then use get_run_details on any specific run ID for its full metric summary—accuracy, loss, config. It keeps you from manually opening 50 tabs.
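The two-step flow described here (list the runs, then fetch details per run) can be sketched as follows. This is not the server's documented interface: the argument names `project` and `run_id`, the `id` field on each run, and the `call_tool` callable (any async function that takes a tool name and an arguments dict and returns parsed data) are all assumptions for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Assumed shape: call_tool(tool_name, arguments) -> parsed result.
ToolCaller = Callable[[str, dict], Awaitable[Any]]


async def fetch_run_details(call_tool: ToolCaller, project: str, limit: int = 2) -> list:
    """List runs in a project, then fetch details for the first `limit` runs."""
    runs = await call_tool("list_project_runs", {"project": project})
    run_ids = [run["id"] for run in runs[:limit]]
    # Fetch the details concurrently; results come back in run_ids order.
    return await asyncio.gather(
        *(call_tool("get_run_details", {"project": project, "run_id": rid}) for rid in run_ids)
    )
```

Passing the tool caller in as a parameter keeps the flow independent of any particular MCP client library, so it can be exercised with a stub before pointing it at a live endpoint.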

Data provenance is critical. If you need proof of what data trained your model, call list_project_artifacts. This shows every versioned dataset and model checkpoint associated with the project.

Automated search tracking used to mean checking a massive dashboard. Now, use list_project_sweeps to monitor hyperparameter optimization progress directly through chat.

You don't want to start from scratch every time. Use list_wandb_projects first to see all your work across different areas of research before diving into any single project.

Need a full historical picture? You can also use list_project_reports to pull up saved analysis and collaborative dashboards, linking documentation directly to the underlying results.

See it in action

01

Diagnosing performance regression

A user notices model accuracy dropped from 0.95 to 0.82. Instead of manually checking logs, they ask their agent to run list_project_runs for the project. They then use get_run_details on the pre-drop and post-drop runs side-by-side. The agent immediately points out a subtle change in the learning rate configured in the hyperparameters.
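The side-by-side comparison in this scenario amounts to a dictionary diff. The sketch below assumes each run's details arrive as a dict with `config` and `summary` keys; that payload shape, the function name `diff_runs`, and the learning-rate values in the demo are illustrative assumptions rather than the tool's documented output.

```python
def diff_runs(before: dict, after: dict) -> dict:
    """Report changed hyperparameters and metric deltas between two runs."""
    cfg_a, cfg_b = before.get("config", {}), after.get("config", {})
    changed = {
        key: (cfg_a.get(key), cfg_b.get(key))
        for key in sorted(set(cfg_a) | set(cfg_b))
        if cfg_a.get(key) != cfg_b.get(key)
    }
    sum_a, sum_b = before.get("summary", {}), after.get("summary", {})
    deltas = {
        key: sum_b[key] - sum_a[key]
        for key in sorted(set(sum_a) & set(sum_b))
        if isinstance(sum_a[key], (int, float)) and isinstance(sum_b[key], (int, float))
    }
    return {"config_changes": changed, "metric_deltas": deltas}


if __name__ == "__main__":
    # Accuracy values come from the scenario; the learning rates are made up.
    good = {"config": {"learning_rate": 0.001, "batch_size": 32}, "summary": {"accuracy": 0.95}}
    bad = {"config": {"learning_rate": 0.01, "batch_size": 32}, "summary": {"accuracy": 0.82}}
    print(diff_runs(good, bad))
```

Only keys whose values differ are reported, so an unchanged batch size drops out and the changed learning rate stands out.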

02

Reproducing old results

A scientist wants to reproduce a paper's findings. They ask their agent about the artifacts for the 'baseline-model' project, calling list_project_artifacts. The agent provides the exact version ID of the dataset and model weights needed, ensuring perfect reproducibility.
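To pin exact versions for a reproduction, it can help to group the listed artifacts by type. The sketch below assumes each artifact record has `name`, `type`, and `version` fields, with versions in the `v0`, `v1`, ... style that W&B uses; the record shape and function names are assumptions for illustration.

```python
import re
from collections import defaultdict


def _version_number(version: str) -> int:
    """Turn 'v12' into 12; anything else sorts first as -1."""
    match = re.fullmatch(r"v(\d+)", version)
    return int(match.group(1)) if match else -1


def versions_by_type(artifacts: list) -> dict:
    """Group 'name:version' references by artifact type, oldest version first."""
    grouped = defaultdict(list)
    for art in artifacts:
        grouped[art.get("type", "unknown")].append(art)
    return {
        kind: [
            f"{a['name']}:{a['version']}"
            for a in sorted(items, key=lambda a: (a["name"], _version_number(a["version"])))
        ]
        for kind, items in grouped.items()
    }
```

Versions are compared numerically rather than as strings, so `v10` sorts after `v2`.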

03

Reviewing team progress

A research lead needs to check on 10 different ongoing experiments. They use list_project_sweeps to see which automated searches are running and get a quick summary of optimization progress, without having to log into the platform's web UI.
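A quick status summary like the one this lead wants can be derived by counting sweep states. The sketch assumes each sweep record carries a `state` field (for example "running" or "finished"); the record shape and the function name `summarize_sweeps` are assumptions, not the tool's documented output.

```python
from collections import Counter


def summarize_sweeps(sweeps: list) -> str:
    """Count sweeps per state and render a one-line summary."""
    counts = Counter(str(s.get("state") or "unknown").lower() for s in sweeps)
    parts = [f"{n} {state}" for state, n in sorted(counts.items())]
    return ", ".join(parts) if parts else "no sweeps"


if __name__ == "__main__":
    print(summarize_sweeps([{"state": "running"}, {"state": "finished"}, {"state": "RUNNING"}]))
```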

04

Auditing project scope

A new team member joins and needs to know what projects exist. They simply ask the agent to call list_wandb_projects, getting a complete, current list of all work done by the team.

The honest tradeoffs

Checking data lineage manually

Anti-pattern

Copying model IDs from one tab and cross-referencing them with dataset versions on another page to confirm they match.

The Fix

Use the agent. First, call list_project_artifacts to see all available assets. Then, use a single query to verify that specific artifacts were used in a run by referencing their names or version IDs.
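The verification step can be pictured as a lookup of `name:version` references against the artifact list. This is a sketch under assumptions: the record fields `name` and `version`, and both function names, are hypothetical and chosen only for the example.

```python
from typing import Optional


def find_artifact(artifacts: list, reference: str) -> Optional[dict]:
    """Look up 'name' or 'name:version' in a list of artifact records."""
    name, _, version = reference.partition(":")
    for art in artifacts:
        if art.get("name") == name and (not version or art.get("version") == version):
            return art
    return None


def missing_references(artifacts: list, references: list) -> list:
    """Return the references that do not match any listed artifact."""
    return [ref for ref in references if find_artifact(artifacts, ref) is None]
```

An empty result from `missing_references` means every artifact a run claims to have used is present in the project at the stated version.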

Forgetting project scope

Anti-pattern

Trying to find metrics for 'Project Alpha' in an account with 15 projects, without knowing which one to start with.

The Fix

Always call list_wandb_projects first. This grounds your query and ensures you are only looking at runs within the correct project.
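Once the project list is in hand, a loosely typed name can be matched to the real project name. The sketch below uses Python's standard `difflib`; the function name `resolve_project` and the sample project names are assumptions for illustration.

```python
import difflib
from typing import Optional


def resolve_project(query: str, project_names: list) -> Optional[str]:
    """Match a user-supplied name to a listed project, tolerating small typos."""
    lowered = {name.lower(): name for name in project_names}
    key = query.lower()
    if key in lowered:
        return lowered[key]  # exact match, ignoring case
    close = difflib.get_close_matches(key, list(lowered), n=1, cutoff=0.6)
    return lowered[close[0]] if close else None


if __name__ == "__main__":
    names = ["project-alpha", "project-beta", "vision-baseline"]
    print(resolve_project("Project Alpha", names))
```

Returning `None` when nothing is close enough lets the caller ask for clarification instead of querying the wrong project.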

Missing critical run context

Anti-pattern

Asking for 'the best performance' without knowing if that metric was measured after 50 epochs or 100. The answer is incomplete.

The Fix

Use get_run_details and ask for specific metrics, for example 'What was the final accuracy and loss on run ID X?' A pointed question returns concrete data rather than a vague summary.

Questions you might have

How do I check my project list using the Weights & Biases MCP?

You call list_wandb_projects. This gives you a clean, simple rundown of every single project folder within your account. It's the best place to start when you don't know where to look.

What does get_run_details do for my ML experiment?

It pulls all the summary metrics and configuration details for a single run ID. This is essential if you need precise data points like loss curves or final hyperparameter values.

Can I use list_project_artifacts to see my datasets?

Yes, list_project_artifacts shows all versioned items in a project. It's how you track data lineage—knowing exactly which dataset version trained your model.

How can I compare different training runs with this MCP?

Start by using list_project_runs to get all run IDs, then use the get_run_details tool on each ID you want to compare. The agent summarizes these details for you.

How does using `list_project_sweeps` help me track automated hyperparameter searches?

It lists all ongoing or completed optimization sweeps within a project. This lets you see how your model performed while automatically adjusting parameters like learning rate and batch size.

What is the purpose of using `list_project_reports` in my ML workflow?

It gathers all saved analysis reports and dashboards created within a project. This feature helps research teams access pre-compiled, collaborative documentation about model performance.

If I need to know the exact parameters used for an experiment, how do I use `get_run_details`?

The tool retrieves full run details, including the precise configuration and hyperparameters used when the training ran. This is crucial for reproducing results or debugging model behavior.

How can I track data lineage by using `list_project_artifacts`?

It lists all versioned assets in a project, such as specific datasets and trained models. You can trace dependencies to ensure that every artifact you use is tied to its correct source version.

Can I check the latest metrics for a specific ML run?

Yes. Using the get_run_details tool, your AI agent can pull the latest logged metrics (like accuracy or loss) and hyperparameters for any specific run ID within your projects.

Is it possible to list versioned datasets and models?

Absolutely. The list_project_artifacts tool allows you to see all artifacts, including datasets and models, helping you track data lineage and versioning directly through conversation.

Can I monitor hyperparameter search sweeps via chat?

Yes. Use the list_project_sweeps tool to monitor automated optimization tasks. Your agent will return a list of sweeps in the project so you can track progress without leaving your workspace.

Fine-Tune AI Models Using MCP Servers

GPT-4 costs $30 per 1M tokens for your classification task. Instead, fine-tune a $0.20/M model on Together AI that scores 96% accuracy, track every experiment in W&B, and save $29.80 per million tokens.

Together AI · Weights & Biases · Google Sheets
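The savings figure in this recipe is the difference between the two per-million-token prices. The sketch below reproduces that arithmetic with the prices quoted above; the 50M-token volume in the demo is an assumed workload, and the calculation ignores the one-off cost of fine-tuning itself.

```python
from decimal import Decimal


def savings_per_million(baseline_price: str, tuned_price: str) -> Decimal:
    """Price difference in dollars per 1M tokens."""
    return Decimal(baseline_price) - Decimal(tuned_price)


def total_savings(tokens: int, baseline_price: str, tuned_price: str) -> Decimal:
    """Savings in dollars for a given token volume."""
    return savings_per_million(baseline_price, tuned_price) * Decimal(tokens) / Decimal(1_000_000)


if __name__ == "__main__":
    print(savings_per_million("30", "0.20"))        # per 1M tokens
    print(total_savings(50_000_000, "30", "0.20"))  # assumed 50M-token workload
```

`Decimal` is used instead of floats so the currency amounts are computed exactly.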
