Arize AI MCP for AI. Monitor model performance and data drift instantly.

Works with every AI agent you already use

Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client.

Connect to your AI in seconds.

Arize AI connects your agent to ML observability. You monitor LLM performance, track model metrics, and check data drift right from your terminal or IDE.

It lets you ingest raw inference logs and run automated evaluations against static datasets without opening a dashboard. This is for engineers who need real-time visibility into their models.

Check Model Status

List all active ML models and retrieve their detailed configuration schemas.

Monitor Performance Metrics

Fetch current observability metrics, including performance scores and data quality reports for any tracked model.

Manage Data Inputs

List available static evaluation datasets or retrieve specific dataset metadata for testing purposes.

Track Live Data Streams

Push raw logs, predictions, and inferences into the platform for immediate visualization and drift analysis.

Control Environments

List configured deployment environments, such as Production or Verification, to ensure data segregation.

Arize AI with 10 Tools

These tools let you interact with the entire Arize observability platform: list models, fetch performance metrics, manage datasets, and trigger automated model evaluations.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Arize AI on Vinkius

List Datasets

Returns a list of all available static evaluation datasets for testing.

List Environments

Lists configured deployment environments (like Production or Training) used to segment model data.

List Evals

Shows a list of automated evaluation runs that have been executed against models.

Get Dataset

Retrieves details for a specific static dataset used in evaluations.

Get Model

Gets metadata, inputs, and outputs for a specific tracked machine learning model.

Ingest Log

Accepts raw telemetry data (payload_json) and sends it into the Arize logging system.

Get Metrics

Fetches real-time observability metrics and performance scores for an ML model.

List Models

Lists all ML models or LLMs currently being tracked within the platform space.

Run Eval

Triggers an automated evaluation run for LLM checks using configured ground truth...

List Spaces

Returns a list of accessible workspaces, which separate different model telemetry...

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so you don't have to configure any routing yourself.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The Arize AI integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "arize-ai": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Arize AI tools with full Vinkius guardrails applied.
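If you provision agent configs from a script, the mapping shown above can be generated rather than hand-edited. The following is only a sketch: the helper name mcp_servers_entry is hypothetical, and it simply reproduces the "mcp_servers" layout from this page with your token substituted in.

```python
import json


def mcp_servers_entry(token: str, name: str = "arize-ai") -> dict:
    """Build the mcp_servers mapping from this page for a given Vinkius token."""
    token = token.strip()
    # Refuse the literal placeholder so a broken URL is never written to disk.
    if not token or "[" in token or "]" in token:
        raise ValueError("replace [YOUR_TOKEN_HERE] with a real token")
    url = f"https://edge.vinkius.com/{token}/mcp"
    return {"mcp_servers": {name: {"serverUrl": url}}}


# "demo-token" is a stand-in value, not a working credential.
print(json.dumps(mcp_servers_entry("demo-token"), indent=2))
```

Keeping the token out of the committed file and injecting it at generation time is one way to avoid leaking it through version control.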

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and run MCP: Add Server (or MCP: Open User Configuration to edit the file directly).

Add Server Config

Add the Vinkius endpoint configuration to your mcp.json file (for example .vscode/mcp.json in your workspace):

"servers": {
  "arize-ai": { "type": "http", "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp" }
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

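Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius (get_tools is a coroutine, so run it in an event loop)
server = {"url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp", "transport": "streamable_http"}
client = MultiServerMCPClient({"arize-ai": server})
tools = asyncio.run(client.get_tools())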

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

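from crewai import Agent
from crewai_tools import MCPServerAdapter  # pip install "crewai-tools[mcp]"

# Connect securely to Vinkius
server_params = {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable-http",
}

# Assign to Agent (the tools stay usable while the adapter is open)
with MCPServerAdapter(server_params) as vinkius_tools:
    researcher = Agent(
        role="Data Researcher",
        goal="Investigate model health in Arize",
        backstory="An MLOps analyst who monitors model metrics.",
        tools=vinkius_tools,
    )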

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on every call
Real time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Arize AI, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Arize AI. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on every call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 10 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.
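To make the protocol a little more concrete: an MCP client invokes a tool by sending a JSON-RPC 2.0 request whose method is tools/call, with the tool name and its arguments in params. The sketch below builds such a message for get_metrics. The argument name model_id is an assumption, since this page does not document the tool's input schema; your client normally constructs this message for you.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id per call


def tool_call_request(tool: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# "model_id" and "model-x" are illustrative; check the tool schema your client reports.
request = tool_call_request("get_metrics", {"model_id": "model-x"})
print(json.dumps(request, indent=2))
```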

Tracking model behavior used to be a multi-tab headache.

Today, finding out why your LLM's output quality dipped means a lot of clicking. You open the Arize dashboard, find the right Space, pull up the correct Model, and then hunt through tabs for the drift metrics or raw logs that explain the drop. It's tedious, slow work.

With this MCP, the agent does it all. You just ask: 'What's wrong with Model X?' The system responds by fetching live metrics, checking data quality, and pointing you straight to the problem, with no clicking required.

Get model status checks directly via `get_model`.

Before writing a single line of code that interacts with an ML service, you used to check the documentation and confirm the expected inputs and outputs by hand. That process was prone to human error.

Now, you simply ask the agent to run `get_model`. It gives you the full metadata in plain text, right where you're working. That’s how you eliminate boilerplate checks.

Support 24/7 support@vinkius.com ↗

Security Vinkius Trust Center ↗

SLA Service Level Agreement ↗

Report Listing Send Report ↗

What your AI can actually do with this

You can connect this MCP to any agent client, giving it full access to your ML observability platform. Forget switching context into heavy graphical dashboards just to see if an LLM prompt hallucinated or if performance dipped. Now, your AI acts like a dedicated MLOps engineer talking to you in plain English.

Need to know what models are running? You can ask the agent to list all tracked ML models. Want to check data quality? It fetches real-time metrics and shows prediction drift flags. The system also lets you push raw logs, predictions, and inferences directly into Arize for immediate tracking using ingest_log.

For governance, you can browse organizational spaces and deployment environments via list_environments, keeping track of Production versus Training data.

Beyond monitoring, the agent handles testing. You can list automated evaluation runs or even trigger a custom check using run_eval against static datasets. It’s about making your ML telemetry workflow conversational; it just works.

Built · Hosted · Managed by Vinkius Arize AI MCP - Monitor ML Models & Data Drift

Server ID 019d7552-62cd-70d2-a1f4-cdbc8fc5e9e7

Vinkius Inspector

Compliance Grade A+

Score 100/100

Report View Report ↗

What Changes When You Connect

Stop context-switching. You don't have to leave your terminal or IDE just because you need to check get_metrics for prediction drift. Your agent does the heavy lifting, keeping your focus on coding.

Better governance means knowing where your data comes from. Use list_environments and list_spaces to separate Production telemetry from Training runs, which is critical for clean audits.

ingest_log allows you to push raw inference payloads programmatically. This guarantees that every piece of observed behavior gets tracked in Arize for later analysis.
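As an illustration of what an ingest_log call might carry, the sketch below packs one inference record into the payload_json parameter named in the tool list. Only payload_json comes from this page; the fields inside the record (model_id, environment, prediction, features, timestamp) are assumptions, because the accepted schema is not documented here.

```python
import json
from datetime import datetime, timezone


def build_ingest_arguments(model_id, prediction, features, environment="Production"):
    """Wrap one inference record as arguments for the ingest_log tool."""
    record = {
        # All keys below are hypothetical; adapt them to the schema Arize expects.
        "model_id": model_id,
        "environment": environment,
        "prediction": prediction,
        "features": features,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # The tool description says it accepts raw telemetry as payload_json.
    return {"payload_json": json.dumps(record)}


args = build_ingest_arguments("model-x", 0.87, {"country": "DE", "basket_size": 3})
print(args["payload_json"])
```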

When you need assurance on model output quality, the agent can list automated evaluation runs (list_evals) or even kick off a new check using run_eval against ground truth data.

The system provides deep visibility into your entire ML stack. You get to see everything from the initial schema definition via get_model all the way through live performance tracking.

See it in action

01

Debugging a Production Drift Spike

A user notices model accuracy dropped in production. Instead of diving into the UI, they ask their agent to check get_metrics for the specific model and then use list_environments to confirm if the issue is isolated to the active deployment space.
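The same two-step check can be sketched in code. Here call_tool stands for whatever function your MCP client exposes for invoking a tool; it is a hypothetical callable, and the argument name model_id and the canned response shapes are assumptions used only to make the example runnable offline.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict], Any]


def triage_drift(call_tool: ToolCaller, model_id: str) -> dict:
    """Fetch metrics for one model, then list environments to scope the issue."""
    metrics = call_tool("get_metrics", {"model_id": model_id})
    environments = call_tool("list_environments", {})
    return {"model_id": model_id, "metrics": metrics, "environments": environments}


def fake_call_tool(name: str, arguments: dict) -> Any:
    """Offline stand-in for a real MCP client; the responses are made up."""
    canned = {
        "get_metrics": {"accuracy": 0.81, "drift_flag": True},
        "list_environments": ["Production", "Training"],
    }
    return canned[name]


report = triage_drift(fake_call_tool, "model-x")
print(report)
```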

02

Setting up a New Evaluation Benchmark

A data scientist needs to test an LLM against new toxicity rules. They first run list_datasets to find available benchmarks, then use get_dataset to confirm the schema, and finally trigger the check with run_eval.
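The three-step sequence in this scenario (list_datasets, then get_dataset, then run_eval) might look like the sketch below. The call_tool parameter is a hypothetical stand-in for your MCP client, and the argument names (dataset_id, model_id) and response shapes are assumptions rather than documented schemas.

```python
def run_benchmark(call_tool, model_id: str, dataset_name: str):
    """Find a dataset by name, confirm its details, then trigger an evaluation."""
    datasets = call_tool("list_datasets", {})
    match = next((d for d in datasets if d["name"] == dataset_name), None)
    if match is None:
        raise LookupError(f"dataset {dataset_name!r} not found")
    details = call_tool("get_dataset", {"dataset_id": match["id"]})
    return call_tool("run_eval", {"model_id": model_id, "dataset_id": details["id"]})


calls = []  # records the order in which tools were invoked


def fake_call_tool(name, arguments):
    """Offline stand-in for a real MCP client; the responses are made up."""
    calls.append(name)
    if name == "list_datasets":
        return [{"id": "ds-1", "name": "toxicity-v2"}]
    if name == "get_dataset":
        return {"id": arguments["dataset_id"], "columns": ["prompt", "label"]}
    return {"status": "queued", **arguments}


result = run_benchmark(fake_call_tool, "model-x", "toxicity-v2")
print(calls)
```

Failing fast when the dataset name is missing keeps the agent from triggering an evaluation against the wrong benchmark.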

03

Capturing Live Inference Data

A developer writes a new feature that makes many calls. They don't want to manually record everything; they simply use ingest_log to push the entire payload stream, guaranteeing Arize sees every single prediction.

04

Auditing Model Readiness

A product manager needs proof that a model is stable before release. They ask the agent to list all active models (list_models), check the candidate model's current performance metrics using get_metrics, and confirm it's running in a verified environment.

The honest tradeoffs

Manual Dashboard Clicking

Anti-pattern

A developer runs the same test five times, manually copying results from one dashboard tab to another for comparison.

The Fix

Use ingest_log repeatedly with your agent. This streams all payloads directly into Arize, giving you a single source of truth for comparison and historical analysis.

Forgetting Environment Context

Anti-pattern

A scientist runs an evaluation using production data when they meant to use the dedicated 'Training' environment.

The Fix

Always verify your boundaries. Use list_environments before any run, and confirm your workspace via list_spaces.
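One way to enforce that check in a script is to resolve the requested environment against the names returned by list_environments before doing anything else. This is a sketch: the helper name require_environment is hypothetical, and it assumes list_environments yields a simple list of names such as 'Production' and 'Training'.

```python
def require_environment(available: list, wanted: str) -> str:
    """Return the canonical environment name, or raise if it is not configured."""
    lookup = {name.lower(): name for name in available}
    try:
        return lookup[wanted.strip().lower()]
    except KeyError:
        raise ValueError(
            f"unknown environment {wanted!r}; available: {sorted(available)}"
        ) from None


# The list below stands in for real list_environments output.
environments = ["Production", "Training"]
print(require_environment(environments, "training"))
```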

Assuming Model Schema is Static

Anti-pattern

The agent fails because the developer didn't realize the model had changed its required inputs or output fields.

The Fix

Always query the schema first. Run get_model to confirm the precise inputs, outputs, and features before attempting any action.
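A schema-first check can be as small as comparing a record against the inputs reported by get_model. The sketch below assumes the response contains an "inputs" list of field names; that shape is an assumption, since this page only says the tool returns metadata, inputs, and outputs.

```python
def missing_inputs(model_schema: dict, record: dict) -> list:
    """Return the required input names that are absent from a record."""
    required = model_schema.get("inputs", [])
    return [name for name in required if name not in record]


# Hypothetical get_model output and a record that lacks one required field.
schema = {"inputs": ["country", "basket_size"], "outputs": ["score"]}
gaps = missing_inputs(schema, {"country": "DE"})
print(gaps)  # ['basket_size']
```

An empty result means the record carries every input the model currently expects, so it is reasonable to proceed.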

When It Fits, When It Doesn't

Use this MCP if your primary bottleneck is context-switching between development tools (like VS Code or a terminal) and observability platforms (Arize). You need a programmatic way to query performance metrics (get_metrics), track live data streams (ingest_log), and manage model lifecycles conversationally. Don't use this if you just need to view historical, static reports; for that, the native Arize UI is fine. If your goal is merely listing models without checking their health, list_models handles it, but combining it with get_metrics provides the real value.

Questions you might have

How do I use the ingest_log tool with Arize AI?

You pass a JSON payload to ingest_log. The agent structures your raw telemetry logs into a valid format and pushes them directly to Arize for analysis.

Can I list all monitored ML models with list_models?

Yes. Running list_models retrieves every tracked ML or LLM model in your current workspace, which helps you narrow down where an issue is occurring.

What's the difference between getting metrics and listing environments?

get_metrics gives quantitative data (performance scores, drift rates) for a specific model. list_environments just shows you the names of available deployment contexts like 'Production' or 'Staging'.

Do I need to use run_eval if I want to test my LLM?

Not always. If you have a specific dataset and just need metrics, get_metrics might suffice. However, run_eval triggers the formal evaluation process against ground truth baselines.

How do I use list_spaces to see all my available workspaces?

Ask your agent to call list_spaces. It lists every organizational space you have access to in Arize, which lets your agent pinpoint exactly which model or telemetry dataset needs monitoring and keeps your work properly segmented.

What information does get_model need about my tracked ML model?

The tool requires the specific name and ID of the model you are tracking. This confirms the metadata, defining all inputs, outputs, and features so your agent knows exactly what to monitor.

What does list_environments show me about my deployment stages?

It shows defined contexts like Production, Training, or Verification. You can use this to restrict monitoring to a specific lifecycle stage, which is critical for accurate reporting before going live.

If I run list_datasets, how do I get the details on a particular dataset using get_dataset?

Pick the dataset you want from the list_datasets output and pass it to get_dataset. The tool retrieves all metadata for the specified dataset, including row counts, column names, and schema information, so you don't have to guess.

Can my AI automatically trigger a hallucination evaluation on a new dataset?

Yes. You can ask your agent to retrieve the specific Ground Truth dataset ID, formulate a testing payload, and invoke the run_eval tool natively. Arize will process the asynchronous scoring internally and log the evaluation securely.

How can I quickly check if a production model is experiencing data drift?

Just tell your agent: 'Fetch the primary metrics for model X'. The AI uses the get_metrics tool to surface latency degradation, prediction drift flags, and incoming data quality indexes without opening the browser.

Is it possible to track telemetry simultaneously for both local development and production environments?

Yes. Arize enforces strict separation using Spaces and Environments. You can instruct your AI agent to query the list_environments tool, figure out the sandbox ID, and push manual test logs strictly to the sandbox scope during debugging sessions, keeping production metrics clean.

Absolutely. Arize enforces strict separation using Spaces and Environments. You can instruct your AI agent to query the list_environments tool, figure out the sandbox ID, and push manual test logs strictly to the sandbox scope during debugging sessions, keeping production metrics clean.
