CERN Open Data MCP for AI. Access 66,000+ Particle Physics Datasets Instantly

Works with every AI agent you already use

Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client

Connect to your AI in seconds.

CERN Open Data connects your agent directly to over 66,000 particle physics datasets and research documents from the Large Hadron Collider.

You can query by experiment type, collision energy range, or specific theoretical concept like Dark Matter; it retrieves full metadata, file listings, and technical glossaries.

What your AI can do

Check cern opendata status

Verifies that the connection to the CERN Open Data Portal is active and operational.

Get glossary

Searches the official particle physics glossary for definitions of technical terms, components, or phenomena.

Get portal statistics

Retrieves high-level statistics on the entire data portal's scope, including record counts and available file formats.

+ 13 more capabilities included

Search by scientific parameters

Locate specific records using filters like collision energy (e.g., 13 TeV) or particle collision type (e+e-).

Retrieve detailed record metadata

Fetch complete details for any dataset, including authors' ORCID IDs and the DOI.

List all available experiments

Get a full count and list of major CERN collaborations like CMS, ATLAS, and ALICE.

Browse physics categories

Filter the data pool by broad research topics, such as Exotica or B physics.

Look up technical jargon definitions

Access a specialized glossary to define terms like pseudorapidity or luminosity for reports or presentations.

CERN Open Data with 16 Tools

These tools allow you to query the full spectrum of particle physics data, from high-level statistics to individual file URIs.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using CERN Open Data on Vinkius

Check Cern Opendata Status

Verifies that the connection to the CERN Open Data Portal is active and operational.

Get Glossary

Searches the official particle physics glossary for definitions of technical terms, components, or phenomena.

Get Portal Statistics

Retrieves high-level statistics on the entire data portal's scope, including record counts and available file formats.

Get Record By Doi

Finds the corresponding open data record when you provide a digital object...

Get Record

Fetches comprehensive metadata for a specific dataset ID, detailing authors...

List Categories

Lists all available physics research categories and their associated dataset counts.

List Experiments

Provides an inventory of active CERN collaborations, like CMS or ATLAS, along with the number of datasets each has published.

List Record Files

Lists every file associated with a specific dataset record, providing size and...

Search By Category

Searches the entire repository using physics research categories to narrow down the...

Search By Collision Energy

Filters datasets based on the specific collision energy used during the experiment...

Search By Collision Type

Narrows results by the particle interaction type, such as proton-proton (pp) or...

Search By Experiment

Focuses searches exclusively on data generated by one specific collaboration, like ALICE.

Search Datasets

Performs a broad search across all available fields using keywords plus multiple filters for maximum precision.

Search Documentation

Locates user guides, policies, and technical documentation related to the data or...

Search Software

Finds analysis frameworks, reconstruction tools, and specialized code used in...

Search Supplementaries

Retrieves technical context documents essential for reproducing published scientific...

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the entire backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so no custom routing is required.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The CERN Open Data integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "cern-open-data": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the CERN Open Data tools with full Vinkius guardrails applied.

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and run MCP: Open User Configuration (or create a .vscode/mcp.json file in your workspace).

Add Server Config

Add the Vinkius endpoint configuration to your mcp.json file:

"servers": {
  "cern-open-data": {
    "type": "http",
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

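Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius over Streamable HTTP
client = MultiServerMCPClient(
    {"cern-open-data": {"url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp", "transport": "streamable_http"}}
)
# get_tools() is a coroutine, so run it inside an event loop
tools = asyncio.run(client.get_tools())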

CrewAI

Define the Tool

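Load the Vinkius MCP tools into your CrewAI agents:

from crewai import Agent
from crewai_tools import MCPServerAdapter  # pip install "crewai-tools[mcp]"

# Connect securely to Vinkius over Streamable HTTP
server_params = {"url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp", "transport": "streamable-http"}
vinkius_tools = MCPServerAdapter(server_params)  # call vinkius_tools.stop() when finished

# Assign to Agent (role, goal, and backstory are all required)
researcher = Agent(
    role='Data Researcher',
    goal='Find and describe CERN Open Data records',
    backstory='A physicist who works with open collision data.',
    tools=vinkius_tools.tools
)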

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with CERN Open Data, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by CERN Open Data. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 16 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

The tedious part is compiling a bibliography that actually works.

Currently, if you need to reference a dataset for a paper—say, one related to Dark Matter searches—you must navigate through multiple academic portals. You find the abstract, then cross-reference the DOI on another page, and finally go to a third site just to see if the raw file links are listed. It’s a painful cycle of copy/pasting IDs across disparate web pages.

With this MCP, you ask your agent for the record using its identifier or the DOI. The system immediately pulls the complete metadata—the abstract, the authors' identifiers, and even a list of all associated files via `list_record_files`—all in one structured response. You get the data structure instantly.

The record details are now just one prompt away.

You no longer have to manually track down the specific file formats or check if a dataset is even associated with an experiment. Instead, you ask for the full metadata using `get_record`, and it returns everything: publication date, required collision parameters, and the file distribution summary.

The difference isn't just convenience; it's rigor. Every response comes back in the same structured form, which lets you focus on the physics, not the web interface.
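The DOI-to-files flow described above can be sketched roughly as follows. This is a minimal sketch, not the server's documented interface. The `call_tool` callable stands in for whatever MCP client you use. The argument names (`doi`, `record_id`), the response keys (`id`, `title`, `files`), and the sample DOI are assumptions made for illustration.

```python
from typing import Any, Callable, Dict

# call_tool(name, arguments) -> dict is a hypothetical stand-in for your
# MCP client's tool-invocation method; it is not part of the server.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def fetch_record_with_files(call_tool: ToolCaller, doi: str) -> Dict[str, Any]:
    """Resolve a DOI to a record, then attach that record's file listing."""
    # Assumed argument name "doi" and response key "id".
    record = call_tool("get_record_by_doi", {"doi": doi})
    # Assumed argument name "record_id" and response key "files".
    listing = call_tool("list_record_files", {"record_id": record["id"]})
    return {
        "id": record["id"],
        "title": record.get("title"),
        "files": listing.get("files", []),
    }


def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Offline stub with made-up data, used only to exercise the flow."""
    if name == "get_record_by_doi":
        return {"id": 1, "title": "Example record", "doi": arguments["doi"]}
    if name == "list_record_files":
        return {"files": [{"filename": "example.root", "size": 1024}]}
    raise ValueError(f"unexpected tool: {name}")


# "10.0000/example" is a placeholder DOI, not a real record.
summary = fetch_record_with_files(fake_call_tool, "10.0000/example")
print(summary["title"], len(summary["files"]))
```

Passing the tool caller in as a parameter keeps the flow testable offline; in practice you would swap the stub for your client's real call method.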

Support (24/7): support@vinkius.com

Security: Vinkius Trust Center

SLA: Service Level Agreement

What your AI can actually do with this

Need access to high-energy physics data? This MCP gives your agent direct read access to the CERN Open Data Portal, a massive repository of scientific research. Forget navigating complex web forms just to check an event count or find a specific analysis framework. You query for 'Higgs boson' or 'ATLAS experiment,' and you get metadata right back.

It’s designed for those who need raw data details—full abstracts, author ORCID identifiers, file URIs—without the clicks. Vinkius hosts this connection, making it available to any MCP-compatible client. Your agent instantly becomes a particle physics research assistant, giving you immediate access to datasets and documentation spanning decades of collision history.

Built, hosted, and managed by Vinkius: CERN Open Data MCP - Particle Physics Datasets

Server ID: 019dea5e-810d-724a-a1b6-a359ceb7092c

Vinkius Inspector compliance grade: A+ (score 98.33/100)

What Changes When You Connect

Precision filtering saves time. Instead of browsing general results, you can narrow the search immediately by collision energy using search_by_collision_energy or particle type with search_by_collision_type.

Reproducibility is built in. Need to understand how a result was achieved? Use get_record for full metadata or run list_record_files to see the exact files available for analysis.

No jargon left unexplained. The dedicated get_glossary tool lets you define obscure physics terms instantly, which is critical when writing technical reports.

The scope is visible upfront. Before deep diving, use list_experiments to understand the sheer volume and variety of data contributed by major collaborations like CMS (52k datasets).

Full traceability means confidence. If you have a publication DOI, run get_record_by_doi. It resolves that reference directly into an open dataset record, skipping manual searches.

Beyond just numbers: Use search_supplementaries to find the technical configuration details and guides necessary to actually replicate published research.

See it in action

01

Tracking historical data gaps

The user knows they need to compare LEP era results with modern LHC runs. They first use list_experiments to confirm DELPHI and CMS exist, then combine search_by_collision_type (e+e- for DELPHI; pp for CMS) with get_portal_statistics to gauge the historical scope of available data.

02

Recreating a complex analysis

A researcher finds an abstract but needs the underlying files. They use get_record_by_doi first, then run list_record_files to get file URIs and checksums, finally checking search_supplementaries for the specific analysis configuration needed.

03

Understanding a niche term

The user encounters 'pseudorapidity' in an article. They immediately use the get_glossary tool to get a precise definition, ensuring their report is technically accurate before proceeding with dataset queries.

04

Finding analysis code for a specific topic

A student wants to build a model for Dark Matter. They use search_by_category and filter by 'Exotica,' then run search_software to find the appropriate reconstruction frameworks before they even touch the raw data.

The honest tradeoffs

Treating it like a general search engine

Anti-pattern

The user simply types 'Higgs boson' into a generic search field, getting thousands of unrelated documents and datasets mixed together.

The Fix

Start by using search_datasets with the keyword, but always combine that with filters. For instance, filter by both 'physics category: Higgs Physics' AND 'collision energy: 13 TeV'. This provides actionable results.
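A combined keyword-plus-filter call might be assembled like this. The JSON-RPC envelope follows the MCP `tools/call` convention. The argument names `query`, `category`, and `collision_energy` are assumptions, as is the exact format of the filter values, so check the tool's published input schema before relying on them.

```python
import json
from typing import Any, Dict, Optional


def build_search_request(
    query: str,
    category: Optional[str] = None,
    collision_energy: Optional[str] = None,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build an MCP tools/call request for search_datasets."""
    arguments: Dict[str, Any] = {"query": query}  # assumed argument name
    # Only include the filters that were actually supplied.
    if category is not None:
        arguments["category"] = category  # assumed argument name
    if collision_energy is not None:
        arguments["collision_energy"] = collision_energy  # assumed argument name
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search_datasets", "arguments": arguments},
    }


request = build_search_request(
    "Higgs boson", category="Higgs Physics", collision_energy="13TeV"
)
print(json.dumps(request, indent=2))
```

Most MCP clients build this envelope for you; the sketch only makes visible that the keyword and the filters travel together in a single call.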

Assuming a DOI is enough

Anti-pattern

The user has an old publication reference and assumes the dataset record exists simply because they have the DOI, but doesn't know if it's linked.

The Fix

First, use get_record_by_doi to check for direct linkage. If that fails, broaden the search using search_datasets, combining the publication year and keywords from the reference.
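The DOI-first, search-second fallback can be sketched as below. This is an illustration under assumptions: `call_tool` is a hypothetical stand-in for your MCP client, the argument names (`doi`, `query`) and response keys (`id`, `records`) are guesses, and an unlinked DOI is assumed to come back as a result without an `id`.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for an MCP client's tool-invocation method.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def resolve_reference(
    call_tool: ToolCaller, doi: str, year: int, keywords: List[str]
) -> Dict[str, Any]:
    """Try the DOI first; fall back to a keyword search if nothing is linked."""
    record = call_tool("get_record_by_doi", {"doi": doi})  # assumed argument name
    # Assumption: an unlinked DOI yields a result without an "id" key.
    if record.get("id") is not None:
        return {"source": "doi", "records": [record]}
    # Broaden the search with the publication year and reference keywords.
    query = " ".join(keywords + [str(year)])
    found = call_tool("search_datasets", {"query": query})  # assumed argument name
    return {"source": "search", "records": found.get("records", [])}


def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Offline stub with made-up data: the DOI lookup finds nothing."""
    if name == "get_record_by_doi":
        return {}
    return {"records": [{"id": 7, "title": "Candidate dataset"}]}


# Placeholder DOI, year, and keywords, chosen only for the demonstration.
outcome = resolve_reference(
    fake_call_tool, "10.0000/example", 2012, ["dimuon", "spectrum"]
)
print(outcome["source"], outcome["records"])
```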

Needing files without knowing the record

Anti-pattern

The user finds an abstract but doesn't know which specific dataset ID (the 'record') it belongs to, so they can't get file links.

The Fix

Use get_record after finding a promising candidate via search_datasets. This pulls the record metadata and provides the necessary IDs needed before calling list_record_files.

When It Fits, When It Doesn't

Use this MCP if you need to prove where scientific data comes from or how it was analyzed. Don't use it if your goal is merely general information retrieval, like 'What is particle physics?'; for quick definitions of individual terms, the get_glossary tool is the better fit.

If you know the exact experiment and energy range, run search_by_experiment combined with search_by_collision_energy. If you only have a keyword (e.g., 'Heavy Fermions'), run list_categories first to define the scope, then call search_datasets with the text query and the matching category filter.

Never skip checking the get_portal_statistics endpoint; it gives you immediate context on the scale of the data available.
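The routing advice above amounts to a small decision rule, sketched here. The tool names come from this page; the function name `plan_tool_calls` and its parameters are hypothetical, and the ordering is one reasonable reading of the guidance rather than a required sequence.

```python
from typing import List, Optional


def plan_tool_calls(
    experiment: Optional[str] = None,
    collision_energy: Optional[str] = None,
    keyword: Optional[str] = None,
) -> List[str]:
    """Return the tool names to call, in order, given what is already known."""
    # Always start with portal statistics for context on scale.
    plan = ["get_portal_statistics"]
    if experiment and collision_energy:
        # Exact experiment and energy known: use the two targeted searches.
        plan += ["search_by_experiment", "search_by_collision_energy"]
    elif keyword:
        # Keyword only: pick a category first, then run the filtered search.
        plan += ["list_categories", "search_datasets"]
    return plan


print(plan_tool_calls(experiment="CMS", collision_energy="13TeV"))
print(plan_tool_calls(keyword="Heavy Fermions"))
```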

Questions you might have

How do I search for a specific experiment like ALICE using search_datasets?

You combine search_datasets with the 'experiment' filter. This lets you scope your full-text query specifically to data from that collaboration, giving you highly targeted results.

I found a publication DOI; how do I get the data record using get_record_by_doi?

You pass the DOI directly to get_record_by_doi. This tool resolves the reference ID and returns the dataset's title, type, and direct link if one exists.

What is the best way to find all available physics research topics?

Run list_categories first. It provides a master list of every major topic, like Exotica or B physics, along with an immediate count of datasets for each.

Can I check if the data portal connection is working before querying?

Yes, run check_cern_opendata_status. This simple tool verifies the API connectivity and overall status of the entire CERN Open Data system.

I want to know what specific files are inside a record; how do I use list_record_files?

It returns the filename, size in bytes, checksum, and direct data URI for every file linked to that dataset. This tool is essential because it lets you verify exactly what you'll download before pulling large datasets into your analysis.
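A pre-download check on such a listing might look like the sketch below. The field names (`filename`, `size`, `checksum`, `uri`) mirror the description above but are assumptions about the exact response keys. The `adler32:<hex>` checksum format, the sample entries, and the `https://example.org` URIs are also assumptions made for illustration.

```python
import zlib
from typing import Any, Dict, List


def total_size_bytes(files: List[Dict[str, Any]]) -> int:
    """Sum the reported size of every file in a listing."""
    return sum(entry["size"] for entry in files)


def human_readable(num_bytes: int) -> str:
    """Format a byte count using binary units."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TiB"


def adler32_matches(data: bytes, checksum: str) -> bool:
    """Check data against a checksum written as 'adler32:<8 hex digits>'."""
    algorithm, _, expected = checksum.partition(":")
    if algorithm != "adler32":
        raise ValueError(f"unsupported checksum algorithm: {algorithm}")
    actual = format(zlib.adler32(data) & 0xFFFFFFFF, "08x")
    return actual == expected.lower()


# Made-up listing in the assumed shape; not real portal data.
files = [
    {"filename": "a.root", "size": 1024, "checksum": "adler32:00620062",
     "uri": "https://example.org/a.root"},
    {"filename": "b.root", "size": 512, "checksum": "adler32:00630063",
     "uri": "https://example.org/b.root"},
]

print(human_readable(total_size_bytes(files)))
print(adler32_matches(b"a", files[0]["checksum"]))
```

Summing sizes before downloading and verifying checksums afterwards are independent steps; the second only applies once the file bytes are on disk.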

How do I find out the overall scope of all available physics data using get_portal_statistics?

It provides comprehensive statistics across every facet: record types, years, keywords, and event count distributions. This is the best way to gauge the total volume and composition of the entire CERN dataset repository.

I need instructions on how to use a specific dataset or understand detector setups; should I use search_documentation?

Yes, it searches for guides, policies, and documentation. You'll find titles and abstracts that point you toward usage instructions, detector configurations, or data processing workflows needed for reproduction.

I know the specific collision energy I need; how does search_by_collision_energy help me scope my results?

It filters datasets based on established collision energies (like 13TeV or 7TeV). This lets you quickly narrow tens of thousands of records down to only those matching your precise experimental conditions.

Do I need an API key to use this server?

No. The CERN Open Data Portal API is completely public and requires no authentication. Simply subscribe to this server and enter any placeholder value in the API key field to start querying particle physics datasets immediately.

What kind of data can I access from CERN?

You can access over 66,000 datasets from major LHC experiments (CMS, ATLAS, ALICE, LHCb) and legacy experiments (DELPHI, OPERA). This includes real collision data, Monte Carlo simulations, derived datasets, analysis software, physics glossary entries, and detailed documentation. Data covers Higgs boson searches, Dark Matter studies, exotic particle searches, heavy-ion physics, and more.

Can I use CERN data for machine learning projects?

Absolutely. CERN provides labeled datasets specifically designed for ML applications, including particle identification, jet classification, event reconstruction, and anomaly detection. Use the search tools with queries like 'machine learning' or filter by file type 'csv' or 'nanoaodsim' to find ML-ready formats. The CMS experiment alone has published thousands of simulated datasets with known physics labels.
