# data.world MCP for AI Agents

> data.world connects your AI agent directly to an enterprise data catalog, letting you discover and govern organizational data assets through conversation. You can search across all available datasets and projects, retrieve detailed metadata, track project progress, and list saved SQL or SPARQL queries without ever leaving your chat interface.

## Overview
- **Category:** knowledge-management
- **Price:** Free
- **Tags:** data-catalog, metadata-management, data-governance, collaborative-data, data-discovery, sql-queries

## Description

Need to find a dataset but don't know where it lives? This MCP connects your AI agent straight into data.world, the enterprise catalog platform. You stop clicking through dozens of dashboards just to check if that metric exists. Instead, you ask your agent directly: 'What datasets track global emissions?' The agent searches the entire index for relevant assets and projects across your organization’s data landscape.

It's more than search; it supports governance review too. You can check a project's status, see its members and linked resources, or review recent activity to find out when a dataset was last updated. When working with Vinkius, connecting this MCP gives your agent instant access to the full data catalog, making data discovery part of the conversation itself.

The result: you spend less time gathering metadata by hand and more time analyzing what the data actually says.

## Tools

### get_dataset_details
Retrieves comprehensive details for one dataset, including its field definitions, tags, and license information.

### get_my_profile
Fetches metadata about the currently authenticated user, showing your display name and account permissions.

### get_project_details
Gets detailed information on a specific project, including its members, objective, and the status of its linked resources.

### list_recent_activity
Returns a log of recent actions across your data.world account, such as dataset updates or new contributions.

### list_my_collections
Lists all organizational collections you manage, providing their descriptions and item counts.

### list_dataset_queries
Generates a list of saved SQL or SPARQL query definitions associated with a specific dataset.

### list_my_datasets
Provides a catalog listing of all datasets owned by your account, showing visibility and last modification dates.

### list_project_insights
Lists documented findings, visualizations, or insights that have been published within a specific project workspace.

### list_my_projects
Returns a summary of your data projects, including the linked datasets and your role in each (owner or contributor).

### search_catalog
Searches across the entire data.world platform for relevant datasets or projects using full-text search.

## Prompt Examples

**Prompt:** 
```
I need to analyze the scope for our new climate model. Can you list all relevant datasets?
```

**Response:** 
```
**Climate Data Assets Found:**

*   **NASA GISS Surface Temperature** (Tags: Climate, Global)
    *   *Details:* Tracks global temperature anomalies. Licensed under CC BY 4.0.
*   **UN Emissions Stats** (Tags: Energy, Emissions)
    *   *Details:* Annual CO2 emissions by sector. Last updated: 2024-10-01

Do you want the specific field definitions for either of these?

```

**Prompt:** 
```
What was the last activity on the 'Marketing Analytics' project?
```

**Response:** 
```
Here’s a quick look at recent changes for that project:

*   **2026-06-11:** User Jane Doe updated the `Customer_Churn` dataset definition. (Type: Dataset Update)
*   **2026-06-10:** System added 'Q3 Report' to the project’s collection.
*   **2026-05-28:** John Smith uploaded a new insight on regional spending patterns. (Type: Insight Added)

Looks like activity is steady!
```

**Prompt:** 
```
List the saved queries for our sales data.
```

**Response:** 
```
I found 3 saved query definitions attached to the `Sales Transactions` dataset:

1.  **Monthly Revenue Recap:** (Language: SQL) Calculates total revenue grouped by month.
2.  **Top Performing Regions:** (Language: SPARQL) Identifies regions with highest sales volume.
3.  **Q4 Churn Analysis:** (Language: SQL) Determines customer churn rates for the last quarter.

Let me know if you want to see the full definition for any of these.
```

## Capabilities

### Search the entire data catalog
Find datasets or projects by searching titles, descriptions, or tags across the whole organization’s assets.

### Retrieve detailed dataset metadata
Get specific details for any asset, including field definitions, associated tags, and licensing information.

### List owned datasets and projects
Generate lists of all data assets or active projects you manage on the platform.

### Access saved queries and insights
List documented findings, visualizations, or specific SQL/SPARQL query definitions linked to a project or dataset.

### Monitor data governance activity
Review recent platform activity logs, including updates to datasets or changes in collection membership.
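
As a rough illustration of that activity review, the sketch below filters a list of activity entries by date and counts them by type. The entries and their field names (`date`, `actor`, `type`) are assumptions made for this example; the actual shape of a `list_recent_activity` response may differ.

```python
from collections import Counter
from datetime import date

# Hypothetical entries, shaped like what `list_recent_activity` might return.
# The field names and values are assumptions for illustration only.
activity = [
    {"date": "2026-06-11", "actor": "Jane Doe", "type": "Dataset Update"},
    {"date": "2026-06-10", "actor": "System", "type": "Collection Change"},
    {"date": "2026-05-28", "actor": "John Smith", "type": "Insight Added"},
]


def activity_since(entries, cutoff):
    """Keep only the entries dated on or after the cutoff date."""
    return [e for e in entries if date.fromisoformat(e["date"]) >= cutoff]


# Review everything that happened since the start of June.
recent = activity_since(activity, date(2026, 6, 1))
counts = Counter(e["type"] for e in recent)

for entry in recent:
    print(f'{entry["date"]}  {entry["actor"]}: {entry["type"]}')
print(dict(counts))
```

In practice the agent does this kind of filtering for you; the sketch only shows what "an audit trail of who changed what and when" can look like once the entries are structured.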

## Use Cases

### Figuring out who owns that weird dataset
A data steward needs to audit an asset. Instead of guessing which department owns it, they ask their agent to run `get_project_details` and read the project's member list to see who is responsible.

### Comparing multiple datasets across projects
A data scientist wants to see if 'Sales' data is tracked in three different places. They ask their agent to use `search_catalog` to locate each copy, then run `get_dataset_details` on each match to compare field definitions, tags, and licenses.
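
A minimal sketch of that comparison, assuming a generic `call_tool(name, arguments)` function stands between the agent and the MCP server. It uses `get_dataset_details` for the per-dataset metadata, since that tool is described as returning field definitions. The in-memory catalog, the `query` and `dataset_id` argument names, and the response shapes are all hypothetical.

```python
from typing import Any, Callable

# Hypothetical in-memory stand-in for the catalog, so the sketch runs offline.
FAKE_CATALOG = {
    "sales/transactions": {"fields": ["order_id", "amount", "region"]},
    "finance/sales-summary": {"fields": ["month", "amount"]},
    "hr/headcount": {"fields": ["team", "count"]},
}


def fake_call_tool(name: str, arguments: dict[str, Any]) -> Any:
    """Pretend MCP client; argument names and return shapes are assumptions."""
    if name == "search_catalog":
        term = arguments["query"].lower()
        return [ds_id for ds_id in FAKE_CATALOG if term in ds_id]
    if name == "get_dataset_details":
        return FAKE_CATALOG[arguments["dataset_id"]]
    raise ValueError(f"unknown tool: {name}")


def shared_fields(call_tool: Callable[[str, dict], Any], query: str) -> set[str]:
    """Search first, then fetch details per match and intersect the field names."""
    matches = call_tool("search_catalog", {"query": query})
    field_sets = [
        set(call_tool("get_dataset_details", {"dataset_id": ds_id})["fields"])
        for ds_id in matches
    ]
    return set.intersection(*field_sets) if field_sets else set()


print(shared_fields(fake_call_tool, "sales"))
```

Passing `call_tool` in as a parameter keeps the comparison logic independent of any particular client, so the fake can be swapped for a real connection without touching `shared_fields`.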

### Building a report based on old findings
A knowledge manager needs to understand historical data patterns. They prompt their agent to use `list_project_insights` for a specific project, instantly surfacing documented findings from previous teams.

## Benefits

- Instead of digging through UIs, you can use the `search_catalog` tool to find data assets by title or tag. Your agent handles the index search so you don't have to.
- The MCP lets your agent get detailed metadata using `get_dataset_details`. You immediately see field definitions and licensing info without navigating to a separate asset page.
- Need a project's documented findings? Calling `list_project_insights` pulls them right into the chat, so you don't have to switch tools to read them.
- You can track governance history by running `list_recent_activity`. This gives you an immediate audit trail of who changed what and when.
- Saved queries come back with their definitions, not just their names. `list_dataset_queries` returns the actual SQL or SPARQL for each one, ready for review.

## How It Works

Your AI client uses this MCP to turn data discovery and governance tasks into simple conversations instead of multi-step UI workflows.

1. Connect the data.world MCP to your AI client and authorize it using your API token.
2. Tell your agent what you need, for example 'Show me all datasets related to Q3 sales' or 'What is the status of Project Phoenix?'
3. The agent executes the necessary lookup tools, returning structured metadata like field definitions, project members, and asset lists directly in the chat.
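
Under the hood, step 3 comes down to the client sending MCP `tools/call` requests, which are JSON-RPC 2.0 messages. The sketch below builds one for `search_catalog`. The helper name `build_tool_call` and the `query` argument are assumptions for illustration; the server's actual input schema may use different parameter names.

```python
import json
from itertools import count

# Simple incrementing request ids, as JSON-RPC expects one id per request.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# `query` is an assumed argument name, not confirmed by the tool's schema.
request = build_tool_call("search_catalog", {"query": "Q3 sales"})
print(json.dumps(request, indent=2))
```

Your AI client normally builds and sends these messages for you, so you rarely write them by hand. Seeing the shape is mostly useful when debugging a connection or reading server logs.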

## Frequently Asked Questions

**How does the data.world MCP help me find specific datasets?**
The data.world MCP lets your agent run a full-text search across all assets, so you can pinpoint the dataset you need without knowing its exact location or owner. It's like having a map of every piece of data in the company.

**Can I use this MCP to check project status and ownership?**
Yes. The agent can run tools to get detailed information on any specific project, showing who is a member, what the objective is, and which resources are linked. This saves you from having to open multiple dashboards.

**Is this data.world MCP good for data governance?**
It helps with governance review because your agent can list recent activity, showing who modified an asset or project and when. You get a clear audit trail without doing manual checks.

**What if I need to see the code used in saved queries?**
The MCP can list all saved SQL or SPARQL query definitions for any dataset you specify. It returns each query's language and definition, so you can see exactly how the data is being queried.

**Do I need to be a data scientist to use data.world with this MCP?**
No. While it's powerful for analysts, any role that needs to find or manage corporate information can benefit. It simplifies the process of discovery and validation regardless of your job title.