# Datos.gob.es Catalog MCP

> Datos.gob.es (Catálogo Nacional) provides access to Spain's entire repository of public sector data. Connect your AI client to search, filter, and analyze thousands of open datasets from national, regional, and local governments. You can narrow results by theme, specific publisher, geographic area, or file format.

## Overview
- **Category:** knowledge-management
- **Price:** Free
- **Tags:** open-data, spain, public-sector, datasets, transparency

## Description

This MCP lets you query the official Datos.gob.es API directly through your AI agent. Instead of navigating dozens of government websites, each with its own search system, you ask for what you need and get structured metadata back. You can find datasets across national, regional, and local administrations by filtering on theme (such as health or economy), province, or publisher. The same tools let you inspect a dataset's distributions or check which organizations publish certain records.

By connecting through Vinkius, your AI client accesses the full catalog, so you don't need to know where to start searching. You ask for a dataset, say 'all pollution reports in Andalusia formatted as CSV', and get a targeted list of candidates.

## Tools

### get_country_spain
Pulls general information about Spain at the national level.

### get_dataset
Retrieves details for a specific dataset using its unique URI identifier.

### get_province
Gets information about one specific province by ID.

### get_public_sector
Retrieves data for a single primary sector using its ID.

### get_region
Gets information about a specific Autonomous Community (Region) by ID.

### list_datasets_by_date
Filters and lists datasets based on a specified modification date range.

### list_datasets_by_format
Limits the dataset list to records available in a specific file format (like CSV or JSON).

### list_datasets_by_keyword
Filters the entire catalog by general keywords provided by the user.

### list_datasets_by_publisher
Narrows down results to only those datasets published by a specific organizational ID.

### list_datasets_by_spatial
Restricts the search scope to include data tied to certain geographic areas (e.g., Valencia and Madrid).

### list_datasets_by_theme
Filters datasets by a specific category or overarching theme, like 'Health' or 'Economy'.

### list_datasets
Retrieves a complete list of all available datasets in the national catalog.

### list_distributions_by_dataset
Lists the distributions (the available files and their formats) for a given dataset ID.

### list_distributions_by_format
Filters distributions to show only those available in a certain format, like CSV.

### list_distributions
Gets a list of all available data distributions and formats.

### list_provinces
Pulls a complete list of all provinces within Spain.

### list_public_sectors
Provides the full taxonomy and list of primary economic sectors used by publishers.

### list_publishers
Retrieves a comprehensive list of every organization that has published data in the catalog.

### list_regions
Gets an exhaustive list of all Autonomous Communities (Regions) in Spain.

### list_spatial_options
Retrieves all possible geographic coverage boundaries for filtering data.

### list_themes
Lists every available category or theme used to classify datasets in the catalog.

### search_datasets_by_title
Searches the national catalog specifically by matching keywords against dataset titles.

## Prompt Examples

**Prompt:** 
```
Search for datasets related to 'transporte' in the Spanish open data catalog.
```

**Response:** 
```
I've found several datasets related to 'transporte'. Notable entries include 'Transporte Terrestre de Viajeros' and 'Red de Carreteras'. Would you like the full details for any of these?
```

**Prompt:** 
```
List all datasets that are available in CSV format.
```

**Response:** 
```
I am retrieving datasets with CSV distributions... I've found a list of datasets including 'Censo de locales' and 'Presupuestos municipales'. I can provide the download links if you're interested.
```

**Prompt:** 
```
Show me the themes or categories available in the Datos.gob.es catalog.
```

**Response:** 
```
The catalog is organized into several themes such as 'Hacienda', 'Cultura y Ocio', 'Medio Ambiente', and 'Salud'. Which category would you like to explore?
```

## Capabilities

### Search datasets by keyword or title
Find relevant open data records across the national catalog using general search terms.

### Filter data by specific metadata criteria
Narrow down results based on themes (e.g., education), publishers, dates, or file formats like JSON and CSV.

### Analyze geographical scope
Limit your search to a precise area, whether it's an entire region, a specific province, or the whole country of Spain.

### Inspect data distributions and metadata
Get detailed information about any dataset, including its update frequency, issuing body, and available file types.

### List catalog taxonomy elements
Retrieve lists of available themes, regions, provinces, or publishers to guide complex filtering queries.

## Use Cases

### Comparing regional spending on education
A policy researcher needs to compare educational budgets across multiple regions. Instead of visiting each region's site, they can use `list_datasets_by_theme` (filtering for 'Education') and then combine that search with the results from `list_regions` to get a unified list of candidate datasets.

### Finding all raw transport data by format
A developer needs all available transportation data, but only in JSON for their application. They can run `list_datasets_by_keyword` ('transport') and `list_datasets_by_format` (JSON), then keep the datasets that appear in both results to build a targeted list of candidates.
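
A minimal sketch of combining the two result lists on the client side, assuming each tool returns a list of records that carry a dataset URI. The `uri` and `title` field names and the sample records are hypothetical stand-ins, not the MCP's documented response shape.

```python
def intersect_by_uri(keyword_results, format_results):
    """Keep keyword hits that also appear in the format-filtered list."""
    format_uris = {item["uri"] for item in format_results}
    return [item for item in keyword_results if item["uri"] in format_uris]


# Hypothetical, hand-written stand-ins for the two tool responses.
keyword_hits = [
    {"uri": "https://example.org/dataset/a", "title": "Transporte A"},
    {"uri": "https://example.org/dataset/b", "title": "Transporte B"},
]
json_hits = [
    {"uri": "https://example.org/dataset/b", "title": "Transporte B"},
    {"uri": "https://example.org/dataset/c", "title": "Censo C"},
]

matches = intersect_by_uri(keyword_hits, json_hits)
print([m["title"] for m in matches])
```

Matching on the URI rather than the title avoids false matches when two publishers use the same dataset name.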

### Checking data availability before writing code
A software engineer needs to know the exact structure and update cycle for unemployment figures. They use `get_dataset` with a known URI identifier, checking the returned metadata before writing any integration code.

### Narrowing down complex datasets by geography
A student only cares about data from the Basque Country concerning tourism. They combine `list_datasets_by_spatial` (filtering for 'País Vasco') with a keyword search to isolate the most relevant records.

## Benefits

- You don't waste time browsing government sites. By using `list_datasets_by_theme`, you immediately focus on the correct topic, skipping hundreds of irrelevant results.
- Pinpoint your search with geographical filters. Need health stats for Madrid only? Use `list_provinces` to look up Madrid's ID, then filter by that ID to cut down noise.
- Developers gain speed. Instead of guessing API endpoints, you can use `get_dataset` once you have the URI identifier, getting all the necessary metadata in one call.
- Format constraints are handled up front. If you know your final output needs to be CSV, you filter early using `list_datasets_by_format`, saving downstream processing steps.
- The scope is always clear. Use `list_publishers` and `list_public_sectors` together to understand who generated the data, giving context to the figures themselves.

## How It Works

You tell your AI client what data you're after, and it handles the filtering across Spain's public sector catalog.

1. Subscribe to this MCP on Vinkius. This establishes the connection between your preferred AI client and the Datos.gob.es API.
2. Tell your agent exactly what you need—for instance, 'Find datasets about waste management in Valencia.'
3. The MCP executes the necessary filters (theme, region) against the national catalog and returns a list of matching dataset details.
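
For readers curious what step 3 looks like on the wire: MCP clients invoke tools with a JSON-RPC `tools/call` request. The sketch below builds such a request in Python. The `keyword` argument name is an assumption, since this page does not list each tool's parameter schema, and in practice your AI client constructs this message for you.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "keyword" is a hypothetical argument name used for illustration only.
request = build_tool_call(1, "list_datasets_by_keyword", {"keyword": "residuos"})
print(json.dumps(request, indent=2, ensure_ascii=False))
```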

## Frequently Asked Questions

**How do I find all datasets related to 'water management'? (list_datasets_by_keyword)**
You can use `list_datasets_by_keyword` for a broad search. This gives you initial candidates, but it's better practice to also run `list_datasets_by_theme` to ensure the results fall under the correct official category.

**Can I filter data by multiple regions? (list_spatial_options)**
Yes. Use `list_spatial_options` to confirm the available boundaries, then pass the ones you need when calling `list_datasets_by_spatial`. This lets you look at multi-regional comparisons.

**What if I only know the publisher's name? (list_publishers)**
First, run `list_publishers` to get the exact ID. Then, use that precise ID in `list_datasets_by_publisher`. Using the ID is much more reliable than using a name.
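
The name-to-ID step can be sketched as a small lookup over the `list_publishers` output. This assumes the tool returns records with an ID and a display name; the `id` and `name` field names and the sample records below are hypothetical.

```python
def find_publisher_ids(publishers, name_fragment):
    """Return IDs of publishers whose name contains the fragment (case-insensitive)."""
    needle = name_fragment.casefold()
    return [p["id"] for p in publishers if needle in p["name"].casefold()]


# Hypothetical stand-in for a list_publishers response.
publishers = [
    {"id": "PUB-001", "name": "Ayuntamiento de Ejemplo"},
    {"id": "PUB-002", "name": "Ministerio de Ejemplo"},
]

ids = find_publisher_ids(publishers, "ministerio")
print(ids)
```

If the lookup returns more than one ID, it is safer to ask the user which organization they mean than to pick the first match.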

**Does this MCP handle data for private companies?**
No. This catalog focuses exclusively on public sector information from national, regional, and local administrations in Spain.

**How do I use `list_datasets_by_format` to filter for specific file types like CSV or JSON?**
You pass the desired format (e.g., 'csv' or 'json') directly into the tool call. This filters the entire catalog, so you only get datasets that offer a distribution in that format. It's a fast way to narrow down your search results based on how you plan to consume the data.

**I need data updated recently; can I use `list_datasets_by_date`?**
Yes. Specify a start and end date when calling `list_datasets_by_date`, and the MCP returns the datasets that were modified within that range. This is useful for research that depends on how recently the data was updated.
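
A minimal sketch of computing a "last N days" range to hand to `list_datasets_by_date`. The `start_date` and `end_date` argument names and the `YYYY-MM-DD` format are assumptions; check the tool's schema in your client for the exact names and date format it expects.

```python
from datetime import date, timedelta


def modified_range(end, days):
    """Return a date range covering the `days` days up to and including `end`."""
    if days < 0:
        raise ValueError("days must be non-negative")
    start = end - timedelta(days=days)
    # Hypothetical argument names; dates rendered as ISO 8601 (YYYY-MM-DD).
    return {"start_date": start.isoformat(), "end_date": end.isoformat()}


args = modified_range(date(2024, 3, 31), 30)
print(args)
```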

**What's the best way to get detailed metadata if I already have a dataset’s URI, using `get_dataset`?**
Simply pass the full URI identifier into `get_dataset`. This tool bypasses general searching and retrieves all core information about that single dataset. You'll get details like its issuing body, update frequency, and available distributions.

**Before I filter geographically, how do I list all provinces using `list_provinces`?**
Run the `list_provinces` tool first. This gives you a definitive taxonomy of every province available in the catalog. After getting that list, you can use the specific IDs to filter your datasets by location.

**How can I search for datasets containing a specific word in the title?**
Use the `search_datasets_by_title` tool. Provide the string you are looking for in the `title` parameter, and the agent will return all matching datasets from the national catalog.

**Is it possible to filter datasets by a specific file format like CSV or JSON?**
Yes! Use the `list_datasets_by_format` tool and specify the format (e.g., 'csv', 'json', 'xlsx'). This will retrieve only the datasets that offer distributions in that format.

**Can I find datasets belonging to a specific category like 'Health' or 'Environment'?**
Absolutely. Use the `list_datasets_by_theme` tool with the corresponding theme ID. You can first use `list_themes` to see all available categories in the catalog.