# OpenDataSUS MCP

> OpenDataSUS connects your AI client directly to Brazil’s official public health data (SUS) portal. It lets you search, filter, and pull actual rows from massive datasets—like COVID-19 vaccination records or epidemiological stats—without having to download a single file. Use its tools to explore everything the Ministry of Health has published in natural language.

## Overview
- **Category:** data-analytics
- **Price:** Free
- **Tags:** public-health, brazil-health, dataset-discovery, health-statistics, open-data, ckan

## Description

OpenDataSUS connects your AI client to Brazil’s official public health data (SUS) portal. You get direct access to large datasets, such as COVID-19 vaccination records and epidemiological statistics, without downloading a single file. Your agent uses the tools below to explore what the Ministry of Health has published, right in your chat window.

### Getting Started: Discovering What’s Available

You need to know what data exists before you can use it, so start by figuring out the scope. Use `package_list` when you need a complete catalog; it returns every dataset name available across the OpenDataSUS portal. If that list is too big, narrow your focus first. You can check which organizations provided the info using `organization_list`, which lists official departments like the Ministério da Saúde. Or, if you're looking for a general subject (say, 'vaccinations' or 'COVID-19'), run `tag_list` to pull the available keywords and tags that help scope your topic.

If you know what you’re looking for, but not the exact name, use `package_search`. You feed it keywords or criteria, and it filters down the dataset list. Once you have a potential package, you run `package_show` to pull detailed metadata on that specific dataset—you'll see its stated purpose and what resources it contains. This is your high-level overview.
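The tool names above match actions in the CKAN Action API (the listing is tagged `ckan`), so the discovery calls can be sketched as plain request URLs. This is a minimal sketch, not the server's actual implementation: the base URL, the `action_url` helper, and the placeholder dataset name are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Assumption: the portal exposes a standard CKAN Action API at this path.
BASE_URL = "https://opendatasus.saude.gov.br/api/3/action"


def action_url(action: str, **params: object) -> str:
    """Build the GET URL for a CKAN-style action such as package_search."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/{action}" + (f"?{query}" if query else "")


# Discovery calls, from broadest to most specific.
catalog_url = action_url("package_list")
search_url = action_url("package_search", q="vacinacao covid", rows=5)
# "some-dataset-name" is a placeholder, not a real package id.
detail_url = action_url("package_show", id="some-dataset-name")

print(catalog_url)
print(search_url)
print(detail_url)
```

In normal use your agent issues these calls for you; the sketch only shows what each discovery step asks the portal for.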

### Deep Diving Into Data Structure

Knowing a dataset exists isn’t enough; you also need to know what columns it has. If `package_show` points to a resource file (like a CSV), use `resource_show`. This tool gets the technical metadata for that individual data file, detailing its column names and format before your agent queries it. It's how you confirm whether a field is a date, an integer, or a string.
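A schema check like this can be sketched in a few lines. The exact shape of the metadata this server returns is not specified here, so the sketch assumes a CKAN DataStore-style field list of `{"id": ..., "type": ...}` entries; the helper names and the sample fields are hypothetical.

```python
from typing import Iterable, Mapping


def column_types(fields: Iterable[Mapping[str, str]]) -> dict[str, str]:
    """Map column name -> declared type from a CKAN-style field list."""
    return {f["id"]: f.get("type", "unknown") for f in fields}


def missing_columns(
    fields: Iterable[Mapping[str, str]], required: Iterable[str]
) -> list[str]:
    """Return the required column names that the schema does not contain."""
    available = column_types(fields)
    return [name for name in required if name not in available]


# Made-up schema for illustration; real column names and types will differ.
sample_fields = [
    {"id": "municipio", "type": "text"},
    {"id": "data_notificacao", "type": "timestamp"},
    {"id": "casos_confirmados", "type": "int4"},
]

print(column_types(sample_fields))
print(missing_columns(sample_fields, ["municipio", "faixa_etaria"]))
```

Running the check before a query tells you up front that a column such as `faixa_etaria` is absent, rather than finding out from an empty result.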

### Querying the Raw Data

This is where the magic happens. You use `datastore_search` to query the actual content of a resource. Instead of just getting metadata, this tool pulls filtered table data directly into your conversation. You tell your agent exactly which columns and what rows you need, and it returns the raw data—the full result set—ready for analysis right in the chat. This capability lets you treat the dataset like an active database connection.
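To make "which columns and what rows" concrete, here is a minimal sketch of the arguments a `datastore_search` call might carry. It assumes the tool forwards CKAN's standard `datastore_search` parameters (`resource_id`, `fields`, `filters`, `q`, `limit`); the `datastore_query` helper and the filter value are hypothetical.

```python
import json
from typing import Any, Optional, Sequence


def datastore_query(
    resource_id: str,
    fields: Optional[Sequence[str]] = None,
    filters: Optional[dict[str, Any]] = None,
    q: Optional[str] = None,
    limit: int = 5,
) -> dict[str, Any]:
    """Assemble CKAN-style datastore_search parameters for a GET request."""
    params: dict[str, Any] = {"resource_id": resource_id, "limit": limit}
    if fields:
        # Columns to return, as a comma-separated list.
        params["fields"] = ",".join(fields)
    if filters:
        # Exact-match conditions per column, encoded as a JSON object.
        params["filters"] = json.dumps(filters, ensure_ascii=False)
    if q:
        # Full-text search across the resource.
        params["q"] = q
    return params


params = datastore_query(
    "d3848184-5077-4667-835d-591d67641bb9",
    fields=["municipio", "casos_confirmados"],
    filters={"municipio": "Exemplo"},  # made-up filter value
)
print(params)
```

Keeping `limit` small while you explore keeps the result set readable in the chat; raise it once the filters return what you expect.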

### The Workflow Summary

Your typical workflow runs through these steps:
1. **Scope:** You run `tag_list` or `organization_list` to narrow down general topics or providers.
2. **Search/Filter:** You use `package_search` to pinpoint the right dataset.
3. **Inspect:** You run `package_show` to understand the package’s scope, and then `resource_show` to confirm the technical schema (columns and data types) of the underlying file.
4. **Extract:** Finally, you execute `datastore_search`, telling the agent precisely what rows and columns to pull into your conversation. You never leave this platform; the raw data comes straight through.
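The steps above can be sketched as one small function that chains the tools. This is an illustration under stated assumptions, not how the server is built: `call_tool` stands in for whatever your MCP client uses to invoke a tool, the response keys (`results`, `resources`, `datastore_active`, `records`) follow CKAN conventions, and `fake_call_tool` returns made-up data so the sketch runs offline.

```python
from typing import Any, Callable

# A tool caller takes (tool_name, arguments) and returns the decoded result.
ToolCall = Callable[[str, dict[str, Any]], Any]


def first_rows(call_tool: ToolCall, keywords: str, limit: int = 5) -> list[dict[str, Any]]:
    """Search, inspect, then extract rows from the first queryable resource."""
    found = call_tool("package_search", {"q": keywords})
    for package in found.get("results", []):
        detail = call_tool("package_show", {"id": package["name"]})
        for resource in detail.get("resources", []):
            # Assumed CKAN flag: only DataStore-backed resources can be queried.
            if resource.get("datastore_active"):
                data = call_tool(
                    "datastore_search",
                    {"resource_id": resource["id"], "limit": limit},
                )
                return data.get("records", [])
    return []


def fake_call_tool(name: str, args: dict[str, Any]) -> Any:
    """Canned responses so the sketch runs offline; all values are made up."""
    canned = {
        "package_search": {"results": [{"name": "example-dataset"}]},
        "package_show": {"resources": [{"id": "res-1", "datastore_active": True}]},
        "datastore_search": {"records": [{"municipio": "Exemplo", "casos_confirmados": 3}]},
    }
    return canned[name]


rows = first_rows(fake_call_tool, "vacinacao covid")
print(rows)
```

Passing the caller in as an argument keeps the workflow logic separate from the transport, so the same function works with a real client or a test stub.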

## Tools

### datastore_search
Filters and retrieves specific rows of data from a given resource.

### group_list
Lists all high-level categories used to group datasets on the portal.

### organization_list
Provides a list of official departments or organizations that publish data.

### package_list
Lists every single dataset package available across the OpenDataSUS portal.

### package_search
Searches for specific datasets by name or description criteria.

### package_show
Retrieves detailed metadata about a chosen dataset, including its purpose and resources.

### resource_show
Gets the technical metadata for an individual data file (like CSV), detailing columns and format.

### tag_list
Lists all keywords or tags used across datasets, helping you scope a general topic.

## Prompt Examples

**Prompt:** 
```
Search for datasets related to 'vacinacao covid' in OpenDataSUS.
```

**Response:** 
```
I found several datasets. The most relevant is 'Campanha Nacional de Vacinação contra Covid-19'. Would you like to see the available resources (CSV/Excel) for this package?
```

**Prompt:** 
```
List all health organizations providing data on the portal.
```

**Response:** 
```
I've retrieved the list of organizations. It includes 'Ministério da Saúde', 'Secretaria de Vigilância em Saúde', and others. Which one would you like to explore?
```

**Prompt:** 
```
Query the first 5 rows of the resource with ID 'd3848184-5077-4667-835d-591d67641bb9'.
```

**Response:** 
```
Accessing the DataStore... Here are the first 5 records from that resource, showing columns like 'municipio', 'data_notificacao', and 'casos_confirmados'.
```

## Capabilities

### Discover all available datasets
Use `package_list` to get a complete catalog of every dataset name on the OpenDataSUS portal.

### Search for specific data packages
Filter and search for datasets using keywords or criteria with `package_search`.

### Identify data sources and providers
List all official organizations that provide health data, like the Ministério da Saúde, using `organization_list`.

### Retrieve full dataset schema
Get detailed metadata for a specific package or resource using `package_show` or `resource_show`, showing its structure and provenance.

### Filter and retrieve raw data rows
Use `datastore_search` to query the actual content of a resource, pulling filtered table data directly into your conversation.

## Use Cases

### Tracking Vaccine Coverage Changes
A public health analyst needs to see how vaccination rates changed in 2021. They start by running `tag_list` to look for a 'vacinação' tag and then use `package_search` to find the right package. Finally, they tell their agent: 'Query the first 5 rows of that resource using datastore_search,' getting immediate proof-of-concept data.

### Comparing Data Sources
A developer needs metrics from both the Ministry of Health and a specific state secretariat. They use `organization_list` to pull all available providers, ensuring they build their application on two distinct, verified data sources rather than just one.

### Verifying Data Columns
A researcher finds a dataset but isn't sure if it has the 'age group' column. Instead of guessing, they use `resource_show` on the resource ID to verify the schema first. This prevents them from running an empty query and saves hours of debugging.

### Comprehensive Topic Review
A student is writing a paper on general epidemiology. They run `package_list`, spot ten potential datasets, then use `group_list` to see which high-level categories the portal uses, such as 'Infectious Disease', and narrow the candidates before selecting the best one for deeper analysis.

## Benefits

- Stop sifting through dozens of departmental pages. Using `tag_list` lets you scope your research by a general topic (like 'COVID-19') first, then narrow it down with `package_search`. It’s pure discovery.
- Don't guess if the data is clean or formatted right. Run `resource_show` before querying to see the exact columns and structure of a dataset file, saving you time and headaches.
- Skip manual downloads entirely. Once your agent knows which resource you need, it uses `datastore_search` to pull filtered tables straight into your chat window for immediate analysis.
- Pinpoint exactly who published the data. Use `organization_list` to ensure the statistics you're citing came from an official source like the Ministério da Saúde.
- Validate your whole path with `package_show`. This tool gives you the full context and metadata for a dataset, confirming its scope and original purpose before you write a single query.

## How It Works

The bottom line is you don't navigate the portal; your AI client runs the necessary tools to find, validate, and pull the data for you in a sequence of steps.

1. Subscribe to OpenDataSUS and (optionally) enter an API Key. Then, use `tag_list` or `organization_list` to scope the general area of data you're interested in.
2. Use `package_search` to narrow down that scope by topic name, then run `package_show` on the best match to confirm the dataset's scope and resources, and `resource_show` to check the schema of the file you plan to query.
3. Finally, feed the required criteria (columns/rows) into `datastore_search`. The agent executes the query against the official API and returns the raw data payload.

## Frequently Asked Questions

**How do I find all possible datasets using package_list?**
You run `package_list`. This tool provides a comprehensive list of every dataset name available on the portal. It's your starting point for seeing what data exists.

**What is the difference between package_search and datastore_search?**
`package_search` finds the *dataset* (the container). `datastore_search` runs the query against the *actual rows of data* inside that dataset, pulling out the results you need.

**Can I check a resource's columns using resource_show?**
Yes. Use `resource_show` to get technical metadata for any specific file linked to a package. This tells you exactly what columns (like 'municipio') and data types are available.

**Does OpenDataSUS cover all Brazilian health data?**
No, it covers the public datasets published on the official OpenDataSUS portal from the Ministry of Health. It won't access private or non-published departmental records.

**What happens to my query limits when I use `datastore_search` without an API key?**
You are limited by the default rate caps set by OpenDataSUS. Using an API key bypasses these standard restrictions, letting you run larger, more complex data pulls reliably.

**If I need to know who provided a dataset, should I use `organization_list` first?**
Yes, `organization_list` provides a definitive list of all data providers. You can identify the source organization before running any searches or analyzing specific packages.

**What detailed information do I get when I run `package_show`?**
`package_show` delivers the full metadata package for a dataset. This includes provenance, licensing terms, and structural details—more than just a simple description.

**When using `datastore_search`, what are the key parameters I can filter by?**
You can filter on column values, such as a date or a specific geographic code, using the `filters` parameter, or run a full-text search with `q`. This makes data retrieval highly targeted and efficient.

**How can I search for specific rows inside a large CSV dataset?**
You can use the `datastore_search` tool. Provide the `resource_id` and use the `q` parameter for full-text search or the `filters` parameter to target specific columns.

**Can I find which organizations provide the most datasets?**
Yes! Use the `organization_list` tool to see all data providers registered in the OpenDataSUS portal.

**How do I get the download link for a specific data file?**
Use the `resource_show` tool with the Resource UUID. It will return the metadata including the URL where the file is hosted.
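As a rough illustration, pulling the link out of that metadata might look like the sketch below. It assumes the response carries the hosted file location under a `url` key, as CKAN resource metadata does; the `download_link` helper and the sample values are hypothetical.

```python
from typing import Any, Mapping


def download_link(resource: Mapping[str, Any]) -> str:
    """Return the hosted file URL from resource metadata, or raise if absent."""
    url = resource.get("url")
    if not url:
        raise ValueError(f"resource {resource.get('id', '?')} has no download URL")
    return str(url)


# Made-up metadata for illustration; the URL is a placeholder.
sample_resource = {
    "id": "d3848184-5077-4667-835d-591d67641bb9",
    "format": "CSV",
    "url": "https://example.org/dados.csv",
}
print(download_link(sample_resource))
```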