# The Guardian MCP

> The Guardian MCP Server gives your AI agent structured access to The Guardian's entire content archive. It lets you run full-text queries across decades of reporting, filter by section or tag, and retrieve the full metadata for any article—all via natural conversation. Stop browsing; start querying.

## Overview
- **Category:** knowledge-management
- **Price:** Free
- **Tags:** journalism, content-archive, news-api, article-search, media-intelligence, data-retrieval

## Description

The Guardian MCP Server gives your AI agent deep access to decades of reporting across The Guardian's content archive. You're not just browsing; you're running full-text queries against a large, structured database of journalism. Your agent can pull the body text and byline for any specific article, and it can map out the contributors and topic areas the archive covers.

You can start by figuring out the scope. If you want to know what content exists, you'll first use `list_sections` to grab a list of all major editorial categories—things like 'Technology,' 'Sport,' or 'Politics.' You'd then run `list_tags` to pull every single keyword or contributor tag they’ve used over the years. To map out the entire scope, you can also call `list_editions`, which lists regional versions of The Guardian, so you know if you need US data, UK data, or Australian feeds.

When you're ready to search, you get powerful filtering options. If you only care about one area, calling `search_by_section` lets your agent browse content strictly filtered by a chosen category, with pagination support for long result sets. You can narrow the focus even more using `search_by_tag`, which filters results based on specific keywords or named contributors.
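As a rough sketch of what walking paginated results could look like on the client side, the generator below keeps requesting pages until the reported page count is reached. The `fetch_page` callable, the `results` and `pages` keys, and the `max_pages` cap are assumptions for illustration; the actual shape of this server's responses may differ.

```python
from typing import Callable, Iterator


def iter_results(fetch_page: Callable[[int], dict], max_pages: int = 10) -> Iterator[dict]:
    """Yield results page by page until the last page or the safety cap."""
    page = 1
    while page <= max_pages:
        data = fetch_page(page)
        yield from data.get("results", [])
        # Assumed: the response reports the total number of pages.
        if page >= data.get("pages", 1):
            break
        page += 1


# Hypothetical stand-in for a `search_by_section` call; returns canned pages.
def fake_fetch_page(page: int) -> dict:
    pages = {1: ["a", "b"], 2: ["c"]}
    return {"pages": 2, "results": [{"id": i} for i in pages[page]]}


ids = [item["id"] for item in iter_results(fake_fetch_page)]
print(ids)
```

The `max_pages` cap is a deliberate safety valve: it keeps a broad query from silently consuming a day's request quota.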

For ultimate control, you've got `search_content`. This tool combines every available criterion in one call: section name, keyword tag, start and end date range, ordering preference, and pagination. You don't have to run three separate searches; this one query does it all.
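To make the combined query concrete, here is a minimal sketch of an MCP `tools/call` request for `search_content`. The JSON-RPC envelope follows the MCP convention, but the argument names (`section`, `tag`, `from_date`, `to_date`, `order_by`, `page`) and the tag value are hypothetical; check the tool's published input schema for the real ones.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Wrap a tool name and its arguments in an MCP `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# All argument names and values below are illustrative assumptions.
request = build_tool_call(
    "search_content",
    {
        "section": "technology",
        "tag": "technology/artificialintelligenceai",
        "from_date": "2024-01-01",
        "to_date": "2024-03-31",
        "order_by": "newest",
        "page": 1,
    },
)
print(json.dumps(request, indent=2))
```

In practice your AI client builds this request for you; the sketch only shows how several filters travel together in a single call.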

If your search is time-bound, you can use `search_by_date_range` to pull articles published between specific dates you provide. Or, if you know both the section *and* the date range, pass them together to `search_content` for more precise results.

Once a query returns a list of promising article IDs, getting the full details is easy. Use `get_item` to fetch the complete body text, the byline, and the metadata for a single article ID. If you just want a snapshot of what's new right now, `get_latest_content` returns a list ordered from most recent to oldest.
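The search-then-fetch pattern described here might look roughly like the sketch below. The `call_tool` callable stands in for whatever your MCP client exposes, and the `results` and `id` keys, the `id` argument of `get_item`, and the article IDs are all assumptions made for illustration.

```python
from typing import Callable


def fetch_full_articles(
    call_tool: Callable[[str, dict], dict],
    search_args: dict,
    limit: int = 3,
) -> list[dict]:
    """Run one search, then fetch the full record for the first `limit` hits."""
    hits = call_tool("search_content", search_args).get("results", [])
    return [call_tool("get_item", {"id": hit["id"]}) for hit in hits[:limit]]


# Hypothetical stand-in for a real MCP client; returns canned data.
def fake_call_tool(name: str, args: dict) -> dict:
    if name == "search_content":
        return {
            "results": [
                {"id": "technology/2024/jan/01/example-article-a"},
                {"id": "technology/2024/jan/02/example-article-b"},
            ]
        }
    return {"id": args["id"], "byline": "Example Author", "body": "Example body text."}


articles = fetch_full_articles(fake_call_tool, {"section": "technology"})
print([article["id"] for article in articles])
```

Capping the number of follow-up `get_item` calls with `limit` keeps one broad search from turning into dozens of extra requests.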

To find out about featured content within any given category, you can check `get_section_details`, which pulls the editorial highlights and most-viewed pieces for that specific section. It’s all structured so your AI agent gets clean data it can work with immediately.

## Tools

### get_item
Fetches the complete body, byline, and metadata for one specific Guardian article.

### get_latest_content
Retrieves a list of articles ordered from most recent to oldest.

### get_section_details
Gets editorial highlights and the most-viewed content for any given section.

### list_editions
Lists all available regional versions of The Guardian (e.g., US, UK, Australia).

### list_sections
Lists every major editorial section available on the platform, like 'Technology' or 'Sport'.

### list_tags
Provides a list of all content tags used by The Guardian, across tag types such as keyword, contributor, and tone.

### search_by_date_range
Searches the entire archive for articles published within user-specified start and end dates.

### search_by_section
Browses content filtered specifically to a chosen editorial section, supporting pagination.

### search_by_tag
Filters and searches the archive using specific keywords or contributor tags.

### search_content
Runs a powerful search across all criteria: section, tag, date range, ordering, and pagination.

## Prompt Examples

**Prompt:** 
```
Search The Guardian for recent articles about artificial intelligence in the technology section.
```

**Response:** 
```
I found 10 recent articles in the Technology section matching 'artificial intelligence'. The most recent is 'AI reshapes the newsroom: how machine learning is transforming editorial workflows' published today by Alex Hern. Would you like to read the full article?
```

**Prompt:** 
```
What are the main editorial sections available on The Guardian?
```

**Response:** 
```
The Guardian is organized into 48 editorial sections. The major ones include: World News, UK News, Politics, Business, Technology, Science, Sport, Culture, Opinion, and Environment. Would you like to browse the latest articles from any of these sections?
```

**Prompt:** 
```
Find all Guardian articles about climate change published between January and March 2026.
```

**Response:** 
```
I retrieved 47 articles mentioning 'climate change' published between 2026-01-01 and 2026-03-31. Key pieces include investigative reports on carbon markets, policy analysis from COP summits, and feature stories on climate migration. The most discussed article had 312 comments. Shall I fetch the full text of any specific article?
```

## Capabilities

### Deep Content Retrieval
The server pulls the full text and metadata for any specific article when you provide its identifier.

### Structured Search Queries
You can run advanced searches across the archive, filtering results simultaneously by section name, keyword tag, and precise date range.

### Content Discovery and Taxonomy Mapping
The system lists all available editorial sections and tags. This lets your agent map out the full scope of content coverage (e.g., listing every contributor or topic area).

### Regional Content Scope
You can query articles across multiple regional editions, including UK, US, Australia, and International feeds.

## Use Cases

### Tracking Policy Shifts
A policy expert needs to see how 'carbon market' coverage evolved between 2018 and 2024. They use `search_content` with the keyword tag and that date range, retrieving a chronological list of articles that shows how editorial focus shifted.
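One simple way to turn such a result list into a trend is to count articles per publication year. The sketch below assumes each record carries an ISO 8601 date under `webPublicationDate` (the field name used by the Guardian Content API; whether this server passes it through unchanged is an assumption), and the sample records are made up.

```python
from collections import Counter


def count_by_year(articles: list[dict], date_key: str = "webPublicationDate") -> dict[str, int]:
    """Count articles per publication year from ISO 8601 date strings."""
    counts = Counter(article[date_key][:4] for article in articles)
    return dict(sorted(counts.items()))


# Made-up sample records, only to show the shape of the input.
sample = [
    {"webPublicationDate": "2018-05-02T09:00:00Z"},
    {"webPublicationDate": "2021-11-01T12:30:00Z"},
    {"webPublicationDate": "2021-11-12T08:15:00Z"},
]
print(count_by_year(sample))  # {'2018': 1, '2021': 2}
```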

### Verifying Source Material
A journalist is writing about an old event and needs confirmation on what was reported. They use `list_sections` to find the relevant section (e.g., 'World') and then run `search_by_date_range` or a targeted `search_content` query to pull historical source material.

### Building Knowledge Graphs
A data science team wants to index all articles mentioning 'artificial intelligence' from the Technology section. They use `list_sections`, then run `search_by_section` and filter results with `search_by_tag` to build a complete, structured knowledge base.

### Competitive Media Analysis
A marketing firm wants to know what content is currently gaining attention. They use `get_section_details` to find the most-viewed or featured content across major sections and pull those titles for a competitive report.

## Benefits

- **Deep Trend Analysis:** Use `search_by_date_range` and `search_content` to track how the coverage of a topic changes over years, which is impossible by just browsing the homepage.
- **Contextual Depth:** Instead of getting surface links, use `get_item` to pull the full article text, metadata, and byline for immediate context within your workflow.
- **Pinpoint Precision:** Need to know if a story was about AI *and* featured a specific author? Look up the topic and contributor tags with `list_tags`, then pass them to `search_by_tag` to narrow results down immediately. This is more precise than a general keyword search.
- **Schema Mapping:** Start by running `list_sections` and `list_editions`. This gives your agent the full vocabulary of content, ensuring it doesn't miss relevant areas or regional perspectives.
- **Multi-Layer Querying:** The combination of tools—starting with `search_by_section`, then filtering that result using `search_by_tag`—allows for highly precise, multi-step data retrieval.

## How It Works

Your agent handles complex journalistic queries through dedicated tools instead of just reading a web page.

1. Subscribe to the server. Then, grab a free API key from The Guardian Open Platform.
2. Your AI client sends a query—for example, asking for all articles about 'AI' published in Technology between two dates.
3. The server executes the necessary search tools and returns structured JSON data containing article summaries, full texts, or metadata.
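For a sense of what step 3 might involve under the hood, here is a sketch of an equivalent request built directly against the Guardian Content API's search endpoint. The parameter names (`q`, `section`, `from-date`, `to-date`, `page`, `api-key`) come from the Content API; the helper `build_search_url` is hypothetical, and how the MCP server maps tool arguments onto these parameters is an assumption.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://content.guardianapis.com/search"


def build_search_url(
    api_key: str,
    query: str,
    section: Optional[str] = None,
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
    page: int = 1,
) -> str:
    """Build a Content API search URL, skipping filters that were not given."""
    params = {"q": query, "page": page, "api-key": api_key}
    if section:
        params["section"] = section
    if from_date:
        params["from-date"] = from_date  # dates use YYYY-MM-DD
    if to_date:
        params["to-date"] = to_date
    return f"{BASE_URL}?{urlencode(params)}"


url = build_search_url(
    api_key="YOUR_API_KEY",
    query="AI",
    section="technology",
    from_date="2024-01-01",
    to_date="2024-03-31",
)
print(url)
```

You would not normally build this URL yourself; the server does the equivalent work and hands your agent the structured JSON.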

## Frequently Asked Questions

**Can I retrieve the full text of a Guardian article, not just the headline?**
Yes. Use the `get_item` tool with the article's path ID. The response includes the full body text, byline, standfirst, publication date, section, tags, and thumbnail image when available.

**How far back does the Guardian Content API archive go?**
The Guardian Content API provides access to articles dating back to 1999. You can use `search_by_date_range` with specific start and end dates to query historical content from any period covered by the archive.

**Is a paid subscription required to use this integration?**
No. The Guardian Open Platform offers a free developer API key that supports up to 12 calls per second and 5,000 calls per day. This is sufficient for most research and automation workflows.

**Can I filter articles by topic, section, or contributor?**
Yes. The `search_content` tool accepts section, tag, and date filters. Use `list_sections` to discover available sections and `list_tags` to find keywords, contributors, and series to refine your queries.

**When I use the `get_item` tool, what metadata fields are returned with the full article text?**
You get the complete body text along with critical context: the byline, standfirst, publication date, section, and tags. This structured data lets your agent build knowledge graphs or index content without needing extra calls.

**What happens if I run too many searches using `search_content` in a short time?**
The system adheres to The Guardian Open Platform's rate limits. If you hit the quota, you will receive an HTTP 429 error code. Your agent must implement an exponential backoff retry mechanism to continue running.
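A minimal sketch of such a retry loop is shown below. How a 429 surfaces depends on your MCP client, so the `RateLimitError` exception is a hypothetical stand-in, and the retry count and delay values are illustrative defaults rather than recommendations from The Guardian.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised when the server reports HTTP 429."""


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, doubling the wait after each rate-limit error."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up after the final retry
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("unreachable")
```

Wrap any tool call that may hit the quota, for example `call_with_backoff(lambda: client.call_tool("search_content", args))`, where `client` and `args` are whatever your own MCP client provides.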

**How do I authenticate my client when using `search_by_section`?**
Authentication is handled via your registered API key, which you provide during the initial setup phase. Your AI client routes this credential through Vinkius so the tool can execute the query on your behalf.

**If I use `list_tags` and get no results for a specific keyword, does that mean there is no content?**
Not necessarily. An empty tag list only means that no tag matches that exact keyword; articles may still mention the term in their text. You should try broadening the query or using the general `search_content` tool instead.