# Internet Archive Search MCP

> Internet Archive Search lets your agent perform advanced research across the world's largest digital library. You can query over 40 million items—everything from books and films to music, software, and historical documents. Filter content by specific decades, media type, creator, or topic using complex queries (AND, OR, NOT). It's built for deep, focused archival discovery.

## Overview
- **Category:** knowledge-management
- **Price:** Free
- **Tags:** digital-library, search-engine, archival-data, information-retrieval, content-discovery, open-access

## Description

Think of this MCP as a research assistant with direct access to the Internet Archive's catalog. Instead of searching by keywords alone, you tell it what kind of content you need and from which period. Your agent can combine operators like AND, OR, and NOT to narrow results. Need articles about civil rights published in the 1960s? Or only pre-war films? This MCP handles those queries across texts, videos, and audio recordings alike.

When you connect Internet Archive Search through Vinkius, your agent can refine results by publisher or restrict a search to specific collections, such as NASA's records. The goal is precision discovery: you bypass the noise of general web searches and go straight to primary source material.

## Tools

### faceted_search
Analyzes the composition of a result set by breaking down categories like media type, collection, or creator.
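
To make "composition of a result set" concrete, here is a minimal client-side sketch of what a facet breakdown amounts to: counting how often each value of a field appears. It is not the tool's implementation. The record shape and the `tally_facet` helper are assumptions for illustration, and the sample records are made up.

```python
from collections import Counter


def tally_facet(items, field):
    """Count how often each value of `field` appears in a list of result dicts."""
    counts = Counter()
    for item in items:
        value = item.get(field)
        # A field may hold one value or a list of values; normalise to a list.
        values = value if isinstance(value, list) else [value]
        for v in values:
            if v is not None:
                counts[v] += 1
    return counts


# Made-up sample records, for illustration only.
sample = [
    {"identifier": "item-1", "mediatype": "texts", "collection": ["a", "b"]},
    {"identifier": "item-2", "mediatype": "movies", "collection": ["a"]},
    {"identifier": "item-3", "mediatype": "texts"},
]

print(tally_facet(sample, "mediatype").most_common())
```

The same helper works for `collection` or `creator`, which is roughly the kind of breakdown `faceted_search` is described as returning.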

### search_by_collection
Limits results to content housed within specific themed archives or community collections.

### search_by_creator
Finds all available works from a designated author, organization, or notable figure.

### search_by_date_range
Narrows down results to only include items published within a defined start and end year range.
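
As a rough illustration of what a start and end year translate to, the sketch below builds a range clause and attaches it to an existing query. It assumes the underlying query language accepts Lucene-style range clauses on a `date` field. How the tool builds its query internally is not documented here, and both helper names are hypothetical.

```python
def date_range_clause(start_year: int, end_year: int) -> str:
    """Return a Lucene-style range clause covering whole calendar years."""
    if start_year > end_year:
        raise ValueError("start_year must not be later than end_year")
    return f"date:[{start_year:04d}-01-01 TO {end_year:04d}-12-31]"


def with_date_range(query: str, start_year: int, end_year: int) -> str:
    """AND a date range onto an existing query string."""
    return f"({query}) AND {date_range_clause(start_year, end_year)}"


print(with_date_range('subject:"civil rights"', 1960, 1969))
```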

### search_fulltext
Performs a broad search across all 40 million items, supporting complex queries and wildcards.

### search_by_language
Retrieves content that is published specifically in a requested language, such as French or Spanish.

### search_by_mediatype
Filters the search to only show items of one specific format type, like audio or film.

### search_by_publisher
Identifies all content that originated from a particular publishing house.

### search_recent
Retrieves the most recently uploaded materials to see what new items have been added to the archive.

### search_by_subject
Searches for content using curated, general topics like 'science fiction' or 'civil rights'.

### search
Universal search across 40M+ items in the Internet Archive. Use this for broad discovery. Supports AND, OR, NOT, wildcards (*), and field searches.

Optional parameters: `fields` (e.g., "identifier,title,mediatype"), `rows` (1-100), `page` for pagination, and `sort` (e.g., "date desc").
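
The sketch below shows one way an agent-side helper might assemble arguments for a `search` call. The `fields`, `rows`, `page`, and `sort` names come from the description above. The `query` argument name, the 1-based `page`, and the `build_search_args` helper itself are assumptions, so check the tool's actual schema before relying on them.

```python
from typing import Any, Dict, List, Optional


def build_search_args(
    query: str,
    fields: Optional[List[str]] = None,
    rows: int = 50,
    page: int = 1,
    sort: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble arguments for a `search` tool call (argument names are assumed)."""
    if not 1 <= rows <= 100:
        raise ValueError("rows must be between 1 and 100")
    if page < 1:
        raise ValueError("page must be 1 or greater")
    args: Dict[str, Any] = {"query": query, "rows": rows, "page": page}
    if fields:
        # The tool description shows fields as one comma-separated string.
        args["fields"] = ",".join(fields)
    if sort:
        args["sort"] = sort
    return args


args = build_search_args(
    "apollo* AND mediatype:image AND NOT title:simulation",
    fields=["identifier", "title", "mediatype"],
    rows=25,
    sort="date desc",
)
print(args)
```

Validating `rows` before the call keeps an out-of-range value from reaching the tool, and requesting only the fields you need keeps responses small.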

### search_top_downloads
Finds the most popular or frequently downloaded content within specific formats like texts or movies.

## Prompt Examples

**Prompt:** 
```
Search for public domain films from the 1940s.
```

**Response:** 
```
Found 12,847 films from the 1940s in Prelinger Archives and community collections, including WWII propaganda, educational shorts, and home movies.
```

**Prompt:** 
```
Show me the most downloaded items.
```

**Response:** 
```
Top downloads include: Big Buck Bunny (movie), various Project Gutenberg ebooks, NASA Apollo mission photos, and classic software from the softwarelibrary collection.
```

**Prompt:** 
```
Search for NASA images.
```

**Response:** 
```
Found 185,000+ NASA items including Apollo mission photographs, Hubble Space Telescope images, satellite imagery, and Space Shuttle documentation.
```

## Capabilities

### Filter Search Results
Limit results by format, such as texts, movies, or software, to focus your research.

### Analyze Content Composition
Determine what types of content are present in a set of search results using JSON faceting syntax.

### Isolate Specific Creators
Find all works associated with an author, organization, or notable person.

### Target Historical Periods
Restrict searches to content created within a specific start and end year range.

### Perform Deep Text Searches
Run full-text queries across item descriptions and metadata for highly specific terms.

### Identify by Subject Matter
Search content using curated, assigned topics like 'world war 2' or 'jazz music'.

## Use Cases

### Tracking Early Cinema History
A film student needs to find silent films from the 1920s by German directors. They ask their agent, which calls `search_by_mediatype` (movies) and `search_by_date_range` (1920-1929) to narrow the results to that format and decade.

### Verifying Corporate History
A journalist needs to show that a company changed its name and branding. They combine `search_by_publisher` with `search_fulltext` to find mentions of the old name in the company's annual reports, year by year.

### Researching Global Food Trends
A food researcher wants to see how rice farming was discussed in different languages. They combine `search_by_subject` (agriculture) with `search_by_language` for Mandarin, Spanish, and English.

### Identifying Rare Software Manuals
A tech historian needs to find documentation for obsolete operating systems. They run a search using `search_by_mediatype` (software) and then filter the results by known manufacturers via `search_by_publisher`.

## Benefits

- Pinpoint content by era: use `search_by_date_range` to pull material from a specific decade and exclude later items.
- Analyze large collections: run `faceted_search` to see how results are distributed across formats or topics.
- Trace the work of individuals: use `search_by_creator` to find articles or books by an author without repeating manual name searches.
- Focus on one media type: use `search_by_mediatype` to pull only films and leave out unrelated documents, or vice versa.
- Stay current on research topics: use `search_recent` to see what has been added to the archive since your last query.

## How It Works

You get precise answers to specific historical or academic questions without manually clicking through dozens of search result pages.

1. You provide your agent with a complex research query and specify required filters, such as the date range, media type, or creator.
2. The MCP executes this multi-faceted search across the Internet Archive's 40 million+ items.
3. Your agent receives structured data that pinpoints relevant results based on all the applied criteria.
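
The three steps above can be sketched as a small function. This is a sketch under stated assumptions, not the MCP's actual client code: `call_tool` stands in for whatever MCP client your agent uses, and the `query` argument name, the `mediatype:` and `creator:` field clauses, and the `{"items": [...]}` response shape are all assumed for illustration.

```python
from typing import Any, Callable, Dict, List, Optional


def run_filtered_search(
    call_tool: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    keywords: str,
    mediatype: Optional[str] = None,
    creator: Optional[str] = None,
) -> List[str]:
    """Combine filters, call the tool, and return the matching identifiers."""
    # Step 1: turn the research question and its filters into one query.
    clauses = [f"({keywords})"]
    if mediatype:
        clauses.append(f"mediatype:{mediatype}")
    if creator:
        clauses.append(f'creator:"{creator}"')
    query = " AND ".join(clauses)

    # Step 2: hand the query to the MCP tool.
    response = call_tool("search", {"query": query, "rows": 10})

    # Step 3: read the structured results (assumed response shape).
    return [item["identifier"] for item in response.get("items", [])]


def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a real MCP client; returns one made-up result."""
    return {"items": [{"identifier": "example-item", "query": arguments["query"]}]}


print(run_filtered_search(fake_call_tool, "silent film", mediatype="movies"))
```

Passing the client in as a callable keeps the sketch runnable offline and makes the query-building step easy to test on its own.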

## Frequently Asked Questions

**How do I find content from specific years using the Internet Archive Search MCP?**
You use the `search_by_date_range` tool by providing a start year and an end year, alongside your main search query. This limits results to only that time period.

**Can I filter by media type using Internet Archive Search?**
Yes. You use `search_by_mediatype` to restrict your search to one format, like movies or audio recordings, making the result set much smaller and more targeted.

**What is the best way to find all work by a specific person?**
Use `search_by_creator`. This tool returns items associated with that author or organization name across the archive, regardless of date or format.

**Is Internet Archive Search good for finding rare software manuals?**
Yes. Use `search_by_mediatype` to restrict the format, then refine by title keywords with the general `search` tool to locate old digital artifacts.

**How do I find content about a topic without knowing the exact year?**
Start with `search_by_subject`. This uses curated topics like 'world war 2,' allowing you to gather all related materials across different time periods and formats.