# Internet Archive MCP

> Internet Archive MCP connects your AI agent to the world's largest digital library, making more than 40 million items searchable from a single chat session. Search books, films, music, software, and historical web pages (via the Wayback Machine) using natural conversation instead of complicated search forms.

## Overview
- **Category:** brain-trust
- **Price:** Free
- **Tags:** digital-library, wayback-machine, archival-data, metadata-search, historical-records, open-access

## Description

This connector gives your agent access to an immense historical data vault. You don't need to learn complex query syntax or navigate endless menus; you just ask for what you want. Whether you're looking for academic papers from the 1920s, public domain films, or a snapshot of how a website looked ten years ago, your agent runs the search for you. For each item it can retrieve file formats, subject metadata, view statistics, and community reviews. Because Vinkius hosts this MCP, your client connects once to access this resource alongside thousands of other specialized tools.

It’s all about natural conversation. You tell the AI what you need—a dataset from a specific decade, or the original source code for old software—and it handles the deep search and data aggregation process for you.

## Tools

### search_by_collection
Search for items within specific curated categories like Project Gutenberg ebooks or Prelinger Archives.

### search_by_creator
Find all content created by a particular author, organization, or artist name.

### search_by_date_range
Filter results to find content from specific historical eras or decades using start and end years.
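
As a rough illustration, a start and end year can be expressed as a single range clause. The sketch below assumes the tool maps onto the Internet Archive's public search syntax, where `year` is a searchable field; the helper names are hypothetical, and the connector's actual parameters may differ.

```python
def year_range_clause(start_year: int, end_year: int) -> str:
    """Return a range clause covering start_year..end_year inclusive."""
    if start_year > end_year:
        raise ValueError("start_year must not be later than end_year")
    return f"year:[{start_year} TO {end_year}]"


def decade_clause(decade_start: int) -> str:
    """Return a range clause for a whole decade, e.g. 1940 covers 1940-1949."""
    if decade_start % 10 != 0:
        raise ValueError("decade_start must be a multiple of 10")
    return year_range_clause(decade_start, decade_start + 9)


print(decade_clause(1940))  # year:[1940 TO 1949]
```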

### search_by_mediatype
Limit your search to only one format, such as movies, audiobooks, images, or software.

### get_item_files
List all available download formats (PDF, MP4, etc.) and file sizes for a specific item ID.
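
A minimal sketch of how a file listing could be turned into download links. It assumes the listing resembles the Internet Archive's public metadata API, where each file entry has a `name`, an optional `format`, and a `size` given as a string; the connector's real output shape is not documented here, and the identifier and file names below are made up for illustration.

```python
from urllib.parse import quote


def download_url(identifier: str, filename: str) -> str:
    """Build a download link; slashes in the file name are kept as path separators."""
    return f"https://archive.org/download/{quote(identifier)}/{quote(filename)}"


def summarize_files(identifier: str, files: list[dict]) -> list[dict]:
    """Normalize each file entry and attach a download link."""
    rows = []
    for entry in files:
        size = entry.get("size")
        rows.append({
            "name": entry["name"],
            "format": entry.get("format", "unknown"),
            "size_bytes": int(size) if size is not None else None,
            "url": download_url(identifier, entry["name"]),
        })
    return rows


# Hypothetical listing, not real archive data.
sample_files = [
    {"name": "example film.mp4", "format": "MPEG4", "size": "1048576"},
    {"name": "example_meta.xml", "format": "Metadata"},
]
for row in summarize_files("example-item", sample_files):
    print(row["format"], row["size_bytes"], row["url"])
```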

### get_item_metadata
Get complete details about an item, including its title, subjects, publisher, license, and total view count.

### get_item_reviews
Retrieve community reviews and star ratings to gauge how useful or well-received a specific archived item is.
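
One way to condense reviews into a quick quality signal is to average the star ratings. This sketch assumes each review is a dictionary with an optional `stars` value (a number or numeric string); that shape is an assumption, not something this page specifies.

```python
from statistics import mean


def rating_summary(reviews: list[dict]) -> dict:
    """Count rated reviews and average their stars, skipping unrated entries."""
    stars = [
        int(review["stars"])
        for review in reviews
        if str(review.get("stars", "")).isdigit()
    ]
    if not stars:
        return {"count": 0, "average": None}
    return {"count": len(stars), "average": round(mean(stars), 2)}


# Hypothetical reviews, for illustration only.
sample_reviews = [
    {"stars": "5", "reviewtitle": "Great scan"},
    {"stars": "4", "reviewtitle": "Good audio"},
    {"reviewtitle": "No rating given"},
]
print(rating_summary(sample_reviews))  # {'count': 2, 'average': 4.5}
```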

### get_views_stats
Measure the popularity of an item by getting its total view count and, if available, geographical breakdown.

### search
Perform broad searches across all media types using query syntax such as the AND operator or field filters like title:"query" for maximum flexibility.
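
To make the syntax concrete, here is a sketch that assembles a query string and a matching URL for the Internet Archive's public advanced search endpoint. Whether the connector calls that endpoint is an assumption; the helper names are hypothetical, and no request is sent.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def quoted(value: str) -> str:
    """Wrap a phrase in double quotes, escaping backslashes and quotes inside it."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_query(text=None, title=None, mediatypes=()):
    """Join the supplied criteria with AND; several mediatypes become an OR group."""
    clauses = []
    if text:
        clauses.append(quoted(text))
    if title:
        clauses.append(f"title:{quoted(title)}")
    if mediatypes:
        group = " OR ".join(f"mediatype:{m}" for m in mediatypes)
        clauses.append(f"({group})" if len(mediatypes) > 1 else group)
    if not clauses:
        raise ValueError("at least one search criterion is required")
    return " AND ".join(clauses)


def build_search_url(query: str, rows: int = 10) -> str:
    """Encode the query plus a small set of return fields as a JSON search URL."""
    params = [("q", query), ("fl[]", "identifier"), ("fl[]", "title"),
              ("rows", rows), ("output", "json")]
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


query = build_query(text="climate change", mediatypes=["movies", "texts"])
print(query)  # "climate change" AND (mediatype:movies OR mediatype:texts)
print(build_search_url(query))
```

The parentheses around the mediatype group matter: without them, AND and OR would bind in a way that is easy to misread.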

### wayback_availability
Check if a given URL has been archived and find the closest available snapshot date via the Wayback Machine.

## Prompt Examples

**Prompt:** 
```
Search for public domain films from the 1940s.
```

**Response:** 
```
Found 12,847 films from the 1940s in the Prelinger Archives and community collections. Notable items include "It's a Wonderful Life" (1946), WWII propaganda and newsreel films, educational shorts, and home movies. Most are available for free download in MP4 and OGV formats. Would you like details on any specific title?
```

**Prompt:** 
```
Check if https://example.com has been archived.
```

**Response:** 
```
✅ URL is archived! Closest snapshot: 20240101120000 (January 1, 2024). View the archived page at: https://web.archive.org/web/20240101120000/https://example.com. The Wayback Machine has captured this page multiple times over the years.
```

**Prompt:** 
```
Show me all NASA images available.
```

**Response:** 
```
Found 185,000+ NASA items in the Internet Archive. The NASA collection includes: Apollo mission photographs, Hubble Space Telescope images, satellite imagery, astronaut training footage, Space Shuttle launches, ISS documentation, and planetary exploration photos. Items span from the 1960s to present day. Would you like to narrow by specific mission, decade, or image type?
```

## Capabilities

### Find content across massive collections
The agent searches curated collections such as Project Gutenberg ebooks or NASA images. You describe what you want, and it picks the matching collection for you.

### Verify historical website versions
It checks if a given URL has been archived, returning the closest available snapshot date and link through the Wayback Machine.

### Search by specific criteria
You can narrow searches down to items created by an author or organization, or filter results only for movies, audio, or texts.

### Gather item details and stats
The agent retrieves full metadata—including subjects, file formats, and download statistics—for any found item.

### Review community feedback
It pulls user ratings and review texts from the community to help you assess an item's quality or relevance.

## Use Cases

### Tracing website evolution for journalism
A journalist needs to show that a company's messaging changed drastically in 2015. They ask their agent to check the URL using wayback_availability, which returns the closest available snapshot and its link, and then compare the archived page against the current site.

### Building a film history database
A student wants all public domain films from the 1940s. They use search_by_collection with 'prelinger' and then filter by date using search_by_date_range to narrow down the decade.

### Identifying original source materials
A developer is looking for old computing software. They use search_by_mediatype set to 'software', then call get_item_metadata on a promising result for its details and get_item_files to check the available file formats and sizes.

### Academic literature review
A researcher needs all articles written about climate change in the 1980s. They use search_by_date_range combined with 'pubmed' (a collection) to pinpoint primary academic sources from that specific period.

## Benefits

- You get access to primary source materials. Instead of searching through limited academic databases, the search tool covers everything from ephemeral films in the Prelinger Archives to rare scientific datasets.
- Historical verification is quick. The wayback_availability tool checks any URL, reports whether it was archived, and gives the timestamp of the closest snapshot.
- Data gathering becomes efficient. Use get_item_metadata to pull all the necessary citation info—creator, date, license, subject—before you even plan your download.
- You can filter by format with search_by_mediatype. Need a playlist of vintage audiobooks? You limit results only to 'audio' and find them immediately.
- It saves research time. Instead of writing complex database queries yourself, you let the agent combine criteria such as creator AND date range through the search tool.

## How It Works

You get deep access to a global digital library without writing complex code or navigating multiple websites.

1. Subscribe to this MCP in Vinkius. No API key is needed; it's a public, free resource.
2. Start your request using any MCP-compatible client (like Cursor or Claude). Simply ask the agent for historical content by topic, creator, or date range.
3. The agent executes the search and returns structured data including titles, identifiers, file formats, and direct links to download resources.
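
Behind step 3, an MCP client sends each tool invocation as a JSON-RPC 2.0 message with the `tools/call` method. The sketch below builds such a message by hand purely to show its shape; in practice the client does this for you. The tool name comes from the list above, but the `url` argument name is an assumption, since this page does not document tool parameters.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "url" is a hypothetical argument name.
request = tool_call("wayback_availability", {"url": "https://example.com"})
print(json.dumps(request, indent=2))
```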

## Frequently Asked Questions

**How do I use Internet Archive MCP if I don't know the exact title of an item?**
You can start with a broad search using the main search tool. You just need to describe the topic, and the agent will help you refine it by date or media type.

**Does Internet Archive MCP handle modern websites?**
Yes, through the wayback_availability tool. If the Wayback Machine has captured the URL, the tool returns the closest snapshot; if the URL was never captured, it reports that no archived version exists.

**Can I search for films and books at the same time using Internet Archive MCP?**
Yes. You can use the main search tool to combine criteria, for example "climate change" AND (mediatype:movies OR mediatype:texts), which matches both films and books on the topic.

**What is the best way to check a file's availability?**
Use get_item_files. This tool gives you a precise list of all available formats (PDF, EPUB, etc.) and their corresponding download links for that specific item ID.

**Is Internet Archive MCP only for American content?**
No, it covers global content. You can use search_by_collection to browse international libraries or use the main search tool with country-specific keywords.