Internet Archive MCP. Search 40M+ historical records and media types.

Works with every AI agent you already use

Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client.

Just plug in your AI agents and start using Vinkius.

Internet Archive MCP Server provides access to the world's largest digital library (40M+ items). Use this server to search across books, videos, audio, software, and historical web snapshots via the Wayback Machine.

You can retrieve item metadata, view download stats, and read community reviews, all from a single conversational interface. It's designed for deep research and content discovery.

What your AI agents can do

Get item files

Lists all downloadable file formats (PDF, MP4, MP3, etc.) and sizes for a given Internet Archive item ID.

Get item metadata

Retrieves comprehensive data on an item: title, creator, dates, subjects, and full file listing.

Get item reviews

Gets community reviews, including star ratings and text, for a specific Internet Archive item.

+ 7 more capabilities included

Search the entire library

Search across all media types, creators, and dates using complex query logic (AND, OR, NOT).

Check historical website versions

Determine if a specific URL has been archived by the Wayback Machine and find the closest snapshot date.

Get detailed item facts

Retrieve complete metadata, including subjects, file formats, and download links, for any specific item ID.

Analyze item popularity

Measure the total views and daily view counts for an archived item.

Browse curated content sets

Focus your search instantly on known collections like Project Gutenberg or Prelinger Archives.

Review community reception

Pull specific user reviews and star ratings for an archived item.

Free for Subscribers

search_by_mediatype

Limits the search results to a specific format, such as 'movies,' 'texts,' or 'audio'.

wayback_availability

Checks if a given URL has an archived snapshot using the Wayback Machine and returns the closest date.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Internet Archive, then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 4,700+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

What you can do with this MCP connector

You're hooking up your AI client to the Internet Archive, which gives you access to the world's largest digital library—over 40 million items. This isn't just a search tool; it's your deep dive into history. You can search through books, videos, audio, software, and even historical snapshots of websites via the Wayback Machine, all from one chat session.

Search the entire library: You can search across all media types, creators, and dates using complex query logic like AND, OR, and NOT.
Search by collection: You can narrow your focus instantly to curated groups like Project Gutenberg or Prelinger Archives.
Search by creator: You can find every piece of content linked to a specific author, director, or organization.
Search by media type: You can limit results to a specific format, like 'movies,' 'texts,' or 'audio.'
Check historical website versions: You can use wayback_availability to see if a URL has been archived by the Wayback Machine and grab the closest snapshot date.
Get detailed item facts: For any specific item ID, get_item_metadata retrieves complete data, giving you the title, creator, dates, subjects, and the full file listing. You can then use get_item_files to list every downloadable file format and its size (like PDF, MP4, MP3).
Review community reception: get_item_reviews pulls specific user reviews and star ratings for an item.
Analyze item popularity: get_views_stats measures the item's total views and daily view counts, plus a geographic breakdown.

How Internet Archive MCP Works

1. First, tell your agent what you're looking for (e.g., 'World War II films').
2. The agent runs the appropriate search tool, which returns a list of item IDs and basic data.
3. You then pass a specific item ID to a detail tool (like get_item_metadata) to pull the full file list, review scores, or view stats.

The bottom line is, you use a few initial search tools to find the item, and then specialized tools to pull specific data points about that item.
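The search-then-detail flow above can be sketched as a pair of MCP `tools/call` requests. This is an illustration under stated assumptions, not the server's documented schema: the JSON-RPC envelope follows the Model Context Protocol, but the argument name `identifier` and the shape of the search result are hypothetical, and `tool_call` is a helper written only for this example.

```python
import json
from typing import Any


def tool_call(request_id: int, name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build an MCP JSON-RPC 2.0 `tools/call` request for one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Step 2: the agent runs a search tool for the topic given in step 1.
search_request = tool_call(1, "search", {"query": "World War II films"})

# Hypothetical search result: a list of item IDs with basic data.
# The real response shape is not documented on this page.
search_hits = [{"identifier": "example_item_id", "title": "Example title"}]

# Step 3: pass one item ID to a detail tool.
# The argument name "identifier" is an assumption.
detail_request = tool_call(
    2, "get_item_metadata", {"identifier": search_hits[0]["identifier"]}
)

print(json.dumps(detail_request, indent=2))
```

In practice your MCP client builds these requests for you; the sketch only makes the order of calls explicit.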

Who Is Internet Archive MCP For?

Researchers, journalists, and content creators rely on this server. If your job requires accessing primary sources, verifying historical website claims, or sourcing public domain media, you need this. It cuts out the hours spent navigating different library databases and manual cross-referencing.

Academic Researcher

Uses search and get_item_metadata to locate rare academic papers or historical documents by specific creator or date range.

Investigative Journalist

Runs wayback_availability to check if a website or article existed on a specific date, and uses search to find related archived news reports.

Content Creator

Uses search_by_collection and get_item_files to find public domain films, music, or images for a new project.

What Changes When You Connect

Find content from any era: Instead of manually checking decade-specific databases, use search_by_date_range to filter content from specific years (e.g., '1950-1959').
Verify web history instantly: Use wayback_availability to see if a URL was active years ago, getting the precise snapshot date from the Wayback Machine.
Deep dive on single items: After finding an item, run get_item_metadata to get the full details, file formats, and download links in one call.
Filter by content type: Don't sift through mixed results. Use search_by_mediatype to get only 'movies,' 'texts,' or 'audio' results immediately.
Track content popularity: Measure how widely an item was seen by calling get_views_stats, giving you reach metrics that standard search results omit.
Explore curated sets: Use search_by_collection to jump straight into trusted archives, like the Prelinger Archives or NASA image sets.

Real-World Use Cases

Tracing a Website's Evolution

A journalist needs to verify a claim made on a defunct website. They run wayback_availability on the URL. The agent finds the closest snapshot date and provides the link. They then use search with the topic and date range to find other archived news articles from that same period for context.

Building a Film History Database

A student researches public domain films. They use search_by_collection for 'Prelinger Archives,' then search_by_date_range to narrow it to the 1930s. Finally, they call get_item_metadata to list all available formats (MP4, OGV) for download.

Finding Source Material for a Documentary

A content creator needs NASA imagery. They use search_by_collection for 'NASA', then search_by_mediatype with 'image' to keep only images. Before using any item, they run get_item_reviews to gauge community interest in the source material and its quality.

Researching a Specific Author's Output

A historian wants all works by a specific author. They run search_by_creator for the name. They then use search with a complex query (e.g., subject:"Cold War" AND creator:"AuthorName") to refine the results, and run get_item_files to check for available PDFs.
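A complex query like the one in this use case can be assembled programmatically. The sketch below assumes the `search` tool accepts Lucene-style `field:"value"` terms joined with AND, as the example suggests; `field_term` and `all_of` are hypothetical helpers, not part of the server.

```python
def field_term(field: str, value: str) -> str:
    """Return field:"value", quoting the value so multi-word phrases stay together."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def all_of(*terms: str) -> str:
    """Join terms with AND so every condition must match."""
    return " AND ".join(terms)


query = all_of(field_term("subject", "Cold War"), field_term("creator", "AuthorName"))
print(query)  # subject:"Cold War" AND creator:"AuthorName"
```

Quoting each value keeps a phrase such as Cold War from being split into two separate terms.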

The Tradeoffs

Searching Everything At Once

Trying to remember every single search parameter (creator, date, collection, media type) and throwing them all into a single, complex natural language prompt.

→ Don't try to do it all at once. Start with the broadest search using search with only the core topic. Then, narrow it down methodically, running targeted calls like search_by_mediatype or search_by_collection to refine the result set.

Assuming Data Completeness

Finding an item ID and immediately assuming you have all the details, forgetting to check for reviews or file types.

→ Always pair get_item_metadata with get_item_reviews and get_item_files. That combination gives you the full picture: what it is, what people thought of it, and what you can actually download.
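One way to make that pairing a habit is to generate all three detail calls together for the same item. This is a minimal sketch: the request envelope follows MCP's `tools/call` format, while the argument name `identifier` and the helper `detail_calls` are assumptions made for illustration.

```python
DETAIL_TOOLS = ("get_item_metadata", "get_item_reviews", "get_item_files")


def detail_calls(identifier: str, first_id: int = 1) -> list[dict]:
    """Build one MCP `tools/call` request per detail tool for the same item."""
    return [
        {
            "jsonrpc": "2.0",
            "id": first_id + offset,
            "method": "tools/call",
            # The argument name "identifier" is an assumption.
            "params": {"name": tool, "arguments": {"identifier": identifier}},
        }
        for offset, tool in enumerate(DETAIL_TOOLS)
    ]


calls = detail_calls("big_buck_bunny")
print([call["params"]["name"] for call in calls])
```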

Only Checking the Surface

Running a basic search and stopping there, missing historical context or download options.

→ After your initial search, always run get_views_stats to gauge an item's popularity. Then use wayback_availability if the content is historical, or get_item_metadata to make sure you have all the necessary details.

When It Fits, When It Doesn't

Use this server if your job requires deep, verifiable historical research or accessing public domain media archives. You need it when you must answer questions like: 'What did this website look like in 2005?' or 'What were all the films made by Director X in the 1940s?'.

Don't use this if you are just looking for general, current information (e.g., today's stock prices or a person's current phone number). For live data or real-time services, use a dedicated API for that domain. This server is for historical, archived, and academic content only.

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Internet Archive. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This server provides 10 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.

Available Capabilities

get_item_files
get_item_metadata
get_item_reviews
get_views_stats
search
search_by_collection
search_by_creator
search_by_date_range
search_by_mediatype
wayback_availability
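If you want to confirm that a connected server advertises these ten tools, MCP defines a `tools/list` request whose result carries a `tools` array. The sketch below checks such a response against the list above; the sample response is hypothetical and `missing_tools` is a helper written for this example.

```python
EXPECTED_TOOLS = {
    "get_item_files", "get_item_metadata", "get_item_reviews", "get_views_stats",
    "search", "search_by_collection", "search_by_creator",
    "search_by_date_range", "search_by_mediatype", "wayback_availability",
}

# MCP JSON-RPC request asking the server which tools it exposes.
LIST_TOOLS_REQUEST = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}


def missing_tools(response: dict) -> set[str]:
    """Return expected tool names absent from a `tools/list` response."""
    advertised = {tool["name"] for tool in response["result"]["tools"]}
    return EXPECTED_TOOLS - advertised


# Hypothetical response that advertises only two of the ten tools.
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"tools": [{"name": "search"}, {"name": "wayback_availability"}]},
}
print(sorted(missing_tools(sample_response)))
```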

Trying to manually track down old media and archival data is a huge waste of time.

Before this server, finding historical records meant bouncing between library databases, government archives, and web-specific tools. You'd run a basic search, get a list of IDs, then have to manually check each one for its file types, its history, and its creator details—a process that takes hours and requires copy-pasting dozens of IDs into multiple forms.

Now, you ask your agent to find the material. It runs `search` and `search_by_date_range` to narrow the scope. It then automatically calls `get_item_metadata` and `get_item_files` to give you a comprehensive, actionable list of every single format and download link, all without you touching a browser or copy-pasting a single ID.

Internet Archive MCP Server: Get the full context on any item.

Before, finding an item meant getting its title. After, you get the full story. You can run `get_item_metadata` for the basics, but you also need `get_item_reviews` to know if the community liked it, and `get_views_stats` to know how widely it was seen. The item's context—its popularity and reception—is just as important as the files themselves.

The server doesn't just give you data points; it gives you a full profile. You get the file formats, the historical context, and the user feedback, all organized and ready to use. It's the difference between having a pile of files and having a usable report.

Common Questions About Internet Archive MCP

Is any authentication required to use the Internet Archive API?

No! All search, metadata, and Wayback Machine features are completely free and public — no API key or account needed. You can search 40M+ items, get item details, and check archived URLs immediately. Authentication is only required if you want to upload content (which this MCP server doesn't support).
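To illustrate what "no API key" means in practice, here is a sketch of an unauthenticated request against the Internet Archive's public metadata endpoint. It is an assumption that the MCP server wraps this endpoint; the page does not say how the tools are implemented, and `metadata_request` is a hypothetical helper.

```python
import urllib.request
from urllib.parse import quote


def metadata_request(identifier: str) -> urllib.request.Request:
    """Build an unauthenticated GET request for an item's public metadata."""
    url = f"https://archive.org/metadata/{quote(identifier, safe='')}"
    # Only an Accept header is set: no API key, cookie, or Authorization header.
    return urllib.request.Request(
        url, headers={"Accept": "application/json"}, method="GET"
    )


request = metadata_request("big_buck_bunny")
print(request.full_url)
# To actually fetch the record (requires network access):
#   import json
#   with urllib.request.urlopen(request, timeout=30) as response:
#       metadata = json.load(response)
```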

How do I find and download files from an archived item?

First, use search to find items matching your query and note the identifier (e.g., "big_buck_bunny"). Then use get_item_files to see all available files with their formats (PDF, MP4, MP3, etc.). Files can be downloaded directly from: https://archive.org/download/{identifier}/{filename}. Many items offer multiple formats for the same content.
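The download URL pattern above can be filled in with a small helper. In this sketch the file listing is hypothetical (the real get_item_files output shape may differ), and `download_url` is a name chosen for illustration.

```python
from urllib.parse import quote


def download_url(identifier: str, filename: str) -> str:
    """Fill in https://archive.org/download/{identifier}/{filename}."""
    # Percent-encode spaces and other reserved characters; "/" is kept so
    # files inside subdirectories of an item still resolve.
    return f"https://archive.org/download/{quote(identifier)}/{quote(filename)}"


# Hypothetical file listing; the real get_item_files output shape may differ.
files = [
    {"name": "example film.mp4", "format": "MPEG4"},
    {"name": "example film.ogv", "format": "Ogg Video"},
]
links = [download_url("big_buck_bunny", f["name"]) for f in files]
print(links[0])
```

Encoding the filename matters because many archived files contain spaces or other characters that are not valid in a raw URL.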

How can I use the Wayback Machine to find archived website snapshots?

Use the wayback_availability tool with any full URL (e.g., "https://example.com"). It returns the closest archived snapshot with its timestamp. The archived page can be viewed at: https://web.archive.org/web/{timestamp}/{original_url}. Note: Not all URLs are archived — the Wayback Machine selectively crawls and saves web pages.
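The snapshot URL pattern above, and the case where a URL was never archived, can be handled as follows. The sample results are hypothetical and shaped like the public Wayback availability API's JSON; the wayback_availability tool's actual output may be structured differently.

```python
from typing import Optional


def snapshot_url(timestamp: str, original_url: str) -> str:
    """Fill in https://web.archive.org/web/{timestamp}/{original_url}."""
    return f"https://web.archive.org/web/{timestamp}/{original_url}"


def closest_snapshot(result: dict) -> Optional[str]:
    """Return a viewable snapshot URL, or None when the URL was never archived."""
    closest = result.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return snapshot_url(closest["timestamp"], result["url"])


# Hypothetical results, shaped like the public availability API's JSON.
archived = {
    "url": "https://example.com",
    "archived_snapshots": {
        "closest": {"available": True, "timestamp": "20050101000000", "status": "200"}
    },
}
not_archived = {"url": "https://example.com/never-saved", "archived_snapshots": {}}

print(closest_snapshot(archived))
print(closest_snapshot(not_archived))
```

Returning None for the unarchived case forces the caller to handle it explicitly instead of building a link that leads nowhere.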

What collections are available in the Internet Archive?

Major collections include: Prelinger Archives (ephemeral films), Project Gutenberg (free ebooks), NASA (space images and videos), TV News Archive, FedFlix (government films), Open Source Movies, Netlabels (independent music), Software Library (classic games and apps), American Libraries, Biodiversity Heritage Library, and thousands of community collections. Use search_by_collection to explore any collection.

How do I use the `search` tool to find content from a specific decade?

Use the search tool with a combination of keywords and the startYear and endYear parameters. For example, query="space exploration", startYear="1960", endYear="1969" will isolate content from that period.
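The decade example above can be generated from a single starting year. This sketch takes the parameter names `query`, `startYear`, and `endYear` from the answer; passing the years as strings mirrors that example, and `decade_search_arguments` is a hypothetical helper.

```python
def decade_search_arguments(query: str, decade_start: int) -> dict[str, str]:
    """Build search arguments covering one decade, e.g. 1960 through 1969."""
    if decade_start % 10 != 0:
        raise ValueError("decade_start must be the first year of a decade, e.g. 1960")
    return {
        "query": query,
        # Years are passed as strings, matching the FAQ example.
        "startYear": str(decade_start),
        "endYear": str(decade_start + 9),
    }


arguments = decade_search_arguments("space exploration", 1960)
print(arguments)
```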

If I know the creator, how do I use `search_by_creator` to find all their works?

Just provide the creator's name directly to search_by_creator. This function pulls all available items—films, books, or images—linked to that specific author or organization.

What kind of metadata can I get using `get_item_metadata`?

The get_item_metadata tool returns a full data dump, including the title, creator, date, description, subjects, collection names, license type, and download statistics.

How can I use `get_item_files` to find all download options for an item?

Pass the item's unique ID or URL to get_item_files. It lists every available format (like PDF, MP4, MP3) and the download link structure for that specific item.

Use it with your favorite AI tools

Connect this server to Cursor, Claude, VS Code, and more.

OpenAI Agents SDK (Python)
Google ADK (Python)
Pydantic AI (Python)
Vercel AI SDK (TypeScript)