Internet Archive Search MCP. Find any piece of digital history, filtered by decade, creator, or topic.
Works with every AI agent you already use
…and any MCP-compatible client
Just plug in your AI agents and start using Vinkius.
Internet Archive Search MCP Server. Search 40M+ items across the Internet Archive using advanced, filtered queries. Find anything in the collection, from old films and public domain books to specific NASA documentation.
Use specialized tools like `search_by_creator` or `search_by_date_range` to narrow results down instantly.
What your AI agents can do
Faceted search
Analyzes search results by category, showing how many items fall into different types, collections, or creators.
Search
Performs broad, universal searches across all 40M+ items, supporting AND, OR, NOT logic and wildcards.
Search by collection
Finds all items within a specified themed collection in the Internet Archive.
Use search_by_creator to gather all works by a person or organization, regardless of topic or date.
Use search_by_date_range to find content only from specific decades or year ranges.
Use faceted_search to see how search results are composed across categories like media type, collection, or creator.
Use search_by_subject to find items related to curated keywords like 'world war 2' or 'jazz music'.
Use search_by_mediatype to search only for specific formats, like 'text' or 'movie'.
Use search_by_publisher to find all content published by a specific company.
Internet Archive Search: 12 Tools for Deep Discovery
Use these specialized tools to filter, analyze, and find specific data points across the massive Internet Archive collection.
`faceted_search`
Analyzes search results by category, showing how many items fall into different types, collections, or creators.
`search`
Performs broad, universal searches across all 40M+ items, supporting AND, OR, NOT logic and wildcards.
`search_by_collection`
Finds all items within a specified themed collection in the Internet Archive.
`search_by_creator`
Searches for all content associated with a specific person or organization.
`search_by_date_range`
Filters results by defining a start and end year range for historical content.
`search_by_language`
Limits the search results to content written in a specific language.
`search_by_mediatype`
Narrows the search to a specific format, like audio, text, or software.
`search_by_publisher`
Finds all content associated with a specific publishing house.
`search_by_subject`
Locates items based on curated topics or keywords assigned to the content.
`search_fulltext`
Runs a search across item descriptions and metadata to find specific terms.
`search_recent`
Retrieves the most recently uploaded items to the Internet Archive.
`search_top_downloads`
Gets the most popular and most downloaded items from the Internet Archive.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on every call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with Internet Archive Search, then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 4,700+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
What you can do with this MCP connector
This MCP Server lets your AI client search over 40 million Internet Archive items, from old books and films to NASA docs. You don't just type in a keyword; you get a dedicated tool for each filter. `search` lets your agent run universal searches across all items, handling complex logic like AND, OR, and NOT, plus wildcards. `search_fulltext` runs a search across item descriptions and metadata to nail down specific terms. `search_by_creator` gathers all content linked to a specific person or organization, no matter the topic or date.
You can use `search_by_date_range` to limit results to specific decades or years by picking a start and end year. Need a particular format? `search_by_mediatype` narrows the search to formats like 'text' or 'movie'. Find content from a certain company with `search_by_publisher`. Target items related to curated topics using `search_by_subject`. Want only content in a specific language? `search_by_language` handles that. `search_by_collection` finds all material within a designated themed collection.
To keep it fresh, `search_recent` pulls the most recently uploaded items, and `search_top_downloads` pulls the most downloaded ones. `faceted_search` analyzes search results, showing you how many items fall into different types, collections, or creators.
With these tools, your agent can find anything from old public domain literature to specific academic records.
How Internet Archive Search MCP Works
- 1 Tell your agent what you need. For example: 'Find science fiction films from the 1950s by George Orwell.'
- 2 The agent calls the necessary tools (`search_by_subject`, `search_by_date_range`, and `search_by_creator`) and passes the parameters.
- 3 The server returns a list of results, identifying the items and providing metadata about their format, source, and date.
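To make step 2 concrete, here is a minimal sketch of what the three tool calls could look like on the wire. The `tools/call` method and the `name`/`arguments` envelope come from the Model Context Protocol; the argument names `subject` and `creator` are assumptions for illustration, while `query`, `start_year`, and `end_year` follow the FAQ on this page. The `tool_call` helper is hypothetical, not part of any SDK.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids: 1, 2, 3, ...


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names "subject" and "creator" are assumed, not confirmed by the page.
calls = [
    tool_call("search_by_subject", {"subject": "science fiction"}),
    tool_call(
        "search_by_date_range",
        {"query": "science fiction", "start_year": "1950", "end_year": "1959"},
    ),
    tool_call("search_by_creator", {"creator": "George Orwell"}),
]

for request in calls:
    print(json.dumps(request))
```

In practice your MCP client builds and sends these requests for you; the sketch only shows the shape of the data the agent passes along.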
The bottom line is your agent runs multiple, highly filtered searches across the massive archive, returning only the data that matches all your criteria.
Who Is Internet Archive Search MCP For?
The academic researcher who needs to cross-reference content across decades and disciplines; the digital archivist building a knowledge graph; the historical analyst who needs to track content by publisher or creator. These users are tired of manual database filtering and need deep, structured discovery.
Runs complex queries to find niche, historical data, combining criteria like 'WWII' (subject), 'audio' (media type), and '1940s' (date range).
Systematically inventories vast collections, using tools like faceted_search and search_by_collection to map out content gaps or trends.
Tracks the output of specific entities, using search_by_creator or search_by_publisher to build a timeline of influence.
What Changes When You Connect
- Pinpoint exact content using `search_by_subject` and `search_by_date_range`. Instead of getting millions of general results, you get a focused list of, say, 'civil rights' films from the 1960s.
- Understand the scope of your search results with `faceted_search`. This tool doesn't just return data; it shows you how many results exist for every category (like media type or collection) in the result set.
- Track specific entities easily. Use `search_by_creator` to pull every work by a person or organization, skipping the need to search by name repeatedly across different topics.
- Discover what's trending. Running `search_top_downloads` quickly shows you the most popular content; filter it further with `search_by_mediatype` if you only want to see, say, popular audio files.
- Handle massive data sets. `search` supports complex logic (AND, OR, NOT) and wildcards, letting your agent build highly specific queries that simple keyword searches can't manage.
- Stay current on the archive. `search_recent` lets you check the most recent uploads, useful for monitoring developing topics or niche data streams.
Real-World Use Cases
Researching early cinematic trends
A film student needs to see what kind of films were popular in the 1930s. They ask their agent to use search_by_date_range (start 1930, end 1939) combined with search_by_mediatype ('movie'). The agent runs the query and returns a list of film shorts, giving the student a clear starting point for analysis.
Tracking a historical company's output
A market researcher wants to see all content related to 'General Motors' across all media types. They ask the agent to use search_by_creator with 'General Motors' and then run faceted_search to see if the results are more often books or images. This helps them map the company's historical digital footprint.
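If you want a feel for what a facet breakdown is, the tally below reproduces the idea locally. It is only an illustration of the counting that `faceted_search` reports, not the server's implementation. The records are made-up placeholders, and the assumption that each result carries `mediatype` and `collection` fields is ours.

```python
from collections import Counter


def facet_counts(records, field):
    """Count how many records fall under each value of `field`.

    A field may hold a single value or a list of values, so both are handled.
    """
    counts = Counter()
    for record in records:
        value = record.get(field)
        if value is None:
            continue
        values = value if isinstance(value, list) else [value]
        counts.update(values)
    return counts


# Placeholder records for illustration only; not real archive items.
sample = [
    {"identifier": "example-item-1", "mediatype": "texts", "collection": ["example-a"]},
    {"identifier": "example-item-2", "mediatype": "image", "collection": ["example-a", "example-b"]},
    {"identifier": "example-item-3", "mediatype": "texts", "collection": ["example-b"]},
]

print(facet_counts(sample, "mediatype"))
print(facet_counts(sample, "collection"))
```

The server-side version matters because it counts across the whole result set, not just the page of results your agent happens to have in hand.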
Finding content in a niche language
You need a specific historical document written in Portuguese. You ask the agent to use search_by_language ('Portuguese') and then narrow it down by search_by_subject ('literature'). The agent executes both filters, returning only the relevant literary works in that language.
Building a content timeline
A journalist needs to write about the evolution of space exploration. They ask the agent to use search_by_subject ('space exploration') and then use search_by_date_range (1950-2000). Finally, they use search_by_mediatype (images) to gather a visual timeline of the topic.
The Tradeoffs
Over-relying on simple keyword search
Just searching 'science fiction' in a general chat prompt. This gets thousands of results, mixing films, books, and random articles, forcing you to manually sort by decade or format.
→
Use search_by_subject('science fiction') first, then combine it with search_by_date_range('1960', '1969') and search_by_mediatype('text'). This immediately narrows the scope to the precise content you need.
Searching by title only
Asking for 'Gutenberg' content but only searching titles. You miss the related articles, images, or metadata that describe the works.
→
Use search_by_collection('gutenberg') for the core works, but follow up with search_by_creator('Project Gutenberg') to find all associated metadata and supplemental material.
Forgetting to check popularity
Wasting time finding obscure content when you really need to see what the public considers important or popular.
→
Start with search_top_downloads. You can then narrow those popular results using search_by_mediatype to focus on texts or films.
When It Fits, When It Doesn't
Use this MCP Server if your goal is deep, highly filtered data discovery. You need to combine multiple axes of data (e.g., Topic AND Creator AND Decade). The specialized tools are designed for this complexity. Don't use it if you are only trying to find a simple, single fact (e.g., 'What is the capital of France?'). For simple facts, a standard search engine is faster. If you need to understand the composition of a result set, for example seeing how many results are films vs. books, you must use faceted_search. If you only care about what's new, use search_recent. If the content is highly specialized, you'll need to chain tools like search_by_subject → search_by_date_range → search_by_mediatype.
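One way to picture a chain like that is as an intersection: keep only the items that show up in every tool's results. The sketch below assumes each result is a record with an `identifier` field, which is our assumption about the output shape. The item names are placeholders, and the agent may combine results differently in practice.

```python
def intersect_by_identifier(*result_sets):
    """Keep records from the first result set whose identifier appears in all sets."""
    if not result_sets:
        return []
    common = {record["identifier"] for record in result_sets[0]}
    for results in result_sets[1:]:
        common &= {record["identifier"] for record in results}
    # Preserve the ordering of the first result set.
    return [record for record in result_sets[0] if record["identifier"] in common]


# Placeholder results standing in for three separate tool calls.
by_subject = [{"identifier": "item-a"}, {"identifier": "item-b"}, {"identifier": "item-c"}]
by_date_range = [{"identifier": "item-b"}, {"identifier": "item-c"}, {"identifier": "item-d"}]
by_mediatype = [{"identifier": "item-c"}, {"identifier": "item-b"}]

matches = intersect_by_identifier(by_subject, by_date_range, by_mediatype)
print([record["identifier"] for record in matches])
```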
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Internet Archive Search. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on every call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 12 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Available Capabilities
Finding specific historical media shouldn't require jumping between five different database tabs.
Right now, if you're researching a topic, you open the main archive site. You search by keyword. Then you have to manually filter by date, then switch to a different tab to filter by format, and maybe open a third tab just to see the list of creators. You're copy-pasting parameters and clicking through dozens of dropdown menus until you find the right combination.
With this MCP Server, your agent handles that entire multi-step process from a single prompt. You tell it: 'I need audio recordings about civil rights from the 1970s.' The agent runs `search_by_subject`, `search_by_date_range`, and `search_by_mediatype` in sequence. You get the clean list, no clicking required.
Internet Archive Search MCP Server: Deep Discovery with 12 Tools
You don't have to settle for broad searches. If you need to track everything by a specific person, you run `search_by_creator`. If you need to analyze the *entire* scope of the result set, you run `faceted_search`. The toolset lets you methodically build your query from multiple angles—publisher, language, subject, and more.
This server gives you the raw, filtered power of a professional archivist's workstation, right inside your agent's chat window. It's about getting deep, specific data without the UI friction.
Common Questions About Internet Archive Search MCP
How do I use the `search` tool for a broad search?
Use the search tool when you are unsure of the exact filters. It supports complex query logic (AND, OR, NOT) and wildcards, letting you cast a wide net across all 40M+ items.
Can I find all works by a specific author using `search_by_creator`?
Yes. search_by_creator pulls every item linked to that person or organization, making it ideal for building a complete bibliography or content timeline.
Is `search_by_subject` the same as `search_fulltext`?
No. search_by_subject finds content based on curated, assigned topics (like 'jazz music'). search_fulltext searches for specific keywords embedded anywhere in the item's description or metadata.
How do I filter by format type using `search_by_mediatype`?
Simply pass the desired format (e.g., 'audio', 'text', 'movie') to search_by_mediatype. This immediately removes all other formats from the results set.
What is the best way to find popular content?
Use search_top_downloads. You can refine this search further by adding a media type filter to narrow down the results to only, say, popular texts.
How do I use `search_by_date_range` to find content from a specific decade?
You combine search_by_date_range with the desired query and years. For example, use query="world war 2", start_year="1939", and end_year="1945" to narrow results to that conflict's timeframe.
Can I use `faceted_search` to analyze the composition of my search results?
Yes, faceted_search analyzes results by category. You pass JSON faceting syntax (e.g., mediatype:{type:terms,field:mediatype}) to see how the results break down by media type, collection, or creator.
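As a rough sketch, here is how a facet specification in that shape could be assembled as strict JSON. The `type` and `field` keys mirror the example in the answer above; whether the tool expects quoted JSON exactly like this, and what its argument is called, are assumptions on our part. The `terms_facet` helper is hypothetical.

```python
import json


def terms_facet(field):
    """Describe a 'terms' facet, which counts results per distinct value of `field`."""
    return {"type": "terms", "field": field}


# One facet per category the page mentions: media type, collection, creator.
facets = {name: terms_facet(name) for name in ("mediatype", "collection", "creator")}

# Compact serialization, close to the inline form shown in the FAQ.
facet_json = json.dumps(facets, separators=(",", ":"))
print(facet_json)
```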
What is the difference between `search` and `search_fulltext`?
The search tool handles broad discovery with universal query syntax (AND, OR, NOT). Use search_fulltext when you need to find items based on specific keywords appearing within the item's description and metadata.
What search syntax does the Internet Archive support?
The IA search uses Solr-like syntax: AND, OR, NOT for boolean logic, wildcards (*), phrase matching ("..."), and field-specific searches like creator:"Name", subject:"Topic", collection:"name". Combine multiple criteria for precise results.
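Here is a small sketch of how those pieces combine into one query string. The helper names (`field_clause`, `all_of`, `any_of`, `exclude`) are hypothetical conveniences, not part of this server or of the Internet Archive; only the resulting syntax comes from the answer above.

```python
def field_clause(field, value):
    """Build a field-specific phrase match such as creator:"Name"."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def all_of(*clauses):
    """Require every clause (boolean AND)."""
    return " AND ".join(clauses)


def any_of(*clauses):
    """Accept any clause (boolean OR), parenthesized so it groups correctly."""
    return "(" + " OR ".join(clauses) + ")"


def exclude(clause):
    """Reject items matching the clause (boolean NOT)."""
    return f"NOT {clause}"


query = all_of(
    field_clause("creator", "Name"),
    any_of(field_clause("subject", "Topic"), "jazz*"),  # phrase match or wildcard
    exclude(field_clause("collection", "name")),
)
print(query)
# creator:"Name" AND (subject:"Topic" OR jazz*) AND NOT collection:"name"
```

A string like this is what you would pass to the general `search` tool when no single specialized tool covers your criteria.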
What collections are available?
Major collections include: prelinger (ephemeral films), gutenberg (free ebooks), nasa (space images/videos), tv (TV news archive), fedflix (government films), netlabels (independent music), softwarelibrary (classic games/apps), and thousands more community collections.
Can I search by date range?
Yes! Use search_by_date_range with a query, start_year, and end_year. Example: query="science fiction", start_year="1950", end_year="1959" finds all sci-fi from the 1950s.
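If you are scripting calls yourself, a tiny helper can turn a decade into the arguments described above. This is a sketch only: the parameter names `query`, `start_year`, and `end_year` and the string-typed years follow the example in this answer, while the helper functions themselves are hypothetical.

```python
def decade_range(decade_start):
    """Return (first_year, last_year) for a decade, e.g. 1950 -> (1950, 1959)."""
    if decade_start % 10 != 0:
        raise ValueError("decade_start must be a multiple of 10, e.g. 1950")
    return decade_start, decade_start + 9


def date_range_arguments(query, start_year, end_year):
    """Build the arguments for search_by_date_range, rejecting inverted ranges."""
    if start_year > end_year:
        raise ValueError("start_year must not be after end_year")
    # Years are sent as strings, matching the example in the FAQ.
    return {"query": query, "start_year": str(start_year), "end_year": str(end_year)}


arguments = date_range_arguments("science fiction", *decade_range(1950))
print(arguments)
```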
Use it with your favorite AI tools
Connect this server to Cursor, Claude, VS Code, and more.
More in this category
Botsonic
Train custom AI chatbots on your own data to deliver instant, accurate support that learns from your knowledge base.
GitScrum Knowledge
Build and query knowledge bases via GitScrum — manage notes as agent memory, maintain wiki pages, communicate through discussions, and search across all resources from any AI agent.
Trefle
Access the world's largest botanical database — search for plants, species, and genera, and explore distribution data directly from your AI agent.
You might also like
Happierleads
Connect Happierleads to any AI agent via MCP.
Faker
Generate high-quality mock data for development and testing — including addresses, persons, companies, and products in multiple locales.
UptimeRobot
Monitor and manage your website uptime seamlessly. List, create, and resolve monitor alerts directly from your AI agent, 24/7.