Firecrawl MCP. Extract structured content from any website.
Works with every AI agent you already use
…and any MCP-compatible client
Just plug in your AI agents and start using Vinkius.
Firecrawl. Turn any website into clean, LLM-ready Markdown with a single API call. This server lets your AI agent scrape single pages, crawl entire sites, map site structures, and search the live web—all into structured data for processing.
Stop dealing with messy HTML and start feeding clean content directly to your models.
What your AI agents can do
You run scrape_page on a URL, and it returns the clean Markdown text from that single web page.
You run search_web on a topic, and it returns scraped content from the top web search results.
You run map_site on a domain, and it returns a list of all discovered links without pulling any content.
You run crawl_site on a domain, and it returns a job ID to track the process of scraping multiple internal pages.
Firecrawl MCP Server: 4 Tools for Web Data Extraction
Use these tools to scrape single pages, crawl entire websites, map site structures, or search the live web, all returning clean Markdown data for your agent.
Crawl site
Crawl an entire website and extract content from multiple pages. Use this for large-scale indexing, returning a job ID for tracking.
Map site
Discover all URLs on a website without scraping content. Use this to understand a site's full structure before scraping anything.
Scrape page
Scrape a single web page and extract its content as clean Markdown. Use this for reliable article or documentation extraction.
Search web
Search the web and return scraped content from the top results. Use this for quick fact-checking or targeted research.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built-in DLP, auth, and compliance on every call
- Real-time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with Firecrawl, then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 4,700+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
What you can do with this MCP connector
This server lets your AI agent pull clean, LLM-ready Markdown from any website with one API call. You're done dealing with messy HTML. You can scrape single pages, crawl whole sites, map site structures, and search the live web—all into structured data your models can use.
Scraping Single Pages
When you run scrape_page on a URL, it hands you the clean Markdown text from that single web page. You'll use this for articles, documentation, or product pages.
Searching Live Results
If you run search_web on a topic, it gives you scraped content from the top web search results. This is great for quick fact-checking or targeted research.
Mapping Site Structures
Run map_site on a domain, and it returns a list of every discovered link without pulling any content. You need this to understand a site's full structure before you start pulling data.
Indexing Entire Websites
To index a whole website, you run crawl_site on a domain. This returns a job ID, which you use to track the process of scraping multiple internal pages.
How Firecrawl MCP Works
- 1 Subscribe to the server and provide your Firecrawl API key.
- 2 Your agent invokes a specific tool (e.g., `scrape_page`) with the target URL.
- 3 The server executes the request, handles rendering, and returns the clean, structured Markdown content.
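The steps above boil down to one MCP `tools/call` request per tool invocation. The sketch below only builds the JSON-RPC payload an MCP client would send. The `tools/call` method and the `name`/`arguments` shape come from the Model Context Protocol; the argument key `url` is an assumption about this server's tool schema, so check the schema the server advertises via `tools/list` before relying on it.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request body."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Assumed argument name: "url" (hypothetical, not confirmed by this page).
request = build_tool_call("scrape_page", {"url": "https://example.com/docs"})
print(json.dumps(request, indent=2))
```

In practice your MCP client builds and sends this for you; the payload is shown only to make step 2 concrete.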
Your agent gets a reliable way to read and structure data from any website, regardless of how complex the underlying web code is.
Who Is Firecrawl MCP For?
This server fits three kinds of users:
- Data engineers, who build ingestion pipelines that reliably convert messy web content into structured, clean data formats.
- Research teams, who automate web research by running search_web and crawl_site across multiple sources to build a comprehensive knowledge base.
- AI agent developers, who give agents the core ability to read, understand, and reference any external web page or document source.
What Changes When You Connect
- Scrape single pages reliably. Need the text from a single article? `scrape_page` handles JavaScript, anti-bot measures, and cookie banners automatically. You get clean Markdown, period.
- Index whole sites efficiently. Don't manually list URLs. Use `crawl_site` to recursively traverse an entire domain, perfect for indexing large documentation or product catalogs.
- Find the structure first. Before running a full crawl, use `map_site` to get a sitemap of every possible URL. This saves compute time and helps you scope the job.
- Gather info fast. Instead of clicking through Google results, use `search_web`. It runs a search and extracts the content from the top results in one step.
- Reduce data prep time. By converting all web content directly to clean Markdown, your agent skips the messy HTML parsing step, feeding models pure text.
- Build robust pipelines. The combination of `map_site` (structure) and `crawl_site` (content) lets you build reliable data pipelines for any site.
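One way to use the map-then-crawl pairing is to filter the link list from `map_site` down to the section you care about before starting a crawl. The sketch assumes the `map_site` result has already been parsed into a plain list of URL strings (the actual response shape is not documented on this page); `scope_urls` and its parameters are hypothetical helper names.

```python
from urllib.parse import urlparse


def scope_urls(urls: list[str], host: str, path_prefix: str = "/") -> list[str]:
    """Keep URLs on `host` whose path starts with `path_prefix`, de-duplicated in order."""
    seen = set()
    scoped = []
    for url in urls:
        parts = urlparse(url)
        if parts.hostname != host:
            continue  # skip other hosts, including subdomains
        if not parts.path.startswith(path_prefix):
            continue  # outside the section we want
        if url not in seen:
            seen.add(url)
            scoped.append(url)
    return scoped


# Example input standing in for a parsed map_site result.
mapped = [
    "https://example.com/docs/intro",
    "https://example.com/blog/launch",
    "https://shop.example.com/docs/faq",
    "https://example.com/docs/intro",
    "https://example.com/docs/api",
]
docs_only = scope_urls(mapped, "example.com", "/docs/")
print(docs_only)
```

Scoping first keeps the crawl to pages you actually need, which is where the compute savings mentioned above come from.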
Real-World Use Cases
Competitive analysis of a new product
A market researcher needs to compare three competitor websites. They first run map_site to understand the scope of all three domains. Then, they use crawl_site on each site to pull all core documentation. Finally, they run search_web to find the latest press releases, getting a full comparison dataset.
Building a company knowledge base
An internal documentation team needs to index their entire company wiki. They use crawl_site on the internal domain. This gathers all blog posts and API reference pages into one corpus, giving the agent a comprehensive, searchable knowledge base.
Fact-checking a niche claim
A student needs to verify a complex scientific claim. Instead of reading multiple Wikipedia pages, they run search_web for the claim. The agent returns content from the top results, allowing the student to immediately analyze the evidence.
Updating product documentation
A product manager is updating a product page. They use scrape_page on the old page URL to grab the current content, then run map_site to see if any deep-linked supporting pages were missed, ensuring nothing is left out of the rewrite.
The Tradeoffs
Scraping one page at a time
Manually running scrape_page for every URL found on a site. This is slow, requires knowing every URL upfront, and quickly becomes unmanageable on sites that generate pages dynamically.
→
First, run map_site to generate a full sitemap of the domain. Then, use crawl_site to process the entire site in bulk, letting the server handle the link discovery and content extraction.
Relying on simple search APIs
Using a basic search API that only returns snippets or links. This leaves the agent unable to read the actual context or supporting details needed for RAG.
→
Use search_web. It combines web search results with full content scraping, giving your agent the necessary context to reason about the topic.
Ignoring site architecture
Telling the agent to scrape content without first knowing if the site uses subdomains or if certain pages are protected behind a login wall. The scrape can then fail or come back incomplete without an obvious error.
→
Always run map_site first. It discovers the site's full URL structure, letting you plan which specific endpoints you need to scrape or crawl.
When It Fits, When It Doesn't
Use this if your goal is to ingest web data for structured reasoning. You need to convert messy, unstructured HTML into clean, processable Markdown.
Use map_site when you need to know the boundaries of a website—you need the map before you start building.
Use search_web when your goal is quick research on a specific topic, not indexing a whole site.
Use scrape_page when you know the exact URL and only need the content from that single page.
Use crawl_site when you need to index an entire site. Don't use this if you only need one page's content; it's overkill.
Don't use this if your task is purely conversational and doesn't involve external data retrieval. For that, your agent needs a different tool.
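The guidance above can be written down as a small decision helper. This is only a sketch of the rules as stated in this section; `choose_tool` and its `target` labels are hypothetical names, not part of the server.

```python
from typing import Optional


def choose_tool(target: str, need_content: bool = True) -> Optional[str]:
    """Pick a Firecrawl MCP tool from what the task is aimed at.

    target: "page"  -> you know the exact URL
            "site"  -> you care about a whole domain
            "topic" -> you are researching a subject
            "none"  -> purely conversational, no web data needed
    """
    if target == "none":
        return None  # no external retrieval needed
    if target == "topic":
        return "search_web"
    if target == "page":
        return "scrape_page"
    if target == "site":
        # Structure only -> map_site; full content -> crawl_site.
        return "crawl_site" if need_content else "map_site"
    raise ValueError(f"unknown target: {target!r}")


print(choose_tool("site", need_content=False))
print(choose_tool("page"))
```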
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Firecrawl. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on every call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 4 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Available Capabilities
Web research used to be a nightmare of tabs and copy-pasting.
Before Firecrawl, getting data from the web meant opening a dozen tabs, clicking through cookie banners, and manually copying text into a spreadsheet. If the content was behind JavaScript, you often ended up with incomplete, messy data that required hours of cleanup.
Now, your agent simply calls `scrape_page` with a URL. The server handles the rendering, the anti-bot junk, and the extraction. You get clean, structured Markdown—ready for the LLM—in one go.
Firecrawl MCP Server: Full Web Indexing
Trying to build a knowledge base used to require tedious steps: first, listing all URLs, then calling a scraper for each one, and finally, stitching the content together. This was brittle and failed if any link was missed.
Now, you just run `crawl_site`. It handles the entire graph traversal and content extraction for you, and returns a job ID you can use to track the crawl.
Common Questions About Firecrawl MCP
How do I use `scrape_page` with Firecrawl?
Just pass the specific URL to scrape_page. It automatically extracts the clean Markdown from that page, ignoring scripts and banners.
Is `crawl_site` better than `map_site`?
map_site only returns links (the map). crawl_site actually scrapes the content from those links. Use map_site first if you just need to know the site structure, then use crawl_site if you need the data.
Can I scrape a whole domain with Firecrawl?
Yes, use crawl_site. This tool recursively crawls the site, extracting content from multiple pages. It's designed for full site indexing.
What is the best way to find information on a topic using Firecrawl?
Use search_web. It combines a Google-like search with automatic content extraction, giving you the full text from the top results, not just snippets.
How do I handle rate limits when I use `crawl_site`?
Rate limits are enforced automatically. If you exceed the limit, the API returns a 429 error, and your agent should retry with exponential backoff so the workflow continues without manual intervention.
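A minimal sketch of the exponential backoff mentioned above. How a 429 surfaces depends on your MCP client, so `RateLimited` is a hypothetical exception standing in for it, and `call_with_backoff` and its defaults are illustrative rather than anything the server prescribes.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised when a tool call comes back with HTTP 429."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Run `call()`, retrying on RateLimited with exponentially growing waits."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of retries, surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so parallel agents do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))


# Demo: a stand-in call that is rate limited twice, then succeeds.
attempts = {"n": 0}


def flaky_crawl():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimited()
    return "job-id-placeholder"


delays = []  # record waits instead of actually sleeping
result = call_with_backoff(flaky_crawl, sleep=delays.append)
print(result, len(delays))
```

Passing `sleep` as a parameter keeps the helper easy to test; in real use you would leave the default `time.sleep`.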
Does `scrape_page` handle complex JavaScript rendering?
Yes, it handles complex JavaScript rendering. Firecrawl automatically executes the page's JS to get the fully rendered content before extracting it to clean Markdown. This means you get the real content, not just the initial HTML.
What is the difference between `map_site` and `crawl_site`?
They serve different purposes. map_site returns a sitemap of all discovered links without scraping content. You use this first to understand the site structure; then, you run crawl_site to actually extract the content from those links.
How does `search_web` extract content from search results?
The tool performs a Google-like search and then scrapes the full content from the top results. It doesn't just give you links; it gives your agent the actual text needed for analysis, making fact-checking easy.
How does Firecrawl pricing work?
Firecrawl uses a credit-based system. You get 500 free lifetime credits to start (no credit card required). Base cost is 1 credit per page scraped. Advanced features like JSON extraction (+4 credits) or enhanced mode (+4 credits) consume additional credits per page. Paid plans start at $16/month with 3,000 monthly credits.
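To make the credit arithmetic concrete, here is a small sketch using only the figures quoted above. It assumes the per-page surcharges simply add to the base cost when combined, which this page does not state explicitly; the function names are hypothetical.

```python
BASE_CREDITS = 1    # per page scraped
JSON_EXTRA = 4      # JSON extraction surcharge per page
ENHANCED_EXTRA = 4  # enhanced mode surcharge per page


def credits_per_page(json_extraction: bool = False, enhanced: bool = False) -> int:
    """Credits consumed by one page, assuming surcharges stack additively."""
    cost = BASE_CREDITS
    if json_extraction:
        cost += JSON_EXTRA
    if enhanced:
        cost += ENHANCED_EXTRA
    return cost


def pages_within(credits: int, **features: bool) -> int:
    """How many whole pages a credit budget covers."""
    return credits // credits_per_page(**features)


print(pages_within(500))                                         # 500
print(pages_within(3000, json_extraction=True))                  # 600
print(pages_within(3000, json_extraction=True, enhanced=True))   # 333
```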
Can Firecrawl handle JavaScript-heavy websites?
Yes! Firecrawl renders pages in a full browser environment before extracting content — this means it handles React, Next.js, Angular, and any other JavaScript framework. It also automatically bypasses common anti-bot protections, removes cookie consent banners, and waits for dynamic content to load before extraction.
What formats does Firecrawl return?
Firecrawl can return content in multiple formats: Markdown (default and most popular for LLM consumption), HTML, raw HTML, structured JSON (with LLM-powered extraction), screenshots, links, and page metadata. You can request multiple formats in a single call.
Multi-server workflows that include Firecrawl MCP
Build Serverless Data Warehouses Using MCP
You scrape data into CSV files that nobody queries. Firecrawl extracts structured web data, Neon stores it in serverless PostgreSQL you can query with SQL, and Sheets visualizes the results.
MCP Servers for Self-Updating Research Bases
You spend 3 hours reading 40 articles to write one research brief. An AI agent with Firecrawl reads all 40 in 90 seconds, stores them semantically in Weaviate, and writes the brief in Notion with every source linked and every claim verified.
Use it with your favorite AI tools
Connect this server to Cursor, Claude, VS Code, and more.