Crawlbase MCP for AI. Extract Data From Any Website, No Code Required

Works with every AI agent you already use

Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client

Connect to your AI in seconds.

Crawlbase gives your AI agent full control over web data extraction. It handles complex sites, including JavaScript-rendered pages and major platforms such as Amazon, LinkedIn, and Facebook.

You can bypass security measures and capture structured data from almost any public website.

What your AI can do

Capture Web Screenshots

Run automated checks that generate permanent links to visual snapshots of any web page.

Extract Structured JSON Data

Force raw website outputs into precise, structured JSON formats for immediate use by your agent.

Scrape JavaScript Pages

Retrieve content from modern websites that load data dynamically using JavaScript.

Target Social Networks

Specialized extraction tools for key platforms like Amazon, LinkedIn, and Facebook.

Analyze Search Results

Identify data from Google search results pages (SERPs) while bypassing CAPTCHAs.

Create Custom Proxies

Generate and provision custom proxy endpoints with specific headers and crawling logic for high-availability requests.

Crawlbase: 10 Web Scraping Utilities

These ten tools cover the main web scraping tasks: extracting specific data types, capturing screenshots for validation, and scraping platforms such as Amazon and LinkedIn.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Scrape HTML

Performs basic web scraping, fetching a page's HTML content through datacenter proxies.

Scrape JS Rendered

Accesses and pulls data from modern websites that load their content dynamically using JavaScript.

Scrape JSON Format

Converts complex, messy web data into clean, structured JSON objects.

Get Screenshot Link

Runs automated checks to provide a permanent URL link for visual snapshots of any...

Scrape Amazon

Extracts specific product details and data points from Amazon e-commerce listings.

Scrape LinkedIn

Retrieves detailed professional profile information matching LinkedIn's structural constraints.

Scrape Facebook

Retrieves structured information directly from active Facebook social pages.

Scrape Google SERP

Identifies and collects data points spanning Google search results, bypassing...

Scrape Twitter

Fetches mapped, structured data points from Twitter (X) graph profiles and timelines.

Custom Scrape

Generates custom proxy endpoints that can be used for highly reliable, targeted data...

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so no custom routing is required.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The Crawlbase integration is available immediately — no restart needed.
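If you script your client setup instead of pasting the URL by hand, the connector URL from the steps above can be assembled programmatically. This is a minimal sketch: the helper name `vinkius_endpoint` is hypothetical (it is not part of any Vinkius SDK), and it assumes the token is a single path segment, as the URL pattern above suggests.

```python
from urllib.parse import quote

BASE_URL = "https://edge.vinkius.com"  # host taken from the endpoint pattern above


def vinkius_endpoint(token: str) -> str:
    """Return the MCP endpoint URL for a token (hypothetical helper)."""
    token = token.strip()
    # Reject the unreplaced placeholder and obviously malformed values.
    if not token or token == "[YOUR_TOKEN_HERE]" or "/" in token:
        raise ValueError("replace [YOUR_TOKEN_HERE] with your real token")
    # Percent-encode the token so it stays a single, safe path segment.
    return f"{BASE_URL}/{quote(token, safe='')}/mcp"
```

Failing fast on the placeholder catches the most common copy-paste mistake before any client tries to connect.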

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "crawlbase": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Crawlbase tools with full Vinkius guardrails applied.

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and run MCP: Open User Configuration (or create .vscode/mcp.json in your workspace).

Add Server Config

Add the Vinkius endpoint configuration under the servers key of your mcp.json file:

"servers": {
  "crawlbase": {
    "type": "http",
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

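Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius over Streamable HTTP; get_tools() is a coroutine
client = MultiServerMCPClient({"crawlbase": {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable_http",
}})
tools = asyncio.run(client.get_tools())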

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

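# Requires: pip install 'crewai-tools[mcp]'
from crewai import Agent
from crewai_tools import MCPServerAdapter

# Connect securely to Vinkius over Streamable HTTP
server_params = {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable-http",
}

with MCPServerAdapter(server_params) as vinkius_tools:
    # Assign the discovered tools to the agent
    researcher = Agent(
        role="Data Researcher",
        goal="Collect structured web data with the Crawlbase tools",
        backstory="An analyst who gathers data from public websites.",
        tools=vinkius_tools,
    )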

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Crawlbase, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Crawlbase. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 10 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.
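For readers curious what "standardized" means on the wire: an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a message and sends nothing. The argument name `url` is an assumption for illustration; the real input schema of each Crawlbase tool is whatever the server advertises through `tools/list`.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def tools_call(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# "url" is a hypothetical argument name used for illustration only.
request = tools_call("scrape_html", {"url": "https://example.com"})
```

Your AI client builds and sends these messages for you, which is why no middleware or custom integration is needed on your side.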

Copying and pasting data from complex websites is slow.

Right now, if you need a list of product specs or social media profiles, you manually visit the site. You copy the title here, check the price there, then open another tab to grab the description. It's tedious clicking through tabs and copying blocks of text into a spreadsheet.

With this MCP connected via Vinkius, your agent does it all for you. You ask for the data point, say, 'Get me the title and rating from that page', and you get clean, structured JSON back. It handles the navigation and extraction in one step.

Web Scraping Utilities

The manual effort of figuring out if a site is static or dynamic, dealing with rate limits, and translating messy HTML into useful data stops completely. You don't worry about the underlying code; you just worry about what information you need.

It’s simple: tell your agent the goal, and it runs the right tools—whether that’s `scrape_amazon` or a general `scrape_html` call—to get the data.
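In practice the model chooses a tool from the descriptions the server advertises. As a rough illustration of that choice, here is a deterministic sketch that maps a URL to one of the tool names listed on this page. The domain table and the `needs_js` flag are assumptions made for the example, not the actual routing logic of Crawlbase or Vinkius.

```python
from urllib.parse import urlparse

# Hypothetical mapping from domains to the dedicated platform tools.
PLATFORM_TOOLS = {
    "amazon.com": "scrape_amazon",
    "linkedin.com": "scrape_linkedin",
    "facebook.com": "scrape_facebook",
}


def choose_tool(url: str, needs_js: bool = False) -> str:
    """Pick a tool name for a URL; fall back to the generic scrapers."""
    host = (urlparse(url).hostname or "").lower()
    for domain, tool in PLATFORM_TOOLS.items():
        # Match the domain itself or any subdomain such as www.amazon.com.
        if host == domain or host.endswith("." + domain):
            return tool
    return "scrape_js_rendered" if needs_js else "scrape_html"
```

Checking for a leading dot before the domain keeps look-alike hosts such as `notamazon.com` from being routed to the Amazon tool.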

Support: 24/7, support@vinkius.com

What your AI can actually do with this

Need to get data off the web? This MCP connects Crawlbase directly to your AI client, letting you take over tricky scraping jobs through natural conversation. Forget writing complex code or spending hours debugging anti-bot walls. You just ask for the data—the price list from a competitor's site, the profiles of key people on LinkedIn, or the specific specs of an Amazon product—and it handles the rest.

It even figures out how to read content hidden behind JavaScript and tackles search results that are constantly changing. Because Vinkius hosts this MCP in their catalog, you connect once and your agent gets access to all these web scraping capabilities. You’ll get clean JSON outputs, screenshots for validation, or full site crawls, all without touching a single line of code.

Crawlbase-MCP - Web Scraping and Data Extraction
Built, hosted, and managed by Vinkius

Server ID: 019d757e-14a6-703b-af0a-60f917277e59

Vinkius Inspector: Compliance Grade A+, Score 100/100

What Changes When You Connect

Get structured data without scripting. Instead of writing complex Python code to handle different site structures, you just ask your agent for the JSON format using scrape_json_format.

Handle dynamic sites easily. If a website loads its content via JavaScript—the kind of thing that breaks simple scrapers—this MCP uses specialized tools like scrape_js_rendered to get it anyway.

Bypass security challenges. Stop hitting CAPTCHAs or rate limits; the system handles search engine discovery and proxy management, even giving you custom endpoints with custom_scrape.

Target major platforms efficiently. Instead of manually copying and pasting from LinkedIn pages or Amazon listings, dedicated tools like scrape_linkedin and scrape_amazon pull out clean, specific data points.

Validate your work instantly. Need proof the page was scraped correctly? Use get_screenshot_link to capture a visual snapshot of exactly what your agent saw on the target site.

See it in action

Competitor Price Monitoring

A growth team needs daily price updates for five key products across Amazon. Instead of manually visiting ten different listings and entering data into a spreadsheet, they prompt their agent: 'Run scrape_amazon on these URLs.' They get a clean JSON file with all prices and ratings.

Talent Scouting

A recruiter needs to identify all professionals with specific titles from a list of companies. Instead of navigating dozens of LinkedIn profiles, they use the agent with scrape_linkedin to build a structured database of names and roles in minutes.

Deep Web Research

A researcher needs data from an old university site that doesn't display content until you run specific scripts. They use the agent, which activates scrape_js_rendered, ensuring no hidden or dynamically loaded data points are missed.

Search Engine Intelligence

A marketing professional needs to track how search results change over time. Instead of manually running Google searches and copying titles, they use the agent with scrape_google_serp for structured, repeatable data collection.
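Tracking change over time comes down to comparing snapshots. Here is a minimal sketch, assuming each snapshot has already been reduced to an ordered list of result URLs. The output schema of `scrape_google_serp` is not shown on this page, so that shape, like the sample data, is an assumption.

```python
from typing import Optional


def rank_changes(before: list[str], after: list[str]) -> dict[str, Optional[int]]:
    """Map each URL in `after` to its rank change versus `before`.

    Positive values mean the URL moved up; None marks a new entry.
    """
    old_rank = {url: pos for pos, url in enumerate(before, start=1)}
    changes: dict[str, Optional[int]] = {}
    for new_pos, url in enumerate(after, start=1):
        prev = old_rank.get(url)
        changes[url] = None if prev is None else prev - new_pos
    return changes


# Illustrative data only, not real search results.
monday = ["a.example", "b.example", "c.example"]
tuesday = ["b.example", "a.example", "d.example"]
changes = rank_changes(monday, tuesday)
# changes == {"b.example": 1, "a.example": -1, "d.example": None}
```

Because the agent returns structured results on every run, a comparison like this can be repeated daily without any manual copying.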

The honest tradeoffs

Building a scraper from scratch

Anti-pattern

Trying to write custom Python logic using libraries like Beautiful Soup or Scrapy just because the site is complex.

The Fix

Don't build it. Use this MCP. If you need basic HTML content, run scrape_html. For dynamic data, use scrape_js_rendered. Always start with the specialized tools.

Ignoring anti-bot measures

Anti-pattern

A simple script fails every time it hits a CAPTCHA or gets blocked after three requests.

The Fix

Use custom_scrape to provision custom proxies and manage the request payload, allowing your agent to bypass common rate limits.

Handling messy output

Anti-pattern

Receiving raw HTML snippets that are difficult to parse or contain inconsistent data types.

The Fix

Run scrape_json_format on the result. It forces the unstructured content into a predictable, easy-to-use JSON structure.
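To make "inconsistent data types" concrete, this is the kind of cleanup the tool is meant to spare you. The sketch is a hand-written normalizer for three hypothetical fields (`title`, `price`, `rating`). It does not reproduce what `scrape_json_format` does internally.

```python
import json
import re
from typing import Optional


def to_number(raw: object) -> Optional[float]:
    """Extract the first number from values like '$1,299.00' or '4.5 out of 5'."""
    if isinstance(raw, (int, float)):
        return float(raw)
    match = re.search(r"-?\d[\d,]*(?:\.\d+)?", str(raw))
    # Drop thousands separators before converting to float.
    return float(match.group().replace(",", "")) if match else None


def normalize(record: dict) -> str:
    """Return a JSON string with a trimmed title and numeric price/rating."""
    clean = {
        "title": str(record.get("title", "")).strip(),
        "price": to_number(record.get("price")),
        "rating": to_number(record.get("rating")),
    }
    return json.dumps(clean)
```

Every site needs its own variant of rules like these, which is the maintenance burden the structured-output tool takes off your hands.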

Questions you might have

How does Crawlbase MCP handle JavaScript rendered content?

It uses specialized tools like scrape_js_rendered. This means it doesn't just read the initial HTML; it waits for the page to fully load data using JS before extracting the information.

Can I use scrape_google_serp with my AI agent?

Yes. scrape_google_serp allows your agent to identify and pull structured results from Google search pages, which is necessary for repeatable SEO research without manual searching.

Which tool should I use if the data is messy?

If you get raw or inconsistent web output from any scraping attempt, run scrape_json_format. This forces the complex content into a predictable JSON structure your agent can work with.

Is scrape_linkedin good for professional data collection?

Yes. It’s designed to retrieve detailed profile information while respecting LinkedIn's structural constraints, making it reliable for building contact lists or talent databases.

Before using `custom_scrape`, what credentials do I need to set up a proxy payload?

You'll need your Crawlbase Normal Token. This token authenticates your connection and allows the agent to provision highly-available custom proxies, ensuring reliable payloads for all of your web crawling tasks.

If my AI agent hits rate limits while using `scrape_html`, how does Crawlbase handle it?

The MCP manages this by utilizing its specialized proxy list and dedicated algorithms. It handles IP rotation and includes CAPTCHA solving, keeping your data collection flowing even when sites try to block you.

When I use `get_screenshot_link`, what is the purpose of capturing a web snapshot?

The screenshot link generates a visual record of the page exactly as it appeared. This lets you validate the content extracted by other tools, confirming precisely what the headless engine saw before processing it into structured data.

Does `scrape_facebook` handle complex or nested social page structures?

Yes. This tool applies extraction rules specific to Facebook pages and returns the content of active public pages as structured data, working around the constraints that typically come with scraping social media at scale.

When should I use the JavaScript (JS) Token versus the Normal Token?

Use the Normal Token for fast, static HTML extraction. Switch to the JavaScript Token when the target site uses frameworks like React or Angular, where content is rendered dynamically in the browser. The 'scrape_js_rendered' tool requires the JS Token to function.
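The same choice can be written down as a rule. This sketch assumes the public Crawlbase Crawling API request format (`https://api.crawlbase.com/` with `token` and `url` query parameters). The function name and the `needs_js` flag are hypothetical, and the MCP server may wire the tokens differently behind the scenes.

```python
from urllib.parse import urlencode

API_BASE = "https://api.crawlbase.com/"  # assumed Crawling API endpoint


def crawl_request_url(target: str, normal_token: str, js_token: str,
                      needs_js: bool = False) -> str:
    """Build a request URL, using the JS token only for JS-rendered pages."""
    token = js_token if needs_js else normal_token
    # urlencode percent-encodes the target so it survives as one parameter.
    return API_BASE + "?" + urlencode({"token": token, "url": target})
```

Defaulting to the Normal Token keeps static pages on the faster path, and the JS Token is used only when a page needs browser rendering.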

Can my agent bypass CAPTCHAs while scraping Google or LinkedIn?

Yes. Crawlbase is built to handle CAPTCHAs and blocks natively. When you use specialized tools like 'scrape_google_serp' or 'scrape_linkedin', the agent routes your requests through Crawlbase's advanced proxy infrastructure to ensure successful data extraction.

How do I get a structured JSON response instead of raw HTML?

Use the 'scrape_json_format' tool or the specialized scraper tools (Amazon, LinkedIn, etc.). These trigger Crawlbase's auto-extraction pipelines, which analyze the page structure and return specific data fields in a clean JSON format.
