# WebScrapingAPI MCP

> WebScrapingAPI delivers industrial-grade web data extraction directly through your AI client. Scrape raw HTML from any URL using datacenter proxies. Capture JavaScript-rendered content by running dynamic pages through a headless browser. Pull structured results, like product prices and search engine snippets (Google, Bing, Yandex), all via natural conversation.

## Overview
- **Category:** developer-tools
- **Price:** Free
- **Tags:** html-parsing, proxy-rotation, javascript-rendering, data-extraction, serp-data, headless-browser

## Description

Stop copy-pasting data from websites into spreadsheets. This MCP lets you treat the live web as a database. You tell your agent, whether it runs in Cursor or Claude, what you need: 'Give me the price and review count for this product on Walmart.' It handles the messy work of navigating dynamic sites, getting past anti-bot measures with residential proxies, and structuring the data automatically.

It doesn't matter whether a site loads its content with JavaScript or you just need raw HTML. Ask your agent to scrape the page, and it returns the fully rendered state. Need competitive pricing? Ask for product details from major e-commerce sites like Amazon as clean JSON. Want to see what ranks for a query? It gathers structured search results from Google, Bing, and Yandex.

Connecting this MCP through Vinkius gives you all of these capabilities, from basic scraping to advanced data parsing, in one place. Your AI client can gather large datasets or verify competitor pricing without you writing a single line of scraper code.

## Tools

### custom_api_scrape
Executes a scrape with advanced options such as geo-targeting, specific sessions, or custom headers.

### scrape_and_auto_extract
Scrapes news or product pages and automatically extracts structured data from the content.

### scrape_ecommerce_product
Extracts price, title, and reviews as clean JSON from Amazon, Walmart, or similar stores.

### scrape_static_html
Retrieves the raw HTML structure from any target URL using datacenter proxies.

### scrape_js_rendered
Scrapes dynamic pages by simulating a full headless browser render, capturing all JavaScript-generated elements.

### scrape_as_mobile
Runs the scrape as if it came from a mobile device, to mimic real user access patterns.

### scrape_via_residential_proxy
Uses residential proxies for high anonymity, helping bypass aggressive bot detection systems.

### search_bing_serp
Retrieves structured search results from Bing.

### search_google_serp
Retrieves structured search results from Google for a provided query string.

### search_yandex_serp
Retrieves structured search results from Yandex.

## Prompt Examples

**Prompt:** 
```
Scrape the rendered HTML of 'https://example.com/dynamic-dashboard'.
```

**Response:** 
```
I've scraped the dynamic page using a headless browser. Here is the rendered content, including all JavaScript-generated elements and data points from the dashboard.
```

**Prompt:** 
```
Search Google for 'best wireless noise cancelling headphones' and return structured results.
```

**Response:** 
```
I found the top Google SERP results for your query. 1. 'Sony WH-1000XM5 review' (https://...), 2. 'Bose QuietComfort Ultra' (https://...), 3. 'Apple AirPods Max' (https://...). Would you like to see the snippets or ads associated with these?
```

**Prompt:** 
```
Get the price and rating for the product at 'https://amazon.com/dp/B09XXX'.
```

**Response:** 
```
Successfully extracted product data from Amazon: Name: 'AcousticPro Wireless', Price: '$299.00', Rating: '4.8/5 stars (1,250 reviews)'. Would you like the full JSON extract?
```

## Capabilities

### Extract Product Data
Pull structured details like price, title, and reviews from major e-commerce sites into clean JSON.

### Handle Dynamic Pages
Capture the full content of modern websites that rely on JavaScript to load their data using a headless browser.

### Scrape Raw Content
Retrieve basic HTML structure from any target URL using high-capacity datacenter proxies.

### Get Search Results
Fetch structured results, including organic listings and ads, from Google, Bing, or Yandex search engines.

### Customize Scrape Parameters
Run scrapes with advanced targeting options like specific geographical locations, session management, or custom headers.

## Use Cases

### Monitoring Competitor Pricing
An e-commerce manager wants to know whether three rivals changed their prices on Amazon. They ask their agent to run `scrape_ecommerce_product` against each competitor URL and get a clean JSON object for each one.
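
The comparison step in this workflow can be sketched in a few lines of Python. This is a minimal illustration, not the connector's own code: it assumes each `scrape_ecommerce_product` result carries a display price string like the `'$299.00'` shown in the prompt example, and the snapshot dictionaries, URLs, and helper names (`parse_price`, `price_changes`) are hypothetical.

```python
from decimal import Decimal


def parse_price(text: str) -> Decimal:
    """Convert a display price such as '$1,299.00' into a Decimal."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return Decimal(cleaned)


def price_changes(previous: dict[str, str], current: dict[str, str]) -> dict[str, Decimal]:
    """Return the price delta per URL, for URLs present in both snapshots."""
    changes = {}
    for url, new_text in current.items():
        if url not in previous:
            continue  # no earlier price to compare against
        delta = parse_price(new_text) - parse_price(previous[url])
        if delta != 0:
            changes[url] = delta
    return changes


# Hypothetical snapshots: URL -> price string taken from each tool result.
previous = {
    "https://example.com/rival-a": "$299.00",
    "https://example.com/rival-b": "$1,049.99",
    "https://example.com/rival-c": "$89.50",
}
current = {
    "https://example.com/rival-a": "$279.00",
    "https://example.com/rival-b": "$1,049.99",
    "https://example.com/rival-c": "$94.50",
}

changes = price_changes(previous, current)
for url, delta in changes.items():
    print(f"{url}: {delta:+}")
```

Using `Decimal` instead of `float` keeps the currency arithmetic exact, so an unchanged price never shows up as a tiny rounding difference.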

### Analyzing Search Trends
An SEO specialist needs to see which pages rank for 'best running shoes' across different search engines. They ask their agent to run `search_google_serp` and then follow up with `search_bing_serp` to compare the structured results.
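
Comparing two engines for the same query might look like the sketch below. It assumes each organic result is a dictionary with `position` and `url` keys; those field names, the sample data, and the helper functions are illustrative assumptions rather than the documented output of `search_google_serp` or `search_bing_serp`.

```python
def rank_by_url(results: list[dict]) -> dict[str, int]:
    """Map each result URL to its position on the results page."""
    return {item["url"]: item["position"] for item in results}


def compare_rankings(first: list[dict], second: list[dict]) -> list[tuple[str, int, int]]:
    """List URLs found in both result sets as (url, rank_in_first, rank_in_second)."""
    first_ranks = rank_by_url(first)
    second_ranks = rank_by_url(second)
    shared = first_ranks.keys() & second_ranks.keys()
    # Order the shared URLs by their rank in the first result set.
    return [
        (url, first_ranks[url], second_ranks[url])
        for url in sorted(shared, key=lambda u: first_ranks[u])
    ]


# Hypothetical organic results for the same query on two engines.
google_results = [
    {"position": 1, "url": "https://example.com/shoes"},
    {"position": 2, "url": "https://example.org/review"},
    {"position": 3, "url": "https://example.net/guide"},
]
bing_results = [
    {"position": 1, "url": "https://example.org/review"},
    {"position": 2, "url": "https://example.com/shoes"},
    {"position": 3, "url": "https://example.edu/lab"},
]

comparison = compare_rankings(google_results, bing_results)
for url, google_rank, bing_rank in comparison:
    print(f"{url}: Google #{google_rank}, Bing #{bing_rank}")
```

URLs that appear on only one engine are left out here; listing them separately would show where each engine's coverage differs.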

### Gathering Market Research Data
A data scientist needs a dataset of news articles. They tell their agent to run `scrape_and_auto_extract` on 50 different news links, getting clean, usable content without manual parsing.

### Testing Dynamic Web Applications
A developer needs to verify that a complex dashboard page renders correctly. They ask the agent to run `scrape_js_rendered` on the test URL and check that the expected elements appear in the captured output.

## Benefits

- Stop worrying about JavaScript. If a website loads its content dynamically, the `scrape_js_rendered` tool uses a headless browser to capture JavaScript-generated elements, so the data you get reflects the rendered page.
- Get blocked less often. The `scrape_via_residential_proxy` tool routes requests through residential proxies for high anonymity, which helps bypass aggressive bot detection.
- Go beyond simple HTML dumps. The `scrape_ecommerce_product` tool automatically extracts key product details like price, title, and ratings into structured JSON.
- Get comprehensive search data from multiple engines. You can compare results by running `search_google_serp`, `search_bing_serp`, and `search_yandex_serp` against the same query.
- Save time on complex requests with advanced options. The `custom_api_scrape` tool lets you specify geo-targets, sessions, and headers to fine-tune each scrape.

## How It Works

You talk to your agent naturally, telling it what data you need, and the MCP handles accessing, rendering, and structuring that information for you.

1. Subscribe to the WebScrapingAPI MCP and enter your API key in Vinkius.
2. Activate the connector in your AI client (e.g., Cursor).
3. Ask your agent a question like, 'Find all product details from this Amazon link,' and it runs the scrape.
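
Behind step 3, an MCP client sends the server a JSON-RPC 2.0 request using the protocol's `tools/call` method. The sketch below builds such a message in Python. The envelope follows the MCP specification, but the `url` argument name is an assumption: the tool's real input schema is whatever the server advertises through `tools/list`, and `build_tool_call` is a hypothetical helper, not part of any SDK.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# `url` is an assumed argument name; check the tool's published input schema.
request = build_tool_call(
    request_id=1,
    tool_name="scrape_ecommerce_product",
    arguments={"url": "https://amazon.com/dp/B09XXX"},
)
print(json.dumps(request, indent=2))
```

In normal use the AI client builds and sends this message for you; seeing its shape is mainly useful when debugging a connector or reading client logs.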

## Frequently Asked Questions

**Can WebScrapingAPI handle JavaScript-rendered websites?**
Yes. Use the `scrape_js_rendered` tool. It renders the page in a headless browser so that content loaded by JavaScript is captured in your scrape.

**How do I get structured search results using WebScrapingAPI?**
Use the dedicated SERP tools: `search_google_serp`, `search_bing_serp`, or `search_yandex_serp`. They return clean, structured data on ads and organic listings, not just a text summary.

**Is WebScrapingAPI good for scraping e-commerce sites?**
Yes. The `scrape_ecommerce_product` tool is designed specifically to pull key product details, like price and review scores, into easy-to-use JSON.

**What if my scrapes get blocked by the site?**
Use `scrape_via_residential_proxy`. This feature cycles through real user IP addresses, giving you high anonymity that helps bypass aggressive bot detection systems.
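
One way to apply this advice is a simple fallback: try the datacenter-proxy tool first and switch to the residential one only when the request is blocked. The sketch below assumes a caller-supplied `call_tool` function that raises a `ScrapeBlocked` error on a block. Both names, the `url` argument, and the fallback order are hypothetical choices for illustration, not documented behavior of the connector.

```python
from typing import Callable

# Tools to try in order; an illustrative choice, not a documented policy.
FALLBACK_ORDER = ["scrape_static_html", "scrape_via_residential_proxy"]


class ScrapeBlocked(Exception):
    """Hypothetical error raised by `call_tool` when the site blocks a request."""


def scrape_with_fallback(url: str, call_tool: Callable[[str, dict], str]) -> tuple[str, str]:
    """Try each tool in FALLBACK_ORDER; return (tool_name, html) from the first success."""
    for tool_name in FALLBACK_ORDER:
        try:
            # `url` is an assumed argument name.
            return tool_name, call_tool(tool_name, {"url": url})
        except ScrapeBlocked:
            continue  # blocked: move on to the next tool
    raise ScrapeBlocked(f"every tool was blocked for {url}")


def fake_call_tool(tool_name: str, arguments: dict) -> str:
    """Stand-in for a real MCP client call: blocks datacenter traffic only."""
    if tool_name == "scrape_static_html":
        raise ScrapeBlocked("403 from datacenter IP")
    return f"<html>fetched {arguments['url']}</html>"


used_tool, html = scrape_with_fallback("https://example.com/pricing", fake_call_tool)
print(used_tool)
```

Passing `call_tool` in as a parameter keeps the fallback logic independent of any particular MCP client, which also makes it easy to test with a stub like `fake_call_tool`.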

**Does WebScrapingAPI support different countries/regions?**
Yes. You can use `custom_api_scrape` to execute scrapes with specific geo-targeting options, allowing you to monitor data from anywhere in the world.