ParseHub MCP. Control Web Scraping Runs via Chat Conversation
ParseHub connects advanced cloud scraping jobs directly into your AI workflow. List configured projects, dispatch headless runs, check crawler status in real time, and pull structured datasets via chat commands. Stop managing web scrapers through separate dashboards; control complex data collection right where you write.
Give Claude and any AI agent real-world access
View every web scraping project saved in your account, including their unique tokens and template details.
Tell the MCP to trigger a new headless scrape job for any specified project.
Start a scraping run that focuses on specific pages, bypassing the default starting URL for a project.
Get real-time updates on whether a scheduled scrape is queued, running, or completed successfully.
Retrieve the final structured JSON data from any completed scraping run for immediate use.
What AI agents can do with ParseHub's 10 tools
These tools let you manage the entire lifecycle of web scraping: listing projects, starting runs, tracking progress, and retrieving final, clean data payloads.
Make your AI actually useful.
Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.
Cancel Run
Stops a running or queued scrape job to free up cloud resources and prevent unnecessary charges.
Delete Run
Permanently removes old scraping run history and associated data, helping you clean...
Get Project
Retrieves the full configuration details for a specific web scraping project token.
Get Run Data
Downloads the final, structured JSON payload from a run only after it has been...
Get Run Details
Checks the current status of a specific scrape job to determine if it's waiting in...
Get Last Ready Data
Immediately fetches the latest completed data for a project without needing to track individual run tokens first.
List Projects
Lists all available web scraping projects in your account, providing unique tokens and status information.
List Runs
Provides a historical record of every run for a project, useful for auditing or...
Run Project
Initiates a new scrape job using the default start URL and template configured in an...
Run Project With Url
Starts a scraping run targeting a specific, custom web address while maintaining all...
Security and governance baked right in.
Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so no messy routing is required.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on each call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with ParseHub, then connect any of our 5,200+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 5,200+ others, all in one place
- Add new capabilities to your AI anytime you want
- Connections are secured and governed automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog weekly
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by ParseHub. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS CLOUD
- Cloud Hosted: Managed infra
- V8 Isolated: Sandboxed per request
- Zero-Trust Proxy: No stored credentials
- DLP Enforced: Policy on each call
- GDPR Compliant: EU data residency
- Token Compression: ~60% cost reduction
The Grind: Web Scraping Used To Be a Juggling Act
Today, getting structured data from a website means opening the scraping dashboard in one tab. You have to manually configure the starting URL, hit run, then switch tabs every ten minutes to check if it's still running. If you need to change the target site, you restart the whole process and repeat those clicks.
With this MCP, your agent handles the entire cycle. You tell it what you need, and it manages the complex headless browser automation in the background. The result? Clean, structured data payloads appear directly for your agent to use—no dashboard refreshing required.
ParseHub: Structured Data Extraction Via Conversation
Manual steps that disappear include navigating project tokens, monitoring status codes across different UIs, and manually downloading ZIP files just to get a JSON array. These are all abstracted away.
Now you simply command the action. You use `run_project` to kick off the job, check it with `get_run_details`, and when ready, pull the exact data using `get_run_data`. It's one simple conversation.
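Behind the chat, each of those three steps is an MCP tool call. The sketch below builds them as JSON-RPC `tools/call` requests, which is the message shape MCP clients send. The tool names come from this page. The argument names (`project_token`, `run_token`) and the placeholder values are assumptions, since the tools' input schemas are not documented here.

```python
import json
from itertools import count

_ids = count(1)  # simple source of JSON-RPC request ids


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names and token values are hypothetical placeholders.
conversation = [
    tool_call("run_project", {"project_token": "PROJECT_TOKEN"}),
    tool_call("get_run_details", {"run_token": "RUN_TOKEN"}),
    tool_call("get_run_data", {"run_token": "RUN_TOKEN"}),
]

for request in conversation:
    print(json.dumps(request))
```

In practice your AI client builds and sends these messages for you; the sketch only shows what "one simple conversation" translates to on the wire.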
What ParseHub MCP does for your AI
Web scraping used to mean logging into a dedicated dashboard, setting up parameters, hitting 'run,' then waiting for emails or refreshing pages until the data finally appeared. Now, you can manage that entire process inside your chat agent. This MCP lets you treat web crawling like any other function call.
You can list all your existing projects—including their start URLs and templates. Need new data? Just dispatch a run job on command, specifying which project to use or even overriding the default starting URL. The system tracks everything, telling you if the job is queued or running. When it’s done, you don't just get a 'Success' message; you pull down secure, structured JSON arrays containing all the scraped payloads, ready for your agent to process.
How to set up ParseHub MCP
The bottom line is, your agent handles the entire sequence: setup, execution, monitoring, and final data retrieval from a single conversation thread.
Subscribe to this MCP and provide your ParseHub API key.
Ask your agent to list available projects, or specify a custom URL, so it can identify the correct job parameters.
Once you confirm the run details, the MCP executes the scrape. You then use subsequent commands to track status until the data is ready for extraction.
Who uses ParseHub MCP
Data Engineers who are tired of switching between scraping tools and their primary workflow. Research Analysts needing to run academic paper extractors without manual dashboard interaction. Marketing Intelligence pros who need immediate competitor pricing data.
Trigger cloud scraping logic securely, then pipe the extracted JSON datasets directly into a subsequent processing tool or database connection.
Kick off academic paper extractors via chat and wait for confirmation before having the agent digest the resulting structured data.
Fetch the results of completed scrape runs that track competitor pricing, so the JSON arrays are immediately available for comparative analysis.
Benefits of connecting ParseHub MCP
You don't have to switch between the ParseHub dashboard and your agent. You trigger, monitor, and retrieve data—all within one chat session.
Need fresh data fast? Use get_last_ready_data to grab the absolute latest payload without having to track a specific run token first.
When you need to scrape different pages using the same template (like product categories), use run_project_with_url. It changes only the start page, not your extraction rules.
The system keeps track of everything. Use get_run_details to check if a job is queued or running without needing to refresh an external web app.
You can clean up old jobs and manage costs by using tools like cancel_run or permanently removing data with delete_run.
ParseHub MCP use cases
Monitoring Competitor Pricing Changes
A market analyst needs to know if a competitor changed its pricing structure. They ask the agent to run an extractor on the main product page, wait for get_run_details to confirm completion, and then use get_run_data to pull the structured JSON of all price points.
Processing a Batch of Articles
A research team has 50 articles on different websites. Instead of running 50 jobs manually, they ask the agent to use run_project_with_url for each unique URL, then collect all the resulting structured data into one payload.
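A rough sketch of how an agent might fan out that batch is shown below. It assumes a generic `call_tool(name, arguments)` function supplied by your MCP client. The argument names (`project_token`, `start_url`, `run_token`) and the response fields (`run_token`, `status`) are hypothetical, and `fake_call_tool` is an in-memory stand-in so the example runs without a server.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict], dict]


def start_batch(call_tool: ToolCaller, project_token: str, urls: list[str]) -> dict[str, str]:
    """Start one run per URL and return a URL -> run token mapping."""
    runs = {}
    for url in urls:
        result = call_tool(
            "run_project_with_url",
            {"project_token": project_token, "start_url": url},  # assumed argument names
        )
        runs[url] = result["run_token"]  # assumed response field
    return runs


def collect_batch(call_tool: ToolCaller, runs: dict[str, str]) -> dict[str, Any]:
    """Fetch data for every run that reports completion; skip the rest."""
    payload = {}
    for url, run_token in runs.items():
        details = call_tool("get_run_details", {"run_token": run_token})
        if details.get("status") == "complete":  # assumed status value
            payload[url] = call_tool("get_run_data", {"run_token": run_token})
    return payload


def fake_call_tool(name: str, arguments: dict) -> dict:
    """In-memory stand-in for a real MCP client (illustration only)."""
    if name == "run_project_with_url":
        return {"run_token": "run-for-" + arguments["start_url"]}
    if name == "get_run_details":
        return {"status": "complete"}
    if name == "get_run_data":
        return {"articles": [arguments["run_token"]]}
    raise ValueError(f"unexpected tool: {name}")


urls = ["https://example.com/a", "https://example.com/b"]
runs = start_batch(fake_call_tool, "PROJECT_TOKEN", urls)
combined = collect_batch(fake_call_tool, runs)
```

Starting every run first and collecting afterwards lets the runs proceed in parallel on the scraping side instead of waiting for each one in turn.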
Auditing Historical Scrapes
A data engineer needs proof of what was scraped last month. They ask the agent to list_runs, find a specific run ID, and confirm its contents using get_run_data before moving on.
Stopping an Overdue Job
A job gets stuck in an infinite loop. The user has the agent check the status via get_run_details, determines the job is stalled, and immediately calls cancel_run to free up resources.
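One way to make "stalled" concrete is a time budget: if a run is still active after a set number of seconds, cancel it. The sketch below is a minimal illustration under that assumption. The `call_tool` function, the `run_token` argument, and the `status` values `queued` and `running` are hypothetical names, and `fake_call_tool` only records calls so the example runs on its own.

```python
import time
from typing import Callable, Optional

ToolCaller = Callable[[str, dict], dict]

ACTIVE_STATUSES = {"queued", "running"}  # assumed status strings


def cancel_if_stalled(
    call_tool: ToolCaller,
    run_token: str,
    started_at: float,
    max_seconds: float,
    now: Optional[float] = None,
) -> bool:
    """Cancel the run if it is still active past its time budget.

    Returns True when cancel_run was called, False otherwise.
    """
    now = time.time() if now is None else now
    details = call_tool("get_run_details", {"run_token": run_token})
    still_active = details.get("status") in ACTIVE_STATUSES
    if still_active and now - started_at > max_seconds:
        call_tool("cancel_run", {"run_token": run_token})
        return True
    return False


calls = []


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Stand-in client that always reports a running job (illustration only)."""
    calls.append(name)
    return {"status": "running"} if name == "get_run_details" else {}


# A run that started two hours ago against a one-hour budget gets cancelled.
cancelled = cancel_if_stalled(fake_call_tool, "RUN_TOKEN", started_at=0, max_seconds=3600, now=7200)
```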
ParseHub MCP tradeoffs
What to watch out for, and the recommended way to handle each one.
Assuming data is ready.
The user asks the agent for the final JSON payload right after running a job. The agent fails because the run status is still 'queued' or 'running', and get_run_data cannot be called yet.
Always check the progress first. Use get_run_details to monitor the job until the system confirms it is complete. Only then should you use get_run_data.
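The check-then-fetch rule can be sketched as a small polling loop. This is an illustration, not the MCP's own implementation: `call_tool`, the `run_token` argument, and the status strings (`complete`, `cancelled`, `error`) are assumptions, and `fake_call_tool` simulates a job moving from queued to complete.

```python
import time
from typing import Callable

ToolCaller = Callable[[str, dict], dict]

TERMINAL_FAILURES = {"cancelled", "error"}  # assumed failure statuses


def wait_for_run_data(
    call_tool: ToolCaller,
    run_token: str,
    poll_seconds: float = 10.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll get_run_details and call get_run_data only once the run is complete."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = call_tool("get_run_details", {"run_token": run_token}).get("status")
        if status == "complete":
            return call_tool("get_run_data", {"run_token": run_token})
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"run {run_token} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_token} still {status!r} after {timeout_seconds}s")
        sleep(poll_seconds)


statuses = iter(["queued", "running", "complete"])
calls = []


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Stand-in client whose run completes on the third status check."""
    calls.append(name)
    if name == "get_run_details":
        return {"status": next(statuses)}
    return {"prices": [19.99]}


# The sleep is replaced with a no-op so the demo finishes instantly.
data = wait_for_run_data(fake_call_tool, "RUN_TOKEN", sleep=lambda seconds: None)
```

The timeout keeps the loop from waiting forever on a run that never finishes, at which point cancelling the run is usually the next step.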
Ignoring project scope.
The user wants to scrape data from a new site but uses the default run command, which only targets the original project's starting URL and template.
If you need to target a completely different page or set of pages while keeping the same scraping rules, use run_project_with_url. This overrides the default start address.
Letting old runs pile up.
A user repeatedly runs jobs without cleaning up old results, leading to a massive storage quota bill and confusion about which data is current.
Use list_runs first to identify the specific historical run you need. When finished with an old job, use delete_run to permanently free up that stored payload.
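A cleanup routine along these lines could keep only the most recent runs. It is a sketch under several assumptions: a generic `call_tool` function, a `list_runs` response shaped like `{"runs": [...]}`, and per-run `run_token` and ISO 8601 `start_time` fields. None of these shapes are documented on this page, so adjust them to what the tools actually return.

```python
from typing import Callable

ToolCaller = Callable[[str, dict], dict]


def prune_runs(call_tool: ToolCaller, project_token: str, keep: int = 3) -> list[str]:
    """Delete all but the `keep` most recent runs; return the deleted run tokens."""
    runs = call_tool("list_runs", {"project_token": project_token})["runs"]
    # ISO 8601 timestamps in the same format sort correctly as plain strings.
    newest_first = sorted(runs, key=lambda run: run["start_time"], reverse=True)
    deleted = []
    for run in newest_first[keep:]:
        call_tool("delete_run", {"run_token": run["run_token"]})
        deleted.append(run["run_token"])
    return deleted


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Stand-in client with four stored runs (illustration only)."""
    if name == "list_runs":
        return {
            "runs": [
                {"run_token": "r1", "start_time": "2024-01-01T00:00:00"},
                {"run_token": "r3", "start_time": "2024-01-03T00:00:00"},
                {"run_token": "r2", "start_time": "2024-01-02T00:00:00"},
                {"run_token": "r4", "start_time": "2024-01-04T00:00:00"},
            ]
        }
    return {}


deleted = prune_runs(fake_call_tool, "PROJECT_TOKEN", keep=2)
```

Because delete_run is permanent, it is worth having the agent show the list of tokens it plans to delete and wait for confirmation before running a routine like this.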
When to use ParseHub MCP
Use this MCP if your primary goal is automated, multi-step web data extraction and structured JSON output. You're working with content on the public internet—like product pages, competitor sites, or academic journals—and you need to run complex, headless browser scraping jobs without leaving your AI chat interface. This is a full lifecycle tool: it lets you list projects, manage runs, check status (get_run_details), and finally pull the data (get_run_data).
Don't use this if:
1. You are extracting data from a database (use a dedicated SQL/NoSQL connector).
2. You just need to send a simple message or write text (use a messaging MCP).
3. You only need to validate the format of data you already have in memory. For pure schema validation, use a type-safe tool like Pydantic instead.
Frequently asked questions about ParseHub MCP
How do I start a scrape if I want to use different pages?
You use the run_project_with_url tool. This lets you target custom URLs while keeping all of your project's original scraping rules and template definitions intact.
Can ParseHub MCP list what projects I already have?
Yes, use the list_projects tool. It shows every web scraping project you’ve set up, giving you the unique tokens needed for subsequent commands.
What if my scrape job fails? Can I stop it?
You can monitor the status using get_run_details. If it's stalled or taking too long, use the cancel_run tool to safely stop the operation and free up resources.
How do I get data from a run that finished yesterday?
First, use list_runs to find the specific ID. Once you have the ID for a completed job, use get_run_data to pull down the structured JSON payload.
Do I need an API key for ParseHub MCP?
Yep. You must subscribe and provide your ParseHub API Key during setup so the agent can authenticate and manage cloud scraping jobs on your behalf.