Apify MCP for AI. Direct web data extraction via conversation.

Q: How do I list all available scrapers using Apify MCP?

You use the list_actors tool. This shows you every scraper bot (Actor) you have access to, giving you their IDs so you know exactly what job they're built for.

Q: What is the difference between run_actor and run_actor_sync?

The difference is timing. run_actor starts a background process, which is best for long jobs because it returns immediately. run_actor_sync blocks your agent until the job finishes; use this only for very short tasks (under five minutes).

Q: How do I get the final data from an Apify run?

After a scrape is complete, use get_dataset_items. This pulls all the structured records and provides them to your agent in usable JSON format.

Q: Can I stop a running scrape with abort_run?

Yep. You can call abort_run anytime you need to halt a job, which is critical if the scraper starts pulling junk data or exceeds your budget.

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

See Vinkius in Action

Works with every AI agent you already use

…and any MCP-compatible client

Connect to your AI in seconds.

Apify MCP connects your AI agent directly to a full-stack web scraping platform. You can list available scrapers, run bots asynchronously or synchronously, and pull structured data records in raw JSON format, all through natural conversation.

What your AI can do

Abort run

Stops a running Apify scraper job immediately if the scrape is going off track or if enough data has already been collected.

Get account limits

Checks your current consumption and subscription limits to make sure you don't hit an overage charge.

Get dataset items

Exports the structured JSON data from a completed Apify dataset, supporting large bulk downloads by page limit.

+ 7 more capabilities included

Discover available scrapers

Lists every scraper bot (Actor) configured in your Apify account so you know what data is accessible.

Initiate scraping jobs

Starts a web scrape, either waiting for it to finish immediately or running it in the background for long-term monitoring.

Pull structured data

Retrieves the full dataset of scraped records as JSON objects after a job completes.

Control ongoing jobs

Allows you to stop runaway scrapes or tell an active scraper to crawl new URLs it finds.

Check system limits

Gives you a status report on your account's usage, including compute unit consumption and proxy bandwidth.

Apify: 10 Tools for Web Data Operations

These tools give your agent the power to monitor, control, and retrieve data from every stage of web scraping—from listing bots to downloading final datasets.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Apify on Vinkius

Abort Run

Stops a running Apify scraper job immediately if the scrape is going off track or if enough data has already been collected.

Get Account Limits

Checks your current consumption and subscription limits to make sure you don't hit...

Get Dataset Items

Exports the structured JSON data from a completed Apify dataset, supporting large...

Get Key Value Store

Retrieves miscellaneous files related to a run, like screenshots or configuration...

Get Run

Checks the status and metadata of an active scrape job so you know if it's still...

List Actors

Shows all scraper bots available in your account, including their IDs and default settings for triggering a run.

List Webhooks

Lists the external systems that get notified when an actor run succeeds or fails.

Push To Queue

Tells a currently running scraper to add new URLs it discovers to its list of pages...

Run Actor Sync

Runs a short-lived scraper and makes your agent wait until it finishes before giving...

Run Actor

Starts an Apify scraper bot in the background using custom input settings, returning...

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so no custom routing is required.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The Apify integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "apify": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Apify tools with full Vinkius guardrails applied.

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"apify": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

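Use MultiServerMCPClient from the LangChain MCP adapters to connect to the Vinkius managed endpoint:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius (assumes the streamable HTTP transport)
client = MultiServerMCPClient({
    "apify": {
        "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
        "transport": "streamable_http",
    }
})
tools = asyncio.run(client.get_tools())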

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

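from crewai import Agent
from crewai_tools import MCPServerAdapter

# Connect securely to Vinkius (assumes the streamable HTTP transport)
server_params = {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable-http",
}

with MCPServerAdapter(server_params) as vinkius_tools:
    # Assign to Agent; goal and backstory are required fields
    researcher = Agent(
        role='Data Researcher',
        goal='Collect structured web data with Apify',
        backstory='An analyst who runs and monitors scraping jobs.',
        tools=vinkius_tools,
    )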

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on every call
Real time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Apify, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Apify. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on every call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 10 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

Manually collecting web data is a grind.

Right now, if you need product listings from 50 different pages, you're sitting here clicking through tabs: copy the URL, paste it into your spreadsheet, manually adjust proxy settings, and then run the whole thing. It’s slow, prone to breakages, and frankly, exhausting.

With this MCP, that whole process disappears. You tell your agent what data fields you need, and the bot handles the entire traversal. The result? Structured JSON records ready for review as soon as the run finishes. That's a huge difference.

Apify MCP lets you manage web scraping runs.

Before this, if a scrape failed halfway through, your only option was to manually restart it and hope nothing broke. You had no easy way to tell the process, 'Hey, stop; we got enough data,' or 'Wait, crawl these 10 more specific links.'

Now you can control the entire lifecycle: start with `run_actor`, monitor status with `get_run`, and then dynamically feed it new URLs using `push_to_queue`. You're in charge of the bot.

Support 24/7 support@vinkius.com ↗

Security Vinkius Trust Center ↗

SLA Service Level Agreement ↗

Report Listing Send Report ↗

What your AI can actually do with this

This connector lets you direct complex data extraction workflows entirely through chat. Instead of setting up API keys and running scripts locally, your agent talks to the Apify system—it finds the right scraper bot, runs it, monitors its progress, and pulls the resulting structured datasets into your context window. You can even tell the running process to crawl new pages you discover mid-job.

Because Vinkius hosts this MCP in the catalog, you just connect once from any compatible client, giving you immediate access to run these sophisticated web tasks without touching a command line.

Built · Hosted · Managed by Vinkius Apify-MCP - Web Scraping Automation & Data Extraction

Server ID 019d754f-450f-733b-ad5f-50479ed751b0

Vinkius Inspector

Compliance Grade A+

Score 98.33/100

Report View Report ↗

What Changes When You Connect

You bypass writing boilerplate Python. Instead of scripting the run_actor call and a get_run polling loop yourself, you just ask your agent to check the status, making the whole process conversational.

Data retrieval is simple: once the scrape finishes, use get_dataset_items to pull structured data directly into the chat context. Large JSON exports can be fetched page by page.

You maintain control over expensive jobs. If a scraper starts going wild or runs past its usefulness, you can hit 'stop' using abort_run, saving compute units and time.

The system is resilient because you don't have to manually manage state. The agent tracks the job ID from run_actor and knows when to query for results using get_dataset_items.

It helps with governance too. Before you start anything huge, check get_account_limits. You don't want a runaway scrape wiping out your budget because you forgot about it.

See it in action

01

Monitoring competitor pricing changes

A market researcher needs to track product prices daily. They use list_actors to find the correct price scraper, then use run_actor for a scheduled job. When the results arrive, they pass them through the agent and ask it to format the latest data into a summary table.

02

Crawling deeply linked product catalogs

An AI developer needs to scrape an entire website section that involves clicking 'next page' buttons. They use run_actor and then follow up by calling push_to_queue, telling the running process exactly which newly found URLs it must crawl.

03

Checking for data completeness

A data engineer runs a scrape and is worried about missing metadata. They use get_key_value_store to pull down any attached screenshots or configuration files from the job, ensuring they have all the necessary audit details.

04

Verifying service health after failure

A client runs a large scrape and it fails. They use get_run first to see the exact error status, then check list_webhooks to confirm if external systems were supposed to get notified about the failure.

The honest tradeoffs

Treating data as a single API call

Anti-pattern

Trying to pull all data by just calling get_dataset_items without thinking about pagination. You might get an error or, worse, only get the first 100 records.

The Fix

Always check your dataset needs and use the limit/offset parameters in get_dataset_items. If you need a massive pull, fetch it in batches, and use get_run to confirm the run has finished before you start reading.
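The limit/offset advice can be sketched as a small paging loop. This is a minimal sketch, not the connector's actual interface: `fetch_page` is a hypothetical callable standing in for however your agent or script invokes get_dataset_items, and the `offset` and `limit` argument names are assumed from the description above.

```python
from typing import Callable, Iterator, List


def iter_dataset_items(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = 100,
) -> Iterator[dict]:
    """Yield every record by walking the dataset one page at a time.

    fetch_page(offset, limit) is a hypothetical stand-in for a
    get_dataset_items call; it must return a list of records.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            break  # an empty page means the dataset is exhausted
        yield from page
        if len(page) < page_size:
            break  # a short page is the last page
        offset += len(page)


# Usage with an in-memory fake instead of a live dataset
fake_dataset = [{"id": i} for i in range(250)]


def fake_fetch(offset: int, limit: int) -> List[dict]:
    return fake_dataset[offset:offset + limit]


items = list(iter_dataset_items(fake_fetch, page_size=100))
print(len(items))
```

Stopping on a short or empty page means the loop never needs to know the total record count up front.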

Starting big jobs without checking limits

Anti-pattern

Launching an intensive scraper that consumes 50 Compute Units, only to find out later via billing that you were already at 90% capacity. Wasted cycles.

The Fix

Always start with get_account_limits. Check your compute unit usage and proxy bandwidth before launching any major scrape using run_actor.
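A pre-flight check like this can be reduced to one comparison. The sketch below is illustrative only: the argument names (`used_cu`, `max_cu`, `estimated_cu`) and the 10% safety margin are assumptions, and you would fill them from whatever get_account_limits reports for your plan.

```python
def has_headroom(
    used_cu: float,
    max_cu: float,
    estimated_cu: float,
    safety_margin: float = 0.1,
) -> bool:
    """Return True if the job fits in the plan with a margin left over.

    All values are compute units. safety_margin is the fraction of the
    plan to keep in reserve (an assumed default, tune it to taste).
    """
    if max_cu <= 0:
        return False
    budget = max_cu * (1.0 - safety_margin)
    return used_cu + estimated_cu <= budget


# The anti-pattern above: a 50 CU job on an account already at 90%
print(has_headroom(used_cu=900, max_cu=1000, estimated_cu=50))

# The same job early in the billing cycle
print(has_headroom(used_cu=100, max_cu=1000, estimated_cu=50))
```

Keeping a margin in reserve leaves room for a run that consumes more than its estimate.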

Ignoring the job status

Anti-pattern

Triggering a complex multi-page crawl using run_actor and then assuming the data is ready immediately, leading to an empty dataset return.

The Fix

Immediately follow up the run_actor call with get_run. Use this tool repeatedly until the status shows 'SUCCEEDED' before attempting to retrieve data via get_dataset_items.
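The polling step can be sketched as a bounded loop. This is a rough sketch under stated assumptions: `get_status` is a hypothetical callable that wraps a get_run call and returns the run's status string, and the poll interval and attempt cap are arbitrary defaults.

```python
import time
from typing import Callable

# Terminal run statuses used by Apify; anything else means "keep waiting".
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "TIMED-OUT", "ABORTED"}


def wait_for_run(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the run reaches a terminal status and return that status."""
    for attempt in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"run still not finished after {max_polls} polls")


# Usage with a scripted fake instead of a live run
scripted = iter(["READY", "RUNNING", "RUNNING", "SUCCEEDED"])
final_status = wait_for_run(lambda: next(scripted), sleep=lambda _s: None)
print(final_status)
```

Only call get_dataset_items when the returned status is 'SUCCEEDED'; the other terminal statuses mean the dataset may be partial or empty.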

When It Fits, When It Doesn't

Use this MCP if your goal is extracting structured web content from diverse sources and you need full control over the scraping lifecycle. It’s perfect for iterative, complex tasks (like crawling a site using push_to_queue) that can't be solved by simple API calls.

Don't use it if: 1) You are only retrieving data from a known source database (use a dedicated database connector). 2) Your web scraping logic is extremely complex and requires custom, non-standard browser interactions (you might need to write the wrapper yourself).

If you just need to list available scrapers, use list_actors. If you only care about checking if your budget allows another scrape, stick with get_account_limits. The power of this MCP is in managing the entire state—from initiation to retrieval and cleanup.

Questions you might have

How do I list all available scrapers using Apify MCP? +

You use the list_actors tool. This shows you every scraper bot (Actor) you have access to, giving you their IDs so you know exactly what job they're built for.

What is the difference between run_actor and run_actor_sync? +

The difference is timing. run_actor starts a background process, which is best for long jobs because it returns immediately. run_actor_sync blocks your agent until the job finishes; use this only for very short tasks (under five minutes).
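As a rule of thumb, the choice comes down to one threshold. The helper below is a hypothetical illustration, not part of the connector: it takes your own estimate of the job's duration and applies the five-minute guideline from the answer above.

```python
SYNC_LIMIT_SECONDS = 5 * 60  # the "under five minutes" guideline


def choose_run_tool(expected_seconds: float) -> str:
    """Pick the tool name based on how long the job is expected to take."""
    if expected_seconds < SYNC_LIMIT_SECONDS:
        return "run_actor_sync"  # short job: block and get results directly
    return "run_actor"  # long job: start in the background, then poll


print(choose_run_tool(30))
print(choose_run_tool(3600))
```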

How do I get the final data from an Apify run? +

After a scrape is complete, use get_dataset_items. This pulls all the structured records and provides them to your agent in usable JSON format.

Can I stop a running scrape with abort_run? +

Yep. You can call abort_run anytime you need to halt a job, which is critical if the scraper starts pulling junk data or exceeds your budget.

How do I check my compute unit usage using get_account_limits? +

It immediately reports your current consumption against your subscription cap. This tool monitors both compute units and proxy bandwidth, helping you prevent unexpected overage charges on large scraping jobs.

What information does get_run provide about an active scraping job? +

The tool reports the run's current status, metadata, and consumption details. You can poll it to track whether a long-running scrape is still running or has successfully completed.

When should I use get_key_value_store instead of getting dataset items? +

Use it for non-structured files like screenshots, configuration inputs, or raw HTML snapshots. The key-value store holds arbitrary binary and text data linked to a specific run ID.

How does list_webhooks help with automated workflows? +

This tool lists all configured webhooks, which enable external systems to react when an actor run succeeds or fails. It is essential for building reliable, event-driven architectures.

How can the AI agent run a scrape on a list of product URLs? +

First, find your specific scraping Actor ID via list_actors. Then, prompt your agent to execute run_actor, providing the target URLs formatted as a structured JSON input payload. It returns a 'Run ID'. You can poll this run via get_run, and once it succeeds, the agent calls get_dataset_items to pull all acquired data straight to your window.
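The "structured JSON input payload" might look like the sketch below. The `startUrls` key with a list of `{"url": ...}` objects is an assumption: each Actor defines its own input schema, so check the fields of the Actor you found via list_actors before relying on this shape.

```python
import json
from typing import Iterable


def build_run_input(urls: Iterable[str]) -> dict:
    """Build a run_actor input payload from raw URL strings.

    The "startUrls" key is an assumed field name; the Actor you run
    may expect a different one.
    """
    start_urls = [{"url": u.strip()} for u in urls if u.strip()]
    if not start_urls:
        raise ValueError("at least one URL is required")
    return {"startUrls": start_urls}


payload = build_run_input([
    "https://example.com/products/1",
    "  https://example.com/products/2  ",
    "",  # blank entries are dropped
])
print(json.dumps(payload, indent=2))
```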

Can the agent interact with run configurations mid-way during crawling? +

Yes. If an Apify crawler is currently executing and utilizes a Request Queue, you can instruct your agent to call push_to_queue. Doing so dynamically ships new URLs to the active queue instance, extending the current web crawl without needing to stop or restart the Actor.
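Before pushing, it helps to filter the discovered links so the crawl does not wander off-site or revisit pages. This is a minimal sketch with hypothetical names: `select_new_urls` is not part of the connector, and the single-host restriction is just one possible policy.

```python
from typing import Iterable, List
from urllib.parse import urlparse


def select_new_urls(
    discovered: Iterable[str],
    already_queued: Iterable[str],
    allowed_host: str,
) -> List[str]:
    """Return discovered URLs worth sending to push_to_queue.

    Keeps only URLs on allowed_host that are not queued yet, and
    drops repeats within the discovered list itself.
    """
    queued = set(already_queued)
    fresh = []
    for url in discovered:
        if urlparse(url).netloc != allowed_host:
            continue  # off-site link
        if url in queued:
            continue  # already crawled or queued
        queued.add(url)
        fresh.append(url)
    return fresh


to_push = select_new_urls(
    discovered=[
        "https://example.com/p/2",
        "https://example.com/p/1",
        "https://other.test/x",
        "https://example.com/p/2",
    ],
    already_queued=["https://example.com/p/1"],
    allowed_host="example.com",
)
print(to_push)
```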

Can my AI automatically detect scraping timeouts and debug the failure? +

Absolutely. Because your agent can track real execution flows with get_run, it knows when a run transitions to the TIMED-OUT or FAILED state. You can then ask the agent to examine the run's key-value store outputs with get_key_value_store to help pin down the underlying issue (e.g. a CAPTCHA block or a blocked proxy).

MCP Servers to Build AI Training Datasets

You need a dataset of 10,000 product listings for your RAG system, but there is no API. Apify scrapes them, Chroma stores them as searchable embeddings, and Notion tracks every data source with quality scores.

