# Apify MCP

> Apify runs web scraping actors that collect structured data at scale. This MCP lets your agent list available scrapers, trigger runs with custom inputs, and monitor execution health. You also get access to the resulting datasets: fetch specific records for immediate analysis, or inspect the logs and metadata of any single run.

## Overview
- **Category:** friends-mcp
- **Price:** Free
- **Tags:** data-extraction, serverless-actors, web-automation, data-scraping, api-integration, structured-data

## Description

Managing web scraping used to mean juggling dozens of tabs, running complicated scripts locally, and praying your data structure remained consistent. Now, you just tell your agent what needs collecting. This MCP connects your entire workflow to Apify's platform, letting you manage the whole process through natural conversation.

Need to collect social media posts for a hashtag? Your agent can first check which scrapers are available and then trigger a run with specific parameters. Once that job finishes, instead of hunting for files, you simply ask your agent to retrieve the resulting data from the dataset. You get clean records (profile URLs, captions, engagement counts) ready for immediate analysis. If something goes wrong, you don't have to start over; you can query the run history and debug exactly where it failed. This centralized control keeps large-scale web automation simple, all through Vinkius.

## Tools

### get_dataset_results
Fetches the records from an existing dataset after a scrape completes.

### get_run_details
Retrieves metadata and logs for a single scraper run.

### list_actors
Retrieves a list of every available scraper actor in your Apify account.

### list_datasets
Shows all the datasets you have collected data into within your account.

### list_actor_runs
Lists the status, history, and identifiers of recent scraper executions.

### list_actor_tasks
Shows all saved task configurations so you can reuse complex inputs for new runs.

### run_actor
Initiates a new web scraping run for a specific actor using defined inputs.

## Prompt Examples

**Prompt:** 
```
List all actors in my Apify account.
```

**Response:** 
```
I've retrieved your actors. You have 3 active scrapers including 'Instagram Scraper', 'Google Maps Reviewer', and 'Custom CRM Sync'. Which one would you like to run or check datasets for?
```

**Prompt:** 
```
Run the 'Instagram Scraper' with input { "hashtags": ["#AI"] }.
```

**Response:** 
```
Actor run triggered! I've started the 'Instagram Scraper' (ID: act_10293) with your custom input. The execution ID is run_88231. I'll monitor it and let you know when the dataset is ready.
```

**Prompt:** 
```
Show me the results from dataset 'ds_10293'.
```

**Response:** 
```
I've fetched the dataset items. There are 25 records including profile URLs, post captions, and engagement counts. Would you like me to summarize the top performing posts?
```

## Capabilities

### Discover available scrapers
List every configured scraper actor in your account so you know what data sources exist.

### Trigger web scraping jobs
Start a new extraction run for an actor, passing custom inputs like hashtags or URLs to customize the scrape.

### Retrieve structured records
Fetch specific data items from a completed dataset so you can analyze them immediately.

### Check run status and history
View the detailed log or status of any past actor execution to debug failures or check completion times.

### Manage saved scraper settings
List configured task presets, letting you reuse complex scraping inputs without re-entering them.

## Use Cases

### Tracking competitor product changes
A marketing analyst needs to monitor price changes for 50 items daily. They first use `list_actors` to find the correct scraper, then run it with a list of URLs using `run_actor`. Finally, they ask the agent to pull all the updated prices from `get_dataset_results`, giving them an immediate comparison report.

### Debugging unreliable data collection
An engineering team runs a scraper but gets incomplete records. They use `list_actor_runs` to find the run ID, then call `get_run_details`. The logs reveal that the failure was caused by missing required headers, so they can fix the actor configuration immediately.

### Building a repeatable research workflow
A researcher needs to scrape data for three different geographical areas. Instead of re-entering location parameters each time, they use `list_actor_tasks` to pull the saved templates and then run the actor three times with minimal manual input.

### Quickly assessing dataset scope
A product manager just finished a large scrape and needs to know how much data was collected. They use `list_datasets` first, then tell the agent to run `get_dataset_results` on that specific dataset ID to confirm record counts and field types.
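A rough sketch of the kind of scope check described above: once the agent has the dataset items, it can count them and record which value types appear under each field. The function name `summarize_items` and the sample field names are hypothetical, and the sketch assumes the items arrive as a list of JSON objects.

```python
from collections import defaultdict


def summarize_items(items):
    """Return the record count and the value types seen under each field."""
    field_types = defaultdict(set)
    for item in items:
        for field, value in item.items():
            # Store the Python type name, e.g. "str", "int", "NoneType".
            field_types[field].add(type(value).__name__)
    return {
        "count": len(items),
        "fields": {field: sorted(types) for field, types in field_types.items()},
    }


# Made-up records for illustration only.
sample_items = [
    {"url": "https://example.com/a", "likes": 10},
    {"url": "https://example.com/b", "likes": None},
]
summary = summarize_items(sample_items)
```

A field that reports more than one type (here `likes`) is a quick hint that some records are incomplete.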

## Benefits

- Skip the setup time. Instead of manually entering scraping parameters every time, you can use `list_actor_tasks` to pull saved configurations and reuse them for new jobs.
- Stop guessing if a scrape worked. By checking run history with `list_actor_runs`, you instantly see the status (success or failure) without diving into complex dashboards.
- Analyze data immediately. Using `get_dataset_results` lets your agent pull clean records, which you can then ask it to summarize or categorize right away.
- Isolate problems quickly. If a scrape fails, `get_run_details` gives you the exact logs and metadata needed to pinpoint whether the failure was due to an input issue or an external website change.
- Maintain visibility. You don't need to remember which data lives where; simply use `list_datasets` to get a clean overview of all your collected information.

## How It Works

Your agent handles the entire lifecycle (setup, execution, and retrieval) within a single conversation.

1. Subscribe to this MCP and provide your Apify API Token.
2. Use the agent to find or list a specific actor (scraper) you want to use.
3. Tell the agent to trigger the run, providing any necessary inputs.
4. After the run completes, ask the agent to retrieve the structured data from the resulting dataset.
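The steps above can be sketched as the sequence of MCP `tools/call` requests an agent might send on your behalf. The JSON-RPC envelope follows the MCP specification; the argument names `actor_id`, `run_id`, and `dataset_id` are assumptions for illustration, since this page only names the tools and the optional `input` object.

```python
import json


def tool_call(request_id, name, arguments=None):
    """Build one MCP `tools/call` request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


def workflow_requests(actor_id, actor_input, run_id, dataset_id):
    """Requests for the discover, run, check, and retrieve steps.

    In a real session, run_id and dataset_id come from earlier responses.
    Argument names other than `input` are hypothetical.
    """
    steps = [
        ("list_actors", {}),
        ("run_actor", {"actor_id": actor_id, "input": actor_input}),
        ("get_run_details", {"run_id": run_id}),
        ("get_dataset_results", {"dataset_id": dataset_id}),
    ]
    return [tool_call(i, name, args) for i, (name, args) in enumerate(steps, start=1)]


if __name__ == "__main__":
    requests = workflow_requests("act_10293", {"hashtags": ["#AI"]}, "run_88231", "ds_10293")
    for request in requests:
        print(json.dumps(request))
```

The IDs reuse the placeholder values from the prompt examples; check the tool schemas exposed by the server for the real argument names.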

## Frequently Asked Questions

**How do I see what scrapers are available using `list_actors`?**
You call `list_actors` to get a full roster of every scraper you've set up. This tells you which data sources your agent can access for scraping.

**What is the difference between `list_datasets` and `list_actors`?**
`list_actors` shows the tools (the scrapers) you use to collect data. `list_datasets` shows where the collected results are stored after a scrape completes.

**How can I check if my scraper run actually finished?**
You need to use `list_actor_runs`. This tool gives you the history and status of every job, letting you confirm when it's safe to pull data.

**I want to analyze data from a failed scrape. Do I use `get_run_details`?**
Yes, `get_run_details` is the right tool. It pulls specific logs and metadata for that run, telling you *why* it failed, which is more useful than just seeing 'failed' in a list.

**Can I reuse scrape settings?**
Yes. Use `list_actor_tasks` to view saved task configurations, then reuse their inputs when calling `run_actor` instead of re-entering parameters.

**When I use `get_dataset_results`, how do I filter for specific record types or dates?**
You pass filters directly with your request, specifying criteria such as date ranges or field values in the query parameters. This keeps the result set focused on the data you need.
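If a request returns more records than you need, the same narrowing can also be done after fetching. The sketch below is one way to do that on the client side; `filter_items` and the `scrapedAt` field are hypothetical names, and it assumes timestamps are ISO 8601 strings that carry a `Z` or an explicit offset.

```python
from datetime import datetime, timezone


def _parse_timestamp(text):
    # Replace a trailing "Z" so older Python versions can parse the value.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def filter_items(items, since=None, **equals):
    """Keep items scraped at or after `since` whose fields equal the given values.

    `since` must be timezone-aware; `scrapedAt` is a hypothetical field name.
    """
    kept = []
    for item in items:
        if since is not None:
            stamp = item.get("scrapedAt")
            if stamp is None or _parse_timestamp(stamp) < since:
                continue
        if all(item.get(field) == value for field, value in equals.items()):
            kept.append(item)
    return kept


# Made-up sample records for illustration only.
sample = [
    {"type": "post", "scrapedAt": "2024-01-01T00:00:00Z"},
    {"type": "post", "scrapedAt": "2024-02-01T00:00:00Z"},
    {"type": "reel", "scrapedAt": "2024-02-02T00:00:00Z"},
]
cutoff = datetime(2024, 1, 15, tzinfo=timezone.utc)
recent_posts = filter_items(sample, since=cutoff, type="post")
```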

**When I use `run_actor`, what format must my input data be in?**
You must provide inputs as structured JSON objects that match the actor's required schema. Passing simple text usually fails; proper JSON ensures the scraper gets exactly the parameters it needs.
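As a minimal sketch of that requirement, the helper below rejects plain text and non-object JSON before anything is sent to `run_actor`. The name `parse_actor_input` is hypothetical, and the check covers only the JSON shape; it does not validate against a particular actor's schema.

```python
import json


def parse_actor_input(raw):
    """Parse raw text into a JSON object suitable for an actor input.

    Raises ValueError if the text is not valid JSON or is not a JSON object.
    """
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"input is not valid JSON: {exc}") from exc
    if not isinstance(value, dict):
        raise ValueError("actor input must be a JSON object")
    return value


# The input from the prompt example parses cleanly.
hashtag_input = parse_actor_input('{ "hashtags": ["#AI"] }')
```

Passing `"scrape the #AI hashtag"` or `'["#AI"]'` raises `ValueError`, which is the failure mode the answer above describes.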

**Can I use `list_actor_runs` to check an actor's historical performance or reliability?**
Yes. Run summaries and statuses let you track average completion time or spot patterns of failures and slow execution across many runs, which is useful for capacity planning.
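A short sketch of how such a summary could be computed from a list of runs. It assumes each run is a dict with `status`, `startedAt`, and `finishedAt` keys, as in Apify's run objects; the exact shape returned by `list_actor_runs` through this MCP is an assumption, and `summarize_runs` is a hypothetical helper.

```python
from datetime import datetime


def _parse_timestamp(text):
    # Replace a trailing "Z" so older Python versions can parse the value.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def summarize_runs(runs):
    """Return the run count, failure rate, and mean duration in seconds."""
    durations = []
    failed = 0
    for run in runs:
        if run.get("status") == "FAILED":
            failed += 1
        started, finished = run.get("startedAt"), run.get("finishedAt")
        # Runs that are still in progress have no finish time and are skipped.
        if started and finished:
            delta = _parse_timestamp(finished) - _parse_timestamp(started)
            durations.append(delta.total_seconds())
    return {
        "runs": len(runs),
        "failure_rate": failed / len(runs) if runs else 0.0,
        "mean_duration_s": sum(durations) / len(durations) if durations else None,
    }


# Made-up run summaries for illustration only.
history = [
    {"status": "SUCCEEDED", "startedAt": "2024-01-01T00:00:00.000Z", "finishedAt": "2024-01-01T00:01:00.000Z"},
    {"status": "FAILED", "startedAt": "2024-01-02T00:00:00.000Z", "finishedAt": "2024-01-02T00:00:30.000Z"},
]
stats = summarize_runs(history)
```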

**Can I provide input parameters when running an actor?**
Yes! Use the `run_actor` tool and provide the optional `input` JSON object to configure specific scraper settings for that run.

**How do I see the items collected in a dataset?**
Call `get_dataset_results` with your dataset ID. The agent retrieves the records, which you can then ask it to summarize or analyze.

**Is it possible to check the status of a specific actor run?**
Absolutely. Use the `get_run_details` tool and provide the run ID. Your agent will retrieve the status (for example RUNNING, SUCCEEDED, or FAILED) and metadata for that specific execution.
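When a run is still in progress, the status check is usually repeated until the run settles. The sketch below shows one possible polling loop; `wait_for_run` is a hypothetical helper, `get_status` stands in for whatever call returns the run's status string, and the terminal set adds Apify's `TIMED-OUT` and `ABORTED` statuses to the two named above.

```python
import time

# Statuses after which a run will not change again.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "TIMED-OUT", "ABORTED"}


def wait_for_run(get_status, run_id, interval_s=5.0, max_checks=60, sleep=time.sleep):
    """Poll get_status(run_id) until a terminal status or the check budget runs out."""
    status = None
    for attempt in range(max_checks):
        status = get_status(run_id)
        if status in TERMINAL_STATUSES:
            return status
        # Skip the pause after the final check.
        if attempt < max_checks - 1:
            sleep(interval_s)
    raise TimeoutError(f"run {run_id} still {status} after {max_checks} checks")


# Simulated status sequence for illustration; no network call is made.
_fake_statuses = iter(["RUNNING", "RUNNING", "SUCCEEDED"])
final_status = wait_for_run(lambda run_id: next(_fake_statuses), "run_88231", sleep=lambda s: None)
```

Injecting `get_status` and `sleep` keeps the loop testable without a live account.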