# Microlink MCP

> Microlink lets your AI agent treat the web like a database. It extracts metadata, captures high-fidelity screenshots, generates clean PDFs from any URL, and runs technical audits against websites. This server turns unstructured web content into programmable data points for your workflow.

## Overview
- **Category:** developer-tools
- **Price:** Free
- **Tags:** metadata-extraction, screenshot-api, pdf-generation, lighthouse-reports, web-data

## Description

You'll use this server to make your AI client treat any website like it's a database. It turns unstructured web content into data points you can actually program against. Point the agent at a URL, and it returns metadata, screenshots, PDFs, or audit reports, depending on the tool you call.

When you need structured information, `extract_metadata` fetches specific data fields from any link you provide. It grabs things like the page's normalized title, its description, social media OpenGraph images, and author details. You can pull these distinct pieces of web data into your workflow without having to scrape anything yourself.

Need a visual record? `take_screenshot` captures an image of the website for you. You aren't limited to a plain viewport capture: you can ask for a full-page view or emulate a specific device, like an iPhone. This lets you capture assets that look the way they'll appear to a user on that device.

For formal reports, `generate_pdf` takes the URL and spits out a formatted PDF document. You can specify custom scaling and margins for this output, making it ready to archive or present immediately. It handles the messy web layout and gives you a clean file instead.

To check a site's health, use `get_insights`. This tool runs technical audits against a URL and returns Lighthouse performance scores. It also identifies the underlying technology stack, so you know what's powering the page.

For deep automation, `advanced_query` lets your agent run highly controlled API queries directly against a URL. This is where you go when standard fetching isn't enough. You can customize every parameter for advanced scripting tasks, such as enabling ad-blocking rules or running custom JavaScript on the live page. It gives you full control over how your agent interacts with the site's content.

Your AI client handles these steps: it takes a URL as input and uses one of the five tools—`extract_metadata`, `take_screenshot`, `generate_pdf`, `get_insights`, or `advanced_query`—to execute the specific function. It doesn't guess; it performs the exact action you tell it to perform on that site.

You don't write code for the scraping; you just tell your agent which data point you need, and it handles the connection via Vinkius. You send a single request, and this server returns structured JSON data, an image file, or a ready-to-print PDF, depending on which tool you call.

## Tools

### advanced_query
Runs a highly controlled API query against a URL, allowing full parameter customization for advanced scripting tasks.

### extract_metadata
Retrieves structured data points—like titles, descriptions, and OpenGraph images—from any specified web address.

### generate_pdf
Creates a formatted PDF document from the content of a given website URL, supporting custom scaling and margins.

### get_insights
Performs technical audits on a URL, providing performance scores (Lighthouse) and identifying underlying web technologies.

### take_screenshot
Captures an image of the website. You can select full-page views or emulate specific mobile devices.

## Prompt Examples

**Prompt:** 
```
Extract the metadata and social image from https://microlink.io
```

**Response:** 
```
I've extracted the metadata for Microlink. The title is 'Microlink — Enter the browser as a service', and I've found the primary OpenGraph image and description for you.
```

**Prompt:** 
```
Take a full-page screenshot of https://github.com emulating an iPhone 15.
```

**Response:** 
```
I've generated a full-page screenshot of GitHub using an iPhone 15 viewport. You can view the mobile-optimized layout in the generated image.
```

**Prompt:** 
```
Run a Lighthouse performance audit for https://vercel.com
```

**Response:** 
```
The Lighthouse report for Vercel is ready. It shows high scores in Performance and Best Practices, with specific metrics for First Contentful Paint and Speed Index.
```

## Capabilities

### Extracting Structured Metadata
The agent fetches and formats specific web data—titles, descriptions, social media images, and author details—from any provided link.

### Capturing Visual Assets
The agent captures a screenshot of the website. You can request a full-page view or emulate a specific device (like an iPhone).

### Generating Formal Documents
The agent takes a web URL and produces a clean, formatted PDF document suitable for reports or archives.

### Running Technical Audits
You run `get_insights` to identify the underlying technology stack and generate Lighthouse performance scores for a given site.

### Executing Complex Queries
The agent performs advanced browser actions using `advanced_query`, allowing you to enable ad-blocking or run custom JavaScript on a URL.

## Use Cases

### Debugging Site Performance
A developer suspects that a new marketing landing page renders correctly on mobile devices but loads slowly on desktop. They ask their agent to run `get_insights` first for performance metrics, then call `take_screenshot` with a phone viewport emulated so they can visually confirm the layout.

### Competitive Content Gathering
A content strategist needs metadata from 50 competitor links. Instead of visiting each page, they tell their agent to loop through the URLs and call `extract_metadata` for every single one, gathering standardized titles and descriptions into a spreadsheet.
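
The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the server's actual client code: `call_tool` is a hypothetical callable standing in for whatever your MCP client exposes, and the `title` and `description` keys in the result are assumed from the fields this document says `extract_metadata` returns.

```python
import csv
import io
from typing import Callable, Iterable


def metadata_to_csv(urls: Iterable[str], call_tool: Callable[[str, dict], dict]) -> str:
    """Call extract_metadata once per URL and return the results as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url", "title", "description"])
    for url in urls:
        result = call_tool("extract_metadata", {"url": url})
        # Missing fields become empty cells instead of raising KeyError.
        writer.writerow([url, result.get("title", ""), result.get("description", "")])
    return buffer.getvalue()


# Stand-in for a real MCP client so the sketch runs offline (hypothetical).
def fake_call_tool(name: str, arguments: dict) -> dict:
    return {"title": f"Title for {arguments['url']}", "description": "Example"}


print(metadata_to_csv(["https://example.com"], fake_call_tool))
```

Passing the tool caller in as a parameter keeps the batching logic independent of any particular client library, so you can swap the fake for a real one without touching the loop.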

### Automating Quarterly Reports
A data analyst has 10 quarterly report links. They instruct their agent to run `generate_pdf` on each URL. The agent compiles all 10 PDFs into one zipped folder for easy distribution.
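
The zipping step happens on the agent side, not inside `generate_pdf`. A minimal sketch of that step, assuming you already hold each PDF as raw bytes (the filenames and placeholder bytes below are made up for illustration):

```python
import io
import zipfile


def bundle_pdfs(pdfs: dict[str, bytes]) -> bytes:
    """Pack {filename: pdf_bytes} into one in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for filename, data in pdfs.items():
            archive.writestr(filename, data)
    return buffer.getvalue()


# Placeholder bytes stand in for the PDFs that generate_pdf would return.
archive_bytes = bundle_pdfs({"q1.pdf": b"%PDF-1.7 q1", "q2.pdf": b"%PDF-1.7 q2"})
```

Building the archive in memory avoids temporary files; write `archive_bytes` to disk only when you are ready to distribute it.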

### Testing Complex Web Interactions
A developer needs to test a feature that only appears when an ad blocker is active. They call `advanced_query` with ad-blocking enabled and confirm that the expected DOM elements are present on the page.

## Benefits

- **Predictable Data:** Stop scraping messy HTML. Use `extract_metadata` to get clean, structured fields (like OpenGraph titles) that your agent can immediately use in code.
- **Multi-Format Output:** Don't just get text. Need a report? Run `generate_pdf` for professional documents, or use `take_screenshot` when the visual context matters most.
- **Deep Technical Auditing:** Go beyond basic SEO checks. The `get_insights` tool gives you Lighthouse scores and tells you exactly what technologies are running on the target site.
- **Device Emulation:** You don't have to check a site on each device by hand. Use `take_screenshot` to capture a full-page visual of a site emulating an iPhone or Android viewport directly from your agent.
- **Scriptable Power:** When simple extraction fails, use `advanced_query`. This tool gives you the raw control—ad-blocking, custom scripts—to interact with any web page.
- **Workflow Efficiency:** By having all these tools in one place, you can build a single sequence: Query -> Extract Data -> Audit -> PDF Report. It keeps your workflow contained and fast.

## How It Works

You use your AI client to call a specific tool with a URL, and Microlink handles all the web processing behind the scenes.

1. Subscribe to the Microlink MCP Server and provide your API key (if needed).
2. Direct your AI client to pass a target URL and an explicit request (e.g., 'Run `get_insights` on this link').
3. The server executes the tool, returning structured data, image files, or PDFs directly to your agent.
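
Under the hood, an MCP client sends each tool invocation as a JSON-RPC `tools/call` message. The sketch below builds such a message for the five tools named in this document. The `url` argument name and any extra options are assumptions; check the tool schema your client reports before relying on them.

```python
import json

# Tool names come from this document; the "url" argument name is an assumption.
TOOLS = {
    "extract_metadata",
    "take_screenshot",
    "generate_pdf",
    "get_insights",
    "advanced_query",
}


def build_tool_call(tool: str, url: str, request_id: int = 1, **options) -> str:
    """Serialize an MCP `tools/call` request for one of the five tools."""
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": {"url": url, **options}},
    }
    return json.dumps(message)


request = build_tool_call("get_insights", "https://vercel.com")
```

In practice your AI client builds this message for you; the sketch only shows what "pass a target URL and an explicit request" amounts to on the wire.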

## Frequently Asked Questions

**How do I get structured metadata using extract_metadata?**
You simply pass the target URL to `extract_metadata`. The server returns normalized JSON containing titles, descriptions, and OpenGraph images. No parsing is required from your end.

**Can I take a screenshot of specific parts using take_screenshot?**
Yes. When calling `take_screenshot`, you can specify the exact CSS selector or DOM element ID you want to capture, rather than just the full page view.

**What is the difference between get_insights and advanced_query?**
`get_insights` provides standardized reports—the Lighthouse scores and tech stack info. `advanced_query` gives you raw control, letting you run custom scripts or implement ad-blocking for testing.

**How do I make a PDF report using generate_pdf?**
You pass the URL to `generate_pdf`. You can also specify formatting parameters like margins and scaling within the call, ensuring the final document meets your reporting standards.

**What happens if my calls to `advanced_query` exceed the free usage tier?**
You need an active API key for high-volume processing. If you hit a rate limit, your AI client receives an HTTP 429 (Too Many Requests) error, which means you must wait before retrying or upgrade your account.
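
If you script calls yourself, one common way to handle a 429 is to retry with exponential backoff. This is a generic sketch rather than anything the server prescribes: `RateLimited` is a hypothetical exception your own client wrapper would raise on HTTP 429, and the retry count and delays are arbitrary defaults.

```python
import time
from typing import Callable


class RateLimited(Exception):
    """Hypothetical error your client wrapper raises when it sees HTTP 429."""


def call_with_backoff(
    call: Callable[[], dict],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Retry `call` on RateLimited, doubling the wait after each failure."""
    if retries < 0:
        raise ValueError("retries must be >= 0")
    for attempt in range(retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == retries:
                raise  # Out of retries: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so you can test the retry logic without actually waiting.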

**How does `take_screenshot` handle multiple device viewports in one request?**
The tool processes one viewport per call. To get different views (e.g., iPhone and desktop), make a separate request for each screen size.
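
A small helper can expand a list of viewports into one call each. The `device` argument name is an assumption, as is the idea that omitting it yields the default desktop viewport; confirm both against the tool schema your client reports.

```python
from typing import Optional


def viewport_requests(url: str, devices: list[Optional[str]]) -> list[dict]:
    """Build one take_screenshot call per viewport."""
    requests = []
    for device in devices:
        arguments = {"url": url}
        if device is not None:  # None is treated here as the default viewport.
            arguments["device"] = device
        requests.append({"name": "take_screenshot", "arguments": arguments})
    return requests


# One call emulating an iPhone 15, one with no device emulation.
calls = viewport_requests("https://github.com", ["iPhone 15", None])
```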

**What is the maximum amount of content I can reliably include when running `generate_pdf`?**
While there isn't a strict character limit, very long or complex pages may cause formatting errors. We recommend breaking up massive documents into smaller sections for more reliable results.

**If the target URL is inaccessible when running `extract_metadata`, what error message do I receive?**
The system returns a standardized '403 Forbidden' or similar HTTP status code. This tells your AI agent immediately that the issue is access-related, not an API malfunction.

**Can I take a screenshot of a specific element on a page instead of the whole screen?**
Yes! Use the `take_screenshot` tool and provide a CSS selector in the `element` parameter. The agent will return a precise capture of just that component.

**How do I see what software or frameworks a website is built with?**
Use the `get_insights` tool with `technologies: true`. It uses Wappalyzer to identify the CMS, analytics, web servers, and JavaScript frameworks used by the URL.

**Can I generate a PDF in a specific paper format like A4?**
Absolutely. Use the `generate_pdf` tool and specify 'A4' in the `format` parameter. You can also adjust margins and orientation (landscape/portrait).
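
As a rough sketch, the arguments for such a call might be assembled like this. Only `format` is named in this document; the `landscape` and `margin` argument names and the list of accepted paper sizes are assumptions made for illustration.

```python
# Illustrative subset of paper sizes; the server may accept others (assumption).
ALLOWED_FORMATS = {"A3", "A4", "A5", "Letter", "Legal"}


def pdf_arguments(
    url: str,
    paper_format: str = "A4",
    landscape: bool = False,
    margin: str = "1cm",
) -> dict:
    """Build generate_pdf arguments, rejecting unknown paper formats early."""
    if paper_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported paper format: {paper_format}")
    return {
        "url": url,
        "format": paper_format,   # named in this document
        "landscape": landscape,   # hypothetical argument name
        "margin": margin,         # hypothetical argument name
    }


args = pdf_arguments("https://example.com", paper_format="A4", landscape=True)
```

Validating the format before the call fails fast on a typo instead of spending a request on it.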