# DocRaptor MCP

> DocRaptor MCP converts raw HTML, web URLs, or structured data into high-fidelity PDF, XLS, and XLSX files. It uses the Prince XML engine for accurate CSS rendering, letting your AI agent generate professional documents without complex local setup. You can process large jobs asynchronously and create temporary hosted links for immediate sharing.

## Overview
- **Category:** productivity
- **Price:** Free
- **Tags:** pdf-generation, excel-export, html-to-pdf, princexml, document-automation

## Description

This MCP lets your AI client turn web pages or raw HTML strings into print-ready documents. Whether you pull a report from a dashboard or take content from an article, it converts the HTML into PDFs or structured Excel files. Rendering goes through the Prince XML engine, so CSS-driven print features such as headers, footers, and page numbers come out as the stylesheet specifies. For large jobs, you don't have to wait on a single request; you can start background processing and check progress later. Vinkius hosts this MCP in its catalog, so setup is limited to connecting your preferred AI client, after which you can request documents with a single prompt.

## Tools

### create_document
Generates a PDF or Excel document from HTML content, allowing you to set the output as hosted and asynchronous for large files.
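
The listing does not document the tool's argument schema, so the following is only a sketch of how the arguments for a `create_document` call might be assembled and validated before the agent sends them. The argument names (`document_type`, `document_content`, `document_url`, `hosted`, `async`, `test`) and the helper `build_create_document_args` are assumptions for illustration, not the confirmed schema.

```python
from typing import Any, Dict, Optional

# Output formats named in this listing.
ALLOWED_TYPES = {"pdf", "xls", "xlsx"}


def build_create_document_args(
    document_type: str,
    html: Optional[str] = None,
    url: Optional[str] = None,
    hosted: bool = False,
    run_async: bool = False,
    test: bool = True,
) -> Dict[str, Any]:
    """Assemble hypothetical arguments for a create_document tool call."""
    if document_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported document_type: {document_type!r}")
    # Exactly one source must be given: raw HTML or a URL.
    if (html is None) == (url is None):
        raise ValueError("provide exactly one of html or url")
    args: Dict[str, Any] = {
        "document_type": document_type,  # assumed argument name
        "hosted": hosted,                # assumed argument name
        "async": run_async,              # assumed argument name
        "test": test,                    # assumed argument name
    }
    if html is not None:
        args["document_content"] = html  # assumed argument name
    else:
        args["document_url"] = url       # assumed argument name
    return args
```

Validating the source and format up front keeps an obviously malformed request from ever reaching the service.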

### get_document_status
Checks whether an async document generation job has finished processing.
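
A polling loop around this tool could look roughly like the sketch below. The `get_status` callable stands in for whatever invokes `get_document_status`, and the `status` field with the values `"completed"` and `"failed"` is an assumption about the response shape rather than a documented contract.

```python
import time
from typing import Any, Callable, Dict


def wait_for_document(
    get_status: Callable[[str], Dict[str, Any]],
    status_id: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job completes, fails, or the attempt budget runs out."""
    for attempt in range(max_attempts):
        result = get_status(status_id)
        state = result.get("status")  # assumed field name
        if state == "completed":
            return result
        if state == "failed":
            raise RuntimeError(f"document job {status_id} failed: {result}")
        # Any other state is treated as "still in progress".
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"document job {status_id} not finished after {max_attempts} checks")
```

Passing `sleep` as a parameter makes the loop easy to exercise without real delays, and the attempt cap keeps a stuck job from blocking the agent indefinitely.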

### list_documents
Fetches a list of the documents previously generated by the service.

### list_ips
Retrieves the IP addresses used by DocRaptor for secure network verification and asset fetching.
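
One way to use the result is to check incoming asset requests against the returned addresses. The sketch below assumes `list_ips` yields plain IP strings or CIDR ranges; the addresses in the usage example are reserved documentation ranges, not real DocRaptor IPs.

```python
import ipaddress
from typing import Iterable


def is_allowed(source_ip: str, allowed: Iterable[str]) -> bool:
    """Return True if source_ip matches any allowed address or CIDR range."""
    addr = ipaddress.ip_address(source_ip)
    # A bare address such as "203.0.113.10" parses as a single-host network.
    return any(
        addr in ipaddress.ip_network(entry, strict=False) for entry in allowed
    )
```

For example, `is_allowed("198.51.100.77", ["203.0.113.10", "198.51.100.0/24"])` evaluates to `True`, while an address outside both entries is rejected.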

## Prompt Examples

**Prompt:** 
```
Create a test PDF document from the URL https://example.com and make it hosted.
```

**Response:** 
```
I've initiated the document creation. Since it's a test, it will have a watermark. You can download your hosted PDF here: https://docraptor.com/download/example-link
```

**Prompt:** 
```
Check the status of my async document with ID 'status_12345'.
```

**Response:** 
```
The document generation is complete! It resulted in a 3-page PDF. You can download it using the provided URL.
```

**Prompt:** 
```
List the last 5 documents generated in my account.
```

**Response:** 
```
I've retrieved your recent documents. Here are the last 5: 'Invoice_Jan.pdf', 'Report_Q4.xlsx', 'Test_Layout.pdf', etc. Would you like details on any of these?
```

## Capabilities

### Generate Documents from HTML
Creates polished PDFs or Excel spreadsheets directly from raw HTML code or public URLs.

### Manage Background Jobs
Checks the status of large document generation jobs that run in the background, which helps avoid timeouts in the AI client.

### Access Past Documents
Retrieves a list of all documents previously created through the MCP for management or auditing.

### Secure IP Discovery
Lists the specific IP addresses used by the service, which is useful for secure network integration and asset fetching.

## Use Cases

### Generating Quarterly Financial Reports
The operations team needs to convert the complex data from three different internal web dashboards into a single PDF report. They ask their agent to use `create_document` on the combined HTML, ensuring all charts and headers look professional and are ready for executive review.

### Automating Client Invoicing
A developer needs to generate dozens of invoices daily from a database feed. Instead of writing complex API calls for every file, they use `create_document` with the hosted option, getting immediate, temporary download links for each client.
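
As a rough illustration of the step before `create_document`, each database record could be rendered to an HTML string that the agent then passes to the tool. The record shape (`number`, `items`, `description`, `amount`) and the function name are hypothetical.

```python
from html import escape
from typing import Any, Dict


def render_invoice_html(invoice: Dict[str, Any]) -> str:
    """Render one invoice record as a small HTML fragment."""
    rows = "".join(
        # Escape free-text fields so they cannot break the markup.
        f"<tr><td>{escape(str(item['description']))}</td>"
        f"<td>{item['amount']:.2f}</td></tr>"
        for item in invoice["items"]
    )
    total = sum(item["amount"] for item in invoice["items"])
    return (
        f"<h1>Invoice {escape(str(invoice['number']))}</h1>"
        f"<table>{rows}</table>"
        f"<p>Total: {total:.2f}</p>"
    )
```

Escaping the text fields matters here because invoice descriptions come from a data feed and may contain characters such as `&` or `<`.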

### Archiving Web Content
A content analyst wants to archive a lengthy article from a public URL. They prompt their agent to run `create_document` on the URL and request an XLSX output. This captures not just the text, but any structured data tables found within the article.

### Handling Large Compliance Submissions
A compliance officer needs a 50-page document compiled from multiple sources. They use `create_document` with async enabled, and the agent polls `get_document_status` until the file is ready for download.

## Benefits

- High-fidelity rendering. Because this MCP uses the Prince XML engine, generated reports retain complex CSS layouts, including headers, footers, and page numbering, which basic PDF converters often break.
- Handles large jobs. Use asynchronous processing to start document generation for large files, then check back later with `get_document_status` instead of running into time limits in your AI client.
- Saves storage hassle. When you use the hosted option, the MCP creates temporary public download links for your finished PDFs or Excel sheets, so you don't need to manage file storage yourself.
- Versatile export options. You get more than just static PDFs; you can generate structured XLSX and XLS files, which is crucial when sending data that needs further manipulation in a spreadsheet program.
- Full audit trail capability. With `list_documents`, your agent can retrieve a history of every document created, giving you an easy way to manage versions and track generated reports.

## How It Works

In short, you tell your agent what content needs converting, and this MCP handles the rendering and returns the finished file.

1. Subscribe to this MCP on Vinkius and enter your unique DocRaptor API key.
2. Send a prompt to your AI client asking it to create a document, providing the HTML or URL source material.
3. Wait for the generated result. For large jobs run asynchronously, check the status with `get_document_status` and retrieve the download link once the job has finished.
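
Under the hood, step 2 results in the AI client issuing an MCP tool call. The sketch below builds the JSON-RPC `tools/call` envelope defined by the Model Context Protocol; the contents of `arguments` are assumptions, since this listing does not publish the tool's schema, and in practice the client constructs this message for you.

```python
import json
from typing import Any, Dict


def tool_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


# Hypothetical arguments; the real schema may use different names.
message = tool_call_request(
    1,
    "create_document",
    {"document_type": "pdf", "document_url": "https://example.com", "test": True},
)
```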

## Frequently Asked Questions

**How does DocRaptor MCP handle huge files?**
It uses an asynchronous process. You initiate the job with `create_document` and then use `get_document_status` to poll for completion, which prevents time-out errors.

**Can I generate Excel sheets from a website URL using DocRaptor MCP?**
Yes. You can pass the public URL to `create_document`, and you have the option to output the result as an XLSX or XLS file, capturing structured data tables.

**Do I need to manage my own storage when using DocRaptor MCP?**
No. By enabling the hosted feature during creation, the MCP generates temporary, publicly accessible download links for your finished documents, so you don't manage any file storage.

**How do I verify network access when using DocRaptor MCP?**
You can call `list_ips` to retrieve the current IP addresses used by the service. This is useful for secure applications that need to allowlist specific IP addresses.

**What if my document generation fails? How do I check the error?**
Use `list_documents` to find the failed attempt in your history, then call `get_document_status` with the relevant ID to confirm its status.
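
A minimal sketch of that triage step, assuming `list_documents` returns entries with `id`, `name`, and `status` fields (these field names and the `"failed"` value are illustrative assumptions, not a documented response format):

```python
from typing import Any, Dict, List


def summarize_failures(documents: List[Dict[str, Any]]) -> List[str]:
    """Return one short line per document whose status is 'failed'."""
    lines = []
    for doc in documents:
        if doc.get("status") != "failed":  # assumed field and value
            continue
        name = doc.get("name", "<unnamed>")
        doc_id = doc.get("id", "<no id>")
        lines.append(f"{name} (id={doc_id})")
    return lines
```

Each ID in the summary can then be passed to `get_document_status` for a closer look.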