# Pdfcrowd MCP

> Pdfcrowd converts web pages, raw HTML, and PDFs between formats. Send a URL or an HTML string to get a high-quality PDF or a PNG/JPG screenshot; send a PDF to get editable HTML or plain text back. Use `generate_business_document` to build professional invoices and receipts from simple JSON records.

## Overview
- **Category:** productivity
- **Price:** Free
- **Tags:** pdf-generation, html-to-pdf, web-to-pdf, invoice-generator, document-conversion

## Description

You handle documents all the time, right? Web pages, PDFs, invoices: they always show up in the one format you can't use. This server lets your agent cut through that mess. You don't just convert files; you turn static content into data your agent can actually work with.

**Converting Web Content to Visual Assets:** When you send a URL or raw HTML string, the `convert_html_to_pdf` tool returns a high-quality PDF file with your formatting intact. If you need a picture instead of a document, use `convert_html_to_image`; it captures the page as a base64 encoded PNG or JPG screenshot.

**Deconstructing PDFs into Code-Ready Text:** You got an old PDF sitting there? The server can read it and give you two options. If you need the content editable in code, use `convert_pdf_to_html` to turn the file back into clean, workable HTML, returned as base64 encoded data. If you just want the plain words for quick analysis, with no tables and no formatting junk, run `convert_pdf_to_text`. It strips everything down and gives you only the text.

**Generating Structured Financial Documents:** Need to make an invoice or a receipt? You don't have to build templates. Just feed structured JSON into `generate_business_document`. It takes your line items, totals, and customer info and builds a professional PDF document for you, returned as base64 encoded data.

Your agent handles the whole process: it picks the right tool, say `convert_pdf_to_text`, passes the required input (a URL, an HTML string, or a JSON payload), and the server runs the conversion. The result comes back as base64 encoded data, whether that's a PDF file, an image, or plain text, ready for your client to decode and use.
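
For readers curious what a tool call looks like on the wire, here is a minimal sketch. MCP clients normally build this message for you. The envelope follows the MCP `tools/call` JSON-RPC shape, the helper name `build_tool_call` is hypothetical, and the PDF address is a placeholder.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request for the given tool."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# The `url` argument name comes from the FAQ; the address is a placeholder.
print(build_tool_call("convert_pdf_to_text", {"url": "https://example.com/report.pdf"}))
```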

## Tools

### convert_html_to_image
Captures a screenshot of a given URL or raw HTML string and returns it as base64 encoded image data (PNG/JPG).

### convert_html_to_pdf
Converts a specified web page URL or raw HTML string directly into a high-quality PDF file, returning the content as base64.

### convert_pdf_to_html
Takes an uploaded PDF document and transforms its contents back into editable HTML code, returning it as base64 encoded data.

### convert_pdf_to_text
Extracts all visible text from a PDF file and returns only plain, clean text data (base64 encoded).

### generate_business_document
Accepts structured JSON data payloads to automatically generate professional documents like invoices or receipts, returning them as base64 encoded PDFs.

## Prompt Examples

**Prompt:** 
```
Convert the web page https://example.com to a PDF in landscape orientation.
```

**Response:** 
```
I've initiated the conversion for https://example.com. I've set the orientation to landscape as requested. Your PDF is ready for download.
```

**Prompt:** 
```
Take a PNG screenshot of https://news.ycombinator.com with a width of 1280px.
```

**Response:** 
```
Capturing the screenshot... I've generated a PNG image of Hacker News at a 1280px viewport width. You can view the image now.
```

**Prompt:** 
```
Generate a modern invoice for $1200 USD for 'Software Consulting' with 1 item.
```

**Response:** 
```
I'm generating your modern invoice. I've added 'Software Consulting' as the line item for a total of $1200 USD. The PDF document has been created.
```

## Capabilities

### Convert Web Content to Visual Assets
Send any URL or raw HTML string; the server outputs a base64 encoded PDF or PNG/JPG screenshot.

### Deconstruct PDFs into Code-Ready Text
The server reads a PDF and returns the content either as clean, editable HTML or as plain text data.

### Generate Structured Financial Documents
Input structured JSON records (like line items and totals), and the server builds a professional PDF invoice or receipt.

## Use Cases

### Creating Quarterly Client Reports
Problem: The marketing team needs to send a quarterly report that includes web page screenshots and full PDF appendices. They can't produce all of these by hand. Solution: Your agent calls `convert_html_to_image` for key visuals, then uses `convert_html_to_pdf` for the main narrative flow. Combining the outputs into one package is left to the agent or a downstream step, since none of the listed tools merges files.

### Processing Submitted Invoices
Problem: A finance bot receives a batch of raw JSON records from an internal system and needs to generate customer invoices immediately. Solution: The agent passes the structured data directly to `generate_business_document`. It bypasses manual PDF creation entirely and outputs ready-to-send PDFs.

### Extracting Data from Scanned Forms
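Problem: You get a scanned document (a PDF) and need just the names and dates for database entry. Solution: The agent runs `convert_pdf_to_text`. It strips away the layout, leaving only plain text that your downstream system can parse. This assumes the PDF carries a text layer; the tool extracts visible text, and nothing in its description mentions OCR for image-only scans.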

### Archiving Web Articles
Problem: A content manager needs to archive a live web article for documentation. They don't want the raw HTML mess. Solution: The agent uses `convert_html_to_pdf` on the URL, getting a polished PDF that preserves the look and feel of the original page.

## Benefits

- **Generate professional invoices instantly.** Stop building financial docs from scratch. Use `generate_business_document` with structured JSON data to output professional, ready-to-send PDFs.
- **Get pixel-perfect screenshots.** Need a visual confirmation? Run `convert_html_to_image`. It grabs an exact PNG or JPG of any web page at specific dimensions for reports.
- **Analyze PDF content quickly.** If you only need the text, not the formatting, use `convert_pdf_to_text`. This strips out all the junk and gives your agent pure, clean data for analysis.
- **Re-use PDFs in code.** Don't treat PDFs as dead ends. Use `convert_pdf_to_html` to turn a static report back into HTML so you can edit it or pass it through other web services.
- **Handle complex conversions reliably.** Need a PDF and an image of the same page? You run two tools: `convert_html_to_pdf` for the file, and `convert_html_to_image` for the screenshot. It's built for that kind of multi-step workflow.

## How It Works

The server gives your AI client a single point of access to several document conversions and document templates.

1. First, subscribe to the Pdfcrowd MCP Server and enter your API credentials into your AI client.
2. Next, prompt your agent with a specific task. For example: 'Use `convert_html_to_pdf` on this URL.'
3. Finally, you get back base64 encoded data—a PDF, PNG, or text block—that you can immediately process or download.

## Frequently Asked Questions

**How do I convert an HTML string to a PDF using `convert_html_to_pdf`?**
You pass the raw HTML string directly as the input argument. The server handles the conversion and returns the base64 encoded PDF data, which you can then decode and use.
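
As a rough sketch of the decode step, the snippet below turns a base64 string into a PDF file and checks the `%PDF-` header before writing. The helper name `save_pdf` and the sample payload are illustrative stand-ins, not actual server output.

```python
import base64
from pathlib import Path


def save_pdf(encoded: str, destination: str) -> Path:
    """Decode a base64 string and write it to disk as a PDF."""
    data = base64.b64decode(encoded)
    # Every PDF file begins with the magic bytes "%PDF-".
    if not data.startswith(b"%PDF-"):
        raise ValueError("decoded data does not look like a PDF")
    path = Path(destination)
    path.write_bytes(data)
    return path


# Stand-in for the tool output: a fake PDF header, base64 encoded.
sample = base64.b64encode(b"%PDF-1.7\n% placeholder body\n").decode("ascii")
print(save_pdf(sample, "output.pdf"))
```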

**What is the difference between `convert_pdf_to_text` and `convert_pdf_to_html`?**
They serve different purposes. Use `convert_pdf_to_text` when you only want raw, simple data for quick analysis. Use `convert_pdf_to_html` if you need the content to remain editable in a web format.

**Can I make an invoice using `generate_business_document`?**
Yes. This tool is specifically built for that. You provide structured JSON data (like item names, quantities, and prices), and it generates a formatted PDF invoice or receipt.

**Does `convert_html_to_image` include advanced styling options?**
It captures a screenshot of the content you pass in, exactly as it renders. Styling therefore belongs in the HTML and CSS of that content; the tool is an image capture tool, not a way to restyle the page.

**What information is required when setting up access for a tool like `convert_html_to_pdf`?**
You must provide your Pdfcrowd Username and API Key. These credentials allow your AI client to authenticate the request before any conversion process begins.

**When I use a conversion tool like `convert_pdf_to_text`, what format is the resulting data provided in?**
The output is always delivered as base64 encoded text data. Your AI client or agent must decode this string to get usable plain text.
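
A minimal sketch of that decoding step, assuming the extracted text is UTF-8 (the source does not state the character encoding). The helper name and the sample string are illustrative.

```python
import base64


def decode_text_result(encoded: str, encoding: str = "utf-8") -> str:
    """Turn a base64 encoded tool result back into a plain string."""
    return base64.b64decode(encoded).decode(encoding)


# Stand-in for the tool output, built locally for the demo.
sample = base64.b64encode("Invoice 42\nTotal: 1200 USD".encode("utf-8")).decode("ascii")
print(decode_text_result(sample))
```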

**How do I specify custom viewport settings with `convert_html_to_image`?**
You control the output by specifying dimensions, orientation, and viewport size in your request, for example a 1280px viewport width. The screenshot is then rendered at exactly the size you asked for.

**For `generate_business_document`, what structure does the required JSON data need to follow?**
The tool requires structured JSON input that matches its pre-built templates. You must include details like line items, totals, and other necessary fields for accurate document generation.

**Can I convert a raw HTML string instead of a URL?**
Yes! Use the `convert_html_to_pdf` or `convert_html_to_image` tools and provide your HTML code in the `text` parameter instead of using the `url` parameter.
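
One way to pick between the two parameters, sketched under the assumption that exactly one of `url` or `text` is sent per call. The helper name `html_source_arguments` is hypothetical.

```python
def html_source_arguments(source: str) -> dict:
    """Return {'url': ...} for web addresses and {'text': ...} for raw HTML."""
    stripped = source.strip()
    if stripped.lower().startswith(("http://", "https://")):
        return {"url": stripped}
    # Anything else is treated as raw HTML and passed through unchanged.
    return {"text": source}


print(html_source_arguments("https://example.com"))
print(html_source_arguments("<h1>Hello</h1>"))
```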

**How do I generate a professional invoice from my data?**
Use the `generate_business_document` tool. Provide the `document_type` as 'invoice', and include your line items, total, and currency in the JSON payload to get a styled PDF.
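
A sketch of what such a payload might look like. Only `document_type` is named in this document; the other keys (`currency`, `items`, `description`, `quantity`, `unit_price`, `total`) are hypothetical, so check the tool's input schema for the real field names.

```python
import json
from decimal import Decimal


def build_invoice_payload(items: list[dict], currency: str) -> dict:
    """Assemble an invoice payload and compute its total from the line items."""
    # Decimal avoids the rounding surprises of binary floats for money.
    total = sum(
        Decimal(str(item["quantity"])) * Decimal(str(item["unit_price"]))
        for item in items
    )
    return {
        "document_type": "invoice",  # named in the FAQ above
        "currency": currency,        # hypothetical key name
        "items": items,              # hypothetical key name
        "total": str(total),         # hypothetical key name
    }


payload = build_invoice_payload(
    [{"description": "Software Consulting", "quantity": 1, "unit_price": "1200.00"}],
    "USD",
)
print(json.dumps(payload, indent=2))
```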

**Is it possible to extract plain text from a PDF file?**
Absolutely. Use the `convert_pdf_to_text` tool with the URL of your PDF. You can also enable `no_layout` if you want the text in reading order without layout preservation.
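
The arguments for that call could be assembled as below. This is a sketch that assumes `no_layout` is a boolean flag that can be omitted when unused; the helper name and the PDF address are placeholders.

```python
def pdf_to_text_arguments(pdf_url: str, no_layout: bool = False) -> dict:
    """Build the arguments for `convert_pdf_to_text`."""
    arguments = {"url": pdf_url}
    if no_layout:
        # Assumed to be a boolean; the source only names the option.
        arguments["no_layout"] = True
    return arguments


print(pdf_to_text_arguments("https://example.com/report.pdf", no_layout=True))
```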