PDF.co MCP. Extract structured data and convert documents from chat.
Works with every AI agent you already use, and any MCP-compatible client.
Just plug in your AI agents and start using Vinkius.
PDF.co lets your AI client handle all document processing—parsing, converting, merging, and securing PDFs right in the chat window. You use it to extract structured data like tables into JSON or CSV formats, perform OCR on scanned images, or combine multiple reports into one file.
It’s a full suite of tools for turning messy documents into clean, actionable data pipelines without ever leaving your conversation.
What your AI agents can do
Transform PDFs and images into specific formats like JSON, CSV, XML, or plain text using tools such as pdf_to_json or pdf_to_csv.
Pull metadata from a PDF with extract_pdf_meta, extract tables into structured formats via pdf_to_json, or perform OCR on images using ocr_image.
Use merge_pdfs to combine multiple PDFs into a single file, or use split_pdf to break one large document into smaller parts.
Apply password protection with protect_pdf, or remove existing passwords using unprotect_pdf on PDF files.
Check the progress of background processing jobs with check_job_status, and view your service credit balance via get_account_info.
Supported MCP Clients
OAuth 2.0 Compatible
PDF.co MCP Server: 12 Tools for Document Processing
This server gives your AI client everything it needs to handle PDFs—from converting data formats and reading scans to merging files and controlling security settings.
Make your AI actually useful.
Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.
check_job_status
Checks the status of any document processing job that was run asynchronously.
pdf_to_csv
Converts data presented in PDF tables directly into a Comma Separated Values (CSV) file.
pdf_to_json
Extracts and structures the entire content of a PDF document into a standardized JSON object.
pdf_to_text
Converts an entire PDF file into simple, clean plain text format.
pdf_to_xml
Converts a PDF document's content and structure into an XML file.
extract_pdf_meta
Extracts general metadata (like creation date, author, and title) from a PDF file.
get_account_info
Retrieves your current account usage metrics and service credit balance.
merge_pdfs
Combines two or more separate PDF documents into a single output file.
ocr_image
Runs Optical Character Recognition on an uploaded image to extract text, even if the original document was scanned.
protect_pdf
Adds password protection to a PDF, restricting access or editing capabilities.
split_pdf
Cuts one large PDF document into multiple smaller PDFs based on page numbers or ranges.
unprotect_pdf
Removes existing password protection from a locked PDF file.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built-in DLP, auth, and compliance on every call
- Real-time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with PDF.co, then connect any of our 5,000+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 5,000+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by PDF.co. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
- Cloud Hosted: Managed infra
- V8 Isolated: Sandboxed per request
- Zero-Trust Proxy: No stored credentials
- DLP Enforced: Policy on every call
- GDPR Compliant: EU data residency
- Token Compression: ~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 12 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Manually extracting data from PDFs feels like detective work.
Think about it: you get a quarterly report. You open the PDF, then you have to manually spot the revenue number, copy it into a spreadsheet cell, find the tax percentage on page 8, and paste that too. Then you repeat this process for ten different reports just because they're in PDF format.
With the PDF.co MCP Server, you tell your agent exactly what you need—like 'Give me the total revenue from all attached PDFs.' It runs the necessary tools (like `pdf_to_json`) and returns a clean, structured data block instantly. You get the answer, not just a file.
Use PDF.co MCP Server for predictable JSON output.
Before this server, if you needed to process tables from PDFs, you were stuck with plain-text extraction (the kind of output `pdf_to_text` gives you) or visual guesswork. Plain text often loses the relationship between columns and rows, which makes the resulting data useless for automation.
Now, when you call `pdf_to_json`, the AI sees the document's internal layout—the tables, the headers, the fields—and maps them to a predictable JSON schema. You get reliable data that your code can actually use.
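Here's a small sketch of why keyed output matters. The sample row and the field names (`Vendor`, `Date`, `Total`) are made up for illustration; they are not the actual schema returned by `pdf_to_json`, which this page doesn't show.

```python
import json

# Made-up sample data: the same invoice row as flat text and as keyed JSON.
text_row = "Acme Corp 2024-01-05 1200.00"
json_row = '{"Vendor": "Acme Corp", "Date": "2024-01-05", "Total": "1200.00"}'

# Splitting on whitespace yields four tokens for three columns, because the
# vendor name contains a space; the column boundaries are gone.
tokens = text_row.split()

# With keyed JSON, each value is addressed by name, with no guessing.
record = json.loads(json_row)
total = float(record["Total"])

print(len(tokens), record["Vendor"], total)
```

The flat-text version forces your code to guess where one column ends and the next begins; the keyed version doesn't.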
What you can do with this MCP connector
Listen up. This isn't some basic converter you use when you're bored. The PDF.co server gives your AI client a full suite of tools to handle documents—it’s for parsing, converting, merging, and locking down PDFs right in the chat window. You'll use it anytime you need your agent to deal with messy files and turn them into clean, actionable data without you having to copy-paste a single thing.
Converting Documents to Structured Data
You can make your AI client read every kind of PDF structure using dedicated conversion tools. If the document has tables, use pdf_to_csv and it'll spit out a perfect Comma Separated Values file you can use in any spreadsheet program. For deeper data analysis, run pdf_to_json to extract and structure all the content into a standardized JSON object that your code can actually read.
If you need something more rigid, pdf_to_xml converts the entire document's structure into an XML file format. Even if all you need is raw reading material, pdf_to_text handles converting the whole PDF down to simple, clean plain text.
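Once you have CSV back, loading it in your own code takes a few lines. This is a minimal sketch using Python's standard `csv` module; the CSV text below is invented sample data standing in for a `pdf_to_csv` result, and the column names are assumptions.

```python
import csv
import io

# Illustrative CSV text standing in for a pdf_to_csv result (made-up values).
csv_text = "Item,Quantity,Total\nWidget,2,19.98\nGadget,1,5.00\n"


def load_rows(text: str) -> list[dict]:
    """Parse CSV text into one dict per row, keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(csv_text)
grand_total = sum(float(row["Total"]) for row in rows)
print(rows[0]["Item"], f"{grand_total:.2f}")
```

`csv.DictReader` uses the first line as the header, so each row comes back keyed by column name rather than by position.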
Extracting Specific Data and Handling Scans
Sometimes you don't want the whole thing; you just need pieces of info. You can pull basic document information—like who created it, what the title is, or when it was made—by running extract_pdf_meta. If you've got a scanned invoice or some old paperwork that isn't digital text, don't sweat it.
Use ocr_image to run Optical Character Recognition on an uploaded image; it extracts usable text even if the original document was just ink on paper. For massive PDFs, if you only need sections three through five, use split_pdf, and it’ll cut that one big file into several smaller parts for you.
Combining and Managing Files
When your workflow requires multiple inputs, this server handles the heavy lifting. You can run merge_pdfs to take two or more separate PDF documents—say, quarterly reports from different departments—and combine them into a single output file. On the flip side of organization, you might need to mess with security.
If a document is locked down and you need access, use unprotect_pdf to strip away existing passwords so your agent can work on it. Conversely, if you're sending something sensitive, you can run protect_pdf to add password protection, restricting who can view or edit the file.
Utility and Monitoring
Your AI client keeps track of everything running in the background. When a big job—like converting 50 files—is queued up, use check_job_status to see exactly where that document processing is at. Plus, you can keep an eye on your usage with get_account_info, which pulls up your current service credit balance and account metrics so you know what's left.
Basically, it gives you the whole damn toolbox for making PDFs into usable data.
How PDF.co MCP Works
1. Subscribe to this server and provide your PDF.co API Key in the settings.
2. Ask your AI client to perform a document action (e.g., 'Convert this invoice to JSON').
3. The agent calls the appropriate tool, processes the file, and returns the structured data or resulting document.
The bottom line is: you tell your agent what needs doing with the PDF; it handles all the API calls and spits out the usable result.
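If you're curious what the tool call in the last step looks like on the wire: MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds one in Python. The tool name `pdf_to_json` comes from this page; the `url` argument and the example file URL are assumptions for illustration, since the server's input schema isn't shown here.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical argument name and URL; check the server's tool schema for the real ones.
request = build_tool_call("pdf_to_json", {"url": "https://example.com/invoice.pdf"})
print(json.dumps(request, indent=2))
```

In practice your AI client builds and sends this for you; the sketch only shows the shape of the message.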
Who Is PDF.co MCP For?
This tool’s best users are data-heavy roles that spend too much time switching between document readers, spreadsheet programs, and databases. Think Data Analysts, Accountants, or Operations Managers who deal with invoices, reports, and legal filings daily. If your job involves extracting numbers from a PDF report before you can use it in a dashboard, this is for you.
Data Analysts: use pdf_to_json to pull structured data points (like revenue or line items) from complex PDFs so they can be fed into BI tools.
Operations Managers: automate document collection by running merge_pdfs on multiple quarterly reports, then optionally securing the result with protect_pdf for distribution.
Accountants: process batches of scanned invoices by running ocr_image to pull the text, making it easier to reconcile accounts without manual entry.
What Changes When You Connect
- Stop losing time on manual extraction. Use `pdf_to_json` or `pdf_to_csv` to turn complex tables directly into machine-readable data, eliminating spreadsheet copy/paste errors.
- Handle mixed media inputs instantly. Run `ocr_image` on scanned invoices and handwriting samples; it extracts text that simple PDF readers miss entirely.
- Simplify document management workflows. Need to combine three quarterly reports? Use `merge_pdfs`; the server handles stitching them together into one file, keeping all pages sequential.
- Maintain data integrity across systems. Convert files using `pdf_to_xml` or `pdf_to_json`, ensuring your downstream application gets a clean, predictable schema every time.
- Control document access right from chat. Apply security locks with `protect_pdf` immediately after processing sensitive client documents.
Real-World Use Cases
Processing a Batch of Client Invoices
An accountant gets 50 scanned invoices (JPEGs). Instead of manually typing in the vendor name, invoice number, and total for each one, they ask their agent to run ocr_image on all 50 files. The server extracts the text from every image, and the agent pulls out the fields they need, so they can compile a master spreadsheet without typing the data in by hand.
Building an Annual Compliance Binder
An operations manager needs to combine annual reports (Q1 through Q4) and ensure they're secure. They first use merge_pdfs to compile the 4 reports into one, then run protect_pdf on the final file before uploading it to the archive.
Converting Raw Report Data for a Database
A data analyst has a PDF report full of financial tables. They use pdf_to_json, which pulls out all column headers and values into a structured JSON object. The agent then passes this clean, predictable data directly to the database API.
Splitting Master Legal Documents
A legal team receives one massive 300-page agreement PDF. Instead of reading it all at once, they ask their agent to run split_pdf to separate the 'Definitions' section (pages 1-25) from the 'Exhibit A' section (pages 280-300), giving them smaller, manageable files.
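A request like this usually boils down to a list of page ranges. The helper below is a hypothetical sketch of validating such a list before handing it to the agent; the `"1-25,280-300"` string format is an assumption for illustration, not the documented input of split_pdf.

```python
def parse_page_ranges(spec: str) -> list[tuple[int, int]]:
    """Parse '1-25,280-300' into [(1, 25), (280, 300)]; a lone '7' becomes (7, 7)."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start_text, _, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if end_text else start
        # Pages are 1-based and a range must not run backwards.
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        ranges.append((start, end))
    return ranges


sections = parse_page_ranges("1-25,280-300")
page_counts = [end - start + 1 for start, end in sections]
print(sections, page_counts)
```

Checking ranges up front catches typos like a reversed range before any processing credits are spent.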
The Tradeoffs
Treating all PDFs as simple text.
Copying and pasting the output of pdf_to_text into a database field when you actually need specific columns like 'Total' or 'Date'. You lose structure, making data useless for analysis.
→
Don't use pdf_to_text. Use pdf_to_json instead. It preserves the document's inherent structure—tables and fields—so your downstream system gets clean keys and values.
Ignoring job processing delays.
Asking the agent to process a massive 500-page PDF conversion, and then immediately asking 'What is the result?' without waiting. The request fails because the server hasn't finished running the background task.
→
After initiating a large task, always use check_job_status. This confirms the job is done before you ask for the final output.
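The wait-then-check pattern can be sketched as a simple polling loop. Here `get_status` is a stand-in for whatever calls check_job_status, and the status strings (`working`, `success`) are assumptions; the server's real status values aren't listed on this page.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 2.0,
    max_polls: int = 30,
) -> str:
    """Poll until the job leaves the 'working' state or the poll budget runs out.

    `get_status` stands in for a call to check_job_status; the status strings
    used here are assumed, not taken from the server.
    """
    for _ in range(max_polls):
        status = get_status()
        if status != "working":
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("job did not finish within the polling budget")


# Usage with a fake status source that finishes on the third poll.
fake_statuses = iter(["working", "working", "success"])
result = wait_for_job(lambda: next(fake_statuses), poll_seconds=0.0)
print(result)
```

Capping the number of polls keeps a stuck job from blocking the conversation forever.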
Sharing sensitive documents unsecured.
Generating a PDF containing client payroll data and then sending it out via email without protection. Anyone with access can view or copy the raw information.
→
Always run protect_pdf on any document that contains PII or proprietary data immediately after you've finished assembling it.
When It Fits, When It Doesn't
Use this server if your primary bottleneck is getting clean, structured data out of unstructured documents (scans, reports, invoices). You need a tool that can convert PDF tables into JSON objects, or reliably extract metadata. The tools are best when you use them sequentially: e.g., first ocr_image to get the text from a scan, then pass that output through your agent logic to format it via structured conversion tools like pdf_to_json. Don't use this if your need is just simple viewing or editing; those are PDF reader functions. Also, don’t try to build an entire document management system on this—it handles processing; you still need a separate storage solution for the files themselves.
Common Questions About PDF.co MCP
How do I convert PDF tables into structured data using pdf_to_csv?
You simply tell the agent to 'Convert this document's tables to CSV.' The tool handles identifying all tabular content and outputs it in a standard, delimited format ready for import.
What is the difference between pdf_to_text and pdf_to_json?
The key difference is structure. pdf_to_text gives you one big block of raw text, losing all formatting. pdf_to_json analyzes the document's layout and organizes the content into labeled fields, keeping context.
Can I use ocr_image to read handwritten notes in a PDF?
Yes. You pass the image through ocr_image. It runs Optical Character Recognition specifically designed for scanned or handwritten documents, extracting text that standard digital readers can't see.
How do I combine several PDFs into one using merge_pdfs?
Just upload the files and tell your agent to 'Merge these three reports.' The merge_pdfs tool combines them sequentially into a single, cohesive PDF document for you.
How do I use `protect_pdf` to add password security to a document?
The tool encrypts your PDF file. You provide the document and the desired credentials, which locks it down so only authorized users can view or edit the content.
What is the purpose of `check_job_status` after a conversion task?
It lets you track long-running processes. Complex conversions take time; use this tool to monitor if your document job completed successfully or if it ran into an error.
How can I use `extract_pdf_meta` to get information about the PDF itself?
It pulls out hidden document properties. This function reads key metadata like the author, creation date, and title embedded deep within the file structure.
If I only need specific pages, how does `split_pdf` work?
You can break a large PDF into smaller parts. Just specify the exact page range or individual pages you want to extract and create new, separated documents.
Can my AI automatically find and extract a specific table from a PDF?
Yes! Use the pdf_to_csv or pdf_to_json tools. Your agent will respond with the structured tabular data from the document, ready for analysis.
How do I find my PDF.co API Key?
Log in to your PDF.co account and open the main dashboard; your secret API key is shown there.
Does this support handwritten text recognition?
Absolutely. PDF.co's high-fidelity OCR engine is designed to handle both printed and handwritten text with high accuracy across multiple languages.
Use it with your favorite AI tools
Connect this server to Cursor, Claude, VS Code, and more.