Parseur MCP. Turn Messy PDFs and Emails Into Actionable JSON Data
Parseur automates document processing and data extraction for your AI agents. It connects directly to complex pipelines, allowing you to upload PDFs, emails, or bulk documents and extract structured fields—like invoice numbers, total amounts, dates, and line items—into usable JSON format. You define the rules using templates, and our OCR engine handles the rest, turning unstructured paper into actionable data points for your workflow.
Give Claude and any AI agent real-world access
You set up dedicated parsing mailboxes for specific document types like invoices or emails.
You create templates that map fields and define the precise rules needed to pull structured data from documents.
Your agent uploads document URLs or raw payloads into a configured mailbox queue.
You pull the fully extracted JSON data from documents once they have finished processing.
You list all processed or failed documents to track a batch job’s progress and status.
If an extraction fails due to a minor error, you can instantly push the document back into the pipeline for reprocessing.
What AI agents can do with Parseur: 10 Document Parsing Tools
These tools let your agent manage entire document pipelines—from creating new parsing rules to retrieving the final structured JSON output.
Make your AI actually useful.
Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.
Create Mailbox
Sets up a new dedicated parsing pipeline, specifying if the mailbox handles PDFs, emails, or attachments.
Create Template
Defines the extraction rules and field mappings needed for the system to pull...
Get Document Data
Retrieves the complete, parsed JSON dictionary of extracted fields from a document...
Get Document Details
Fetches only the metadata about a single parsed document, such as its ID and status...
Get Mailbox
Provides detailed configuration information for a specific parsing mailbox to...
List Documents
Lists all documents within a mailbox, showing their ID, current status (processed/failed), and date details.
List Mailboxes
Retrieves a list of every existing parsing pipeline configured for the account, along with their unique IDs.
List Templates
Shows all defined extraction templates associated with a mailbox, detailing the...
Retry Document
Forces a failed or errored document back into the parsing queue so it can be matched...
Upload Document
Sends a document URL to a specified mailbox, immediately entering the file into the...
Security and governance baked right in.
Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the entire backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so there is no routing for you to configure.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on each call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with Parseur, then connect any of our 5,200+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 5,200+ others, all in one place
- Add new capabilities to your AI anytime you want
- Connections are secured and governed automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog weekly
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Parseur. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS CLOUD
- Cloud Hosted: Managed infra
- V8 Isolated: Sandboxed per request
- Zero-Trust Proxy: No stored credentials
- DLP Enforced: Policy on each call
- GDPR Compliant: EU data residency
- Token Compression: ~60% cost reduction
The constant copy-paste tax on your team's time
Think about the end of the month: opening dozens of PDFs and emails. You open one, find the invoice number in a corner, copy it into your spreadsheet. Open the next three, repeating that process—finding the total amount, pasting it, verifying the date. Your workflow becomes a cycle of clicking 'View Attachment,' manually reading, selecting text, copying, and pasting. This is tedious, error-prone work.
With this MCP, you just point your agent at the folder full of documents. The system handles opening, reading, parsing, and extracting those specific fields into clean data packets. You don't copy anything; you receive a ready-to-use JSON object containing everything you needed.
Extract structured data with Parseur
You no longer have to verify document status, decide which fields are critical for extraction, or confirm that the necessary templates are active by hand. Your agent handles these checks with tools like `get_mailbox` and `list_templates`.
The difference is control. Instead of hoping a spreadsheet formula catches everything, you define the rules explicitly. Your data moves from being 'a PDF' to being 'Invoice Number: A-201, Total Amount: 1400.99'—and that structure never changes.
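If you want to hold parsed results to that fixed shape in your own code, one option is to load each result into a typed record. This is a minimal sketch, not Parseur's actual output format: the key names `Invoice Number` and `Total Amount` are borrowed from the example above, and the `Invoice` class and `invoice_from_json` helper are hypothetical names made up for illustration.

```python
import json
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Invoice:
    """Hypothetical fixed shape for one parsed invoice."""
    invoice_number: str
    total_amount: Decimal


def invoice_from_json(payload: str) -> Invoice:
    """Load one parsed document; raises KeyError if an expected field is missing."""
    # parse_float=Decimal keeps money values exact instead of using binary floats.
    raw = json.loads(payload, parse_float=Decimal)
    return Invoice(
        invoice_number=str(raw["Invoice Number"]),
        total_amount=Decimal(str(raw["Total Amount"])),
    )


invoice = invoice_from_json('{"Invoice Number": "A-201", "Total Amount": 1400.99}')
print(invoice.total_amount)  # 1400.99
```

Failing loudly on a missing key is a deliberate choice here: a document that parsed without its total is usually something you want to catch before it reaches a ledger.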
What Parseur MCP does for your AI
When you need to read things that aren't in neat tables, this MCP is what you use. Forget manually opening every PDF or email attachment just to pull out an invoice number. This connector lets your agent process entire document streams automatically. It handles the messy stuff—whether it’s a scanned receipt with skewed text or a multi-page email thread.
You set up specific mailboxes and templates, telling the system exactly what fields you need (e.g., 'invoice total' or 'date'). Then, when documents arrive, your agent pushes them through the pipeline for parsing. The result is clean JSON data that your next step can use immediately. If you’re managing document logic across multiple AI clients, Vinkius makes connecting this entire process reliable and straightforward.
How to set up Parseur MCP
The bottom line is that this MCP takes unstructured files and converts them into predictable, structured JSON objects for any application or agent to use.
First, list all available parsing pipelines using list_mailboxes or create new ones with create_mailbox to define what type of documents you process.
Next, use create_template to build the extraction rules and tell the system exactly which fields (e.g., total amount) you expect to find in those documents.
Finally, run your workflow by uploading a document URL using upload_document; after processing, retrieve the clean data structure with get_document_data.
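The three steps above can be sketched as one function. Treat this as an illustration under stated assumptions: the tool names come from this page, but the argument names (`name`, `mailbox_id`, `url`, `document_id`, `fields`), the response shapes, and the `call_tool` callable are all hypothetical stand-ins for whatever your MCP client exposes. `FakeParseur` is an in-memory stub so the sketch runs without a server.

```python
from typing import Any, Callable

# Hypothetical signature: (tool_name, arguments) -> decoded JSON result.
ToolCaller = Callable[[str, dict[str, Any]], Any]


def parse_one_document(call_tool: ToolCaller, mailbox_name: str, document_url: str) -> dict[str, Any]:
    """Find or create a mailbox, upload one document URL, and return its parsed fields."""
    # Step 1: reuse an existing pipeline if there is one, otherwise create it.
    mailboxes = call_tool("list_mailboxes", {})
    mailbox = next((m for m in mailboxes if m["name"] == mailbox_name), None)
    if mailbox is None:
        mailbox = call_tool("create_mailbox", {"name": mailbox_name})
        # Step 2: a new mailbox needs extraction rules (this field list is illustrative).
        call_tool("create_template", {"mailbox_id": mailbox["id"],
                                      "fields": ["invoice_number", "total_amount"]})
    # Step 3: upload, then fetch the structured result.
    uploaded = call_tool("upload_document", {"mailbox_id": mailbox["id"], "url": document_url})
    return call_tool("get_document_data", {"document_id": uploaded["id"]})


class FakeParseur:
    """In-memory stand-in for the MCP server, for illustration only."""

    def __init__(self) -> None:
        self.mailboxes: list[dict[str, Any]] = []
        self.calls: list[str] = []

    def __call__(self, tool: str, args: dict[str, Any]) -> Any:
        self.calls.append(tool)
        if tool == "list_mailboxes":
            return list(self.mailboxes)
        if tool == "create_mailbox":
            mailbox = {"id": f"mb-{len(self.mailboxes) + 1}", "name": args["name"]}
            self.mailboxes.append(mailbox)
            return mailbox
        if tool == "create_template":
            return {"id": "tpl-1", **args}
        if tool == "upload_document":
            return {"id": "doc-1", "status": "processed"}
        if tool == "get_document_data":
            return {"invoice_number": "A-201", "total_amount": 1400.99}
        raise ValueError(f"unknown tool: {tool}")
```

The stub returns results immediately. Against a real server, processing takes time, so you would wait for the document to finish before the final `get_document_data` call.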
Who uses Parseur MCP
Anyone dealing with high volumes of varied physical documents—invoices, contracts, receipts—will need this. If your job involves reading data from PDFs that aren't in neat tables, stop using spreadsheets and start connecting this MCP.
They use the tool to automate processing new vendor invoices, ensuring every required field like PO number and net total is captured instantly without manual data entry.
They rely on it to batch process large sets of scanned receipts or financial statements, extracting key metrics into a usable JSON format for monthly reporting.
They use the tool to simulate document loads and test how data flows from an external source into their internal record-keeping systems via webhook logic.
Benefits of connecting Parseur MCP
Stop manually pulling data. By using upload_document, you route entire batches of documents into the parsing engine and get clean, structured output once processing finishes.
Handle different document types without changing logic. You can define multiple pipelines—one for invoices, one for receipts, etc.—using separate mailboxes and templates.
Don't get stuck on failed documents. If a parse fails due to an error, just call retry_document to re-run the pipeline against that specific document ID.
Get exactly what you need. Use get_document_data to retrieve only the structured fields (like total amount and date) without getting bogged down in raw metadata.
Understand your setup before sending files. Check the mailbox configuration with get_mailbox to verify that the correct parsing rules are active for a given document type.
Parseur MCP use cases
Processing End-of-Month Vendor Invoices
The AP manager needs to process 50 invoices from different vendors. Instead of manually entering the invoice number and total into a ledger, they use list_mailboxes to identify the 'Vendor Invoice' pipeline and then run their agent to execute upload_document for all 50 files, retrieving structured data via get_document_data.
Cleaning up Failed Scans
A batch of scanned receipts failed parsing due to a bad template rule. Instead of manually fixing the documents, an agent uses list_documents to identify the failed IDs and then calls retry_document, forcing the system to re-run the OCR against the fixed template.
Building a Multi-Source Data Stream
A developer needs to ingest both PDF contracts and email attachments. They use list_mailboxes to confirm two separate pipelines exist, then route documents using upload_document into the correct stream for parsing.
Debugging Data Flow
An integration needs to verify that a pipeline is ready before it sends any documents. It first calls get_mailbox to check the configuration details, and only then attempts file uploads, so nothing lands in a misconfigured mailbox.
Parseur MCP tradeoffs
What to watch out for, and the recommended way to handle each one.
Assuming all documents are the same format
Trying to extract a 'Date' field from both a handwritten note and a formal PDF using one single template fails because the system doesn't know which parsing engine to use.
You must define separate, specific pipelines. First, create_mailbox for 'Handwritten Notes', then another with create_mailbox for 'Formal PDFs'. Use different templates and rules for each type.
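One way to keep the two pipelines apart in agent code is to resolve a mailbox per document kind up front and refuse anything that has no pipeline. This is a sketch under assumptions: the mailbox names are the ones from the example above, while the `kind` labels, the argument names, the response shapes, and the `call_tool` callable are hypothetical.

```python
from typing import Any, Callable

# Hypothetical signature: (tool_name, arguments) -> decoded JSON result.
ToolCaller = Callable[[str, dict[str, Any]], Any]

# One pipeline per document kind; the kind labels are made up for this sketch.
PIPELINES = {"handwritten": "Handwritten Notes", "formal_pdf": "Formal PDFs"}


def ensure_pipelines(call_tool: ToolCaller) -> dict[str, str]:
    """Return {kind: mailbox_id}, creating any mailbox that does not exist yet."""
    existing = {m["name"]: m["id"] for m in call_tool("list_mailboxes", {})}
    ids: dict[str, str] = {}
    for kind, name in PIPELINES.items():
        if name not in existing:
            existing[name] = call_tool("create_mailbox", {"name": name})["id"]
        ids[kind] = existing[name]
    return ids


def upload_by_kind(call_tool: ToolCaller, ids: dict[str, str], kind: str, url: str) -> Any:
    """Upload to the mailbox for this kind; unknown kinds are rejected, not guessed."""
    if kind not in ids:
        raise ValueError(f"no pipeline configured for document kind {kind!r}")
    return call_tool("upload_document", {"mailbox_id": ids[kind], "url": url})
```

Raising on an unknown kind, instead of falling back to a default mailbox, avoids the single-template failure described above.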
Only retrieving document metadata
Calling only get_document_details is not enough. It returns metadata such as the document's ID and status, so you would still have to copy the actual invoice amount from a separate view.
After confirming the status with get_document_details, always follow up by calling get_document_data to pull the structured, usable JSON fields directly.
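That check-then-fetch sequence might look like the polling loop below. It is a sketch, not the server's documented behavior: the `processed` and `failed` status values are taken from the List Documents description on this page, while the `document_id` argument name, the `status` key, the retry settings, and the `call_tool` callable are assumptions.

```python
import time
from typing import Any, Callable

# Hypothetical signature: (tool_name, arguments) -> decoded JSON result.
ToolCaller = Callable[[str, dict[str, Any]], Any]


def wait_for_data(
    call_tool: ToolCaller,
    document_id: str,
    attempts: int = 10,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll get_document_details, then return get_document_data once processed."""
    for _ in range(attempts):
        details = call_tool("get_document_details", {"document_id": document_id})
        status = details["status"]
        if status == "processed":
            # Metadata says the parse is done, so the fields are safe to fetch.
            return call_tool("get_document_data", {"document_id": document_id})
        if status == "failed":
            raise RuntimeError(f"document {document_id} failed to parse")
        sleep(delay_s)  # any other status is treated as still in progress
    raise TimeoutError(f"document {document_id} not processed after {attempts} checks")
```

The `sleep` parameter is injectable so you can test the loop without real delays.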
Manually checking every document for errors
A job fails on 10 documents; the user has to open each one and manually determine if a retry is needed.
List all failed jobs using list_documents. Then, use retry_document across the batch of IDs that need fixing. This automates the error remediation.
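As a rough sketch of that batch remediation: list the mailbox, keep the failed IDs, and retry each one. The tool names and the `failed` status come from this page. The argument names, the `id` and `status` keys, and the `call_tool` callable are assumptions for illustration.

```python
from typing import Any, Callable

# Hypothetical signature: (tool_name, arguments) -> decoded JSON result.
ToolCaller = Callable[[str, dict[str, Any]], Any]


def retry_failed(call_tool: ToolCaller, mailbox_id: str) -> list[str]:
    """Retry every failed document in a mailbox and return the IDs that were retried."""
    documents = call_tool("list_documents", {"mailbox_id": mailbox_id})
    failed_ids = [doc["id"] for doc in documents if doc["status"] == "failed"]
    for document_id in failed_ids:
        call_tool("retry_document", {"document_id": document_id})
    return failed_ids
```

Returning the retried IDs gives the agent something to report back, or to check again later with list_documents.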
When to use Parseur MCP
Use this MCP when your input data source is unstructured (scanned PDFs, emails, varied forms) or semi-structured but messy. If you can't reliably point to a field using simple XPaths, or if the format changes often, you need Parseur. Don't use it if you are dealing with clean databases where every field already has a consistent schema and is stored in a single source of truth; in that case, standard database connectors work fine. When you need field values rather than raw text, call get_document_data, which returns the data as structured JSON fields. If you only need to see which documents exist and what state they are in, list_documents is enough.
Frequently asked questions about Parseur MCP
How do I get started with Parseur and structured data?
You start by calling list_mailboxes to see what pipelines are available or creating a new one using create_mailbox. Then, you define the rules for that pipeline using create_template.
Does Parseur handle scanned documents?
Yes. The MCP uses powerful OCR logic to read text from images and scans. You just need to upload the document via upload_document, and the engine handles the rest of the parsing process.
What is the difference between get_document_data and list_documents?
Use list_documents when you only want a summary table showing which files exist and their status. Use get_document_data when you need the actual, fully parsed structured data from one specific file ID.
Can I fix documents that failed parsing using Parseur?
Absolutely. If a document fails validation, use list_documents to get the IDs of the failures, and then call retry_document to force a fresh parse run.
What is an 'extraction template' in Parseur?
An extraction template defines the rules—the field names, locations, and regex patterns—that tell the system exactly what data points (like tax ID or date) to pull from a messy document.
How do I test my parsing setup before uploading files?
Before running upload_document, it's smart to first use get_mailbox and list_templates. This lets you review the current configuration, ensuring your rules are set up correctly.
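A pre-upload check along those lines could be sketched as follows. The response shapes are assumptions: this example pretends get_mailbox returns a `name` and that each template lists its `fields`, which may not match the real output. The `preflight` helper and the `call_tool` callable are hypothetical names.

```python
from typing import Any, Callable

# Hypothetical signature: (tool_name, arguments) -> decoded JSON result.
ToolCaller = Callable[[str, dict[str, Any]], Any]


def preflight(call_tool: ToolCaller, mailbox_id: str, required_fields: set[str]) -> dict[str, Any]:
    """Summarize whether a mailbox has templates covering every required field."""
    mailbox = call_tool("get_mailbox", {"mailbox_id": mailbox_id})
    templates = call_tool("list_templates", {"mailbox_id": mailbox_id})
    # Collect every field name that at least one template extracts.
    covered = {field for template in templates for field in template.get("fields", [])}
    missing = sorted(required_fields - covered)
    return {
        "mailbox": mailbox["name"],
        "template_count": len(templates),
        "missing_fields": missing,
        "ready": bool(templates) and not missing,
    }
```

An agent can call this first and only proceed to upload_document when `ready` is true.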