# BibTeX Bibliography Parser MCP

> BibTeX Bibliography Parser converts academic .bib files into structured JSON data. Your AI agent can then analyze hundreds of citations, count specific resource types, and reformat entries into APA, IEEE, or Chicago style, all from a single file path and without external dependencies.

## Overview
- **Category:** productivity
- **Price:** Free
- **Tags:** bibtex, bibliography, academic, citations

## Description

Students and researchers lose a lot of time to citation formatting, and manual formatting is error-prone. This MCP reads the full BibTeX structure with deterministic regex parsing. It doesn't guess: the same file always yields the same JSON entries, each with fields such as type, key, author, and year. You can then ask your agent to filter those results, for example by pulling out every book published after 2015 or by comparing how many proceedings you have against articles. Because the MCP handles the raw parsing first, any compatible client, including agents connected through the Vinkius catalog, gets a structured dataset ready for analysis.

## Tools

### parse_bibtex_bibliography
Parses the academic .bib file at a given path into structured JSON data that your agent can query immediately.

## Prompt Examples

**Prompt:** 
```
Convert all entries from 2023 in my .bib to APA format.
```

**Response:** 
```
Here are 12 entries from 2023 reformatted in APA style.
```

**Prompt:** 
```
How many articles vs books do I have?
```

**Response:** 
```
Your bibliography has 85 articles, 12 books, and 8 inproceedings.
```

**Prompt:** 
```
Find all references by author 'Smith' in my bibliography.
```

**Response:** 
```
Found 7 entries authored by Smith across years 2018-2024.
```

## Capabilities

### Structure references into JSON
Converts an entire academic bibliography file into clean, machine-readable JSON entries.

### Count resource types
Determines and reports the total count of different reference categories (e.g., articles, books, proceedings).
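
As a rough illustration of how an agent might tally types once the entries are parsed, here is a minimal Python sketch. The field names (`type`, `key`, `author`, `year`) and the sample entries are assumptions made for the example, not the tool's documented schema.

```python
from collections import Counter

# Hypothetical parsed entries; the field names are assumptions about the
# JSON shape, not a documented schema.
entries = [
    {"type": "article", "key": "smith2020", "author": "Smith, J.", "year": "2020"},
    {"type": "book", "key": "chen2016", "author": "Chen, L.", "year": "2016"},
    {"type": "Article", "key": "smith2023", "author": "Smith, J.", "year": "2023"},
    {"type": "inproceedings", "key": "lee2019", "author": "Lee, K.", "year": "2019"},
]


def count_types(entries):
    """Count entries per BibTeX type, ignoring case."""
    return Counter(entry.get("type", "unknown").lower() for entry in entries)


counts = count_types(entries)
print(dict(counts))
```

Lowercasing the type first means `@Article` and `@article` land in the same bucket.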

### Reformat citations by style
Allows your agent to reformat existing entries into major academic styles like APA, IEEE, or Chicago.
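
To give a feel for what reformatting involves, the sketch below builds a rough APA-flavoured string from one parsed entry. It is a deliberately simplified approximation: real APA style has many more rules (initials, italics, volume and page handling), and the function name `format_apa_like` and the field names are assumptions for illustration only.

```python
def format_apa_like(entry):
    """Build a rough 'Author (Year). Title. Journal.' string.

    This is only an approximation of APA style, not a complete formatter.
    """
    # BibTeX separates multiple authors with the word "and".
    names = [n.strip() for n in entry.get("author", "").split(" and ") if n.strip()]
    if not names:
        author_text = "Unknown author"
    elif len(names) == 1:
        author_text = names[0]
    else:
        author_text = ", ".join(names[:-1]) + ", & " + names[-1]

    parts = [
        f"{author_text} ({entry.get('year', 'n.d.')}).",
        f"{entry.get('title', 'Untitled')}.",
    ]
    if entry.get("journal"):
        parts.append(f"{entry['journal']}.")
    return " ".join(parts)


# Hypothetical entry used only to exercise the function.
sample = {
    "type": "article",
    "key": "smith2021",
    "author": "Smith, J. and Chen, L.",
    "year": "2021",
    "title": "Parsing references",
    "journal": "Journal of Examples",
}
citation = format_apa_like(sample)
print(citation)
```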

### Query references by criteria
Filters the bibliography data based on specific details, such as a target author's name or a publication year range.
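
A filter like this could look roughly as follows in Python. The helper `filter_entries`, its parameters, and the entry field names are hypothetical; the sketch only assumes that each entry carries `author` and `year` values.

```python
def filter_entries(entries, author=None, year_from=None, year_to=None):
    """Return entries matching an author substring and an inclusive year range."""
    matches = []
    for entry in entries:
        if author and author.lower() not in entry.get("author", "").lower():
            continue
        year_text = str(entry.get("year", "")).strip()
        year = int(year_text) if year_text.isdigit() else None
        # Entries without a usable year are dropped once a bound is set.
        if year_from is not None and (year is None or year < year_from):
            continue
        if year_to is not None and (year is None or year > year_to):
            continue
        matches.append(entry)
    return matches


# Hypothetical parsed entries for the example.
entries = [
    {"type": "article", "key": "smith2017", "author": "Smith, J.", "year": "2017"},
    {"type": "article", "key": "smith2020", "author": "Smith, J. and Chen, L.", "year": "2020"},
    {"type": "book", "key": "chen2021", "author": "Chen, L.", "year": "2021"},
    {"type": "article", "key": "smith2023", "author": "Smith, J.", "year": "2023"},
]

recent_smith = filter_entries(entries, author="Smith", year_from=2018, year_to=2024)
print([entry["key"] for entry in recent_smith])
```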

## Use Cases

### Checking resource balance for a literature review
A researcher has compiled a bibliography of over 300 sources. Instead of reading through it, they ask their agent: 'How many books versus conference proceedings do I have?' The MCP returns the parsed entries as JSON, and the agent tallies them by type (e.g., 45 books, 12 proceedings).

### Preparing references for journal submission
A student has finished writing a paper and needs to convert the bibliography from an old style to APA format before submitting it. The MCP parses the file, and the agent reformats each entry into consistently styled citations ready to paste.

### Identifying core contributors
A team lead wants to see who contributed most often to a project's bibliography. They ask their agent to search the list and find every entry authored by 'Dr. Chen,' quickly identifying 15 relevant papers.

### Data validation for academic databases
A developer needs to validate whether a source file contains all necessary fields (type, author, year). They use the MCP to parse the file, then check the JSON output to confirm that every entry has a valid citation key and publication type.
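
A check of this kind might be sketched as below. The required-field list and the helper name `find_incomplete` are assumptions chosen for the example; adjust them to whatever fields your database actually requires.

```python
# Assumed set of required fields; not taken from the tool's documentation.
REQUIRED_FIELDS = ("type", "key", "author", "year")


def find_incomplete(entries, required=REQUIRED_FIELDS):
    """Map each entry label to the list of required fields it is missing."""
    problems = {}
    for index, entry in enumerate(entries):
        missing = [name for name in required if not str(entry.get(name, "")).strip()]
        if missing:
            # Fall back to the position when the citation key itself is missing.
            label = entry.get("key") or f"<entry {index}>"
            problems[label] = missing
    return problems


# Hypothetical parsed entries, two of them deliberately incomplete.
entries = [
    {"type": "article", "key": "smith2020", "author": "Smith, J.", "year": "2020"},
    {"type": "book", "key": "chen2016", "author": "Chen, L.", "year": ""},
    {"type": "misc", "key": "", "author": "Lee, K.", "year": "2019"},
]

problems = find_incomplete(entries)
print(problems)
```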

## Benefits

- Stop formatting manually. The parser extracts the raw data and delivers structured JSON entries for your agent to use.
- Audit your sources quickly. You can ask your AI client how many articles versus books you have without writing scripts or counting lines in a text editor.
- Flexible citation output. With the parsed fields in hand, your agent can reformat references into APA, IEEE, or Chicago style.
- Deterministic parsing means reliability. It has no external library dependencies; pure regex handling gives repeatable results on large files.
- Query sources by detail. You can ask it to find every reference written by 'Smith' or only those published between 2018 and 2024.

## How It Works

The MCP turns a large text file into organized, queryable data in three steps.

1. Provide the MCP with the absolute file path to your academic .bib bibliography file.
2. The system runs deterministic regex parsing across the entire document structure, generating raw JSON data.
3. You receive clean JSON entries ready for immediate querying or reformatting by your AI agent.
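
The three steps above can be sketched in Python. This is a minimal illustration of regex-based BibTeX parsing under stated assumptions, not the MCP's actual implementation: the function names `parse_bib_text` and `parse_bib_file`, the output field names, and the sample entries are all made up for the example.

```python
import json
import re
import tempfile
from pathlib import Path

# "@type{key," opens an entry; fields look like name = {value}, "value", or 2020.
ENTRY_START = re.compile(r'@(\w+)\s*\{\s*([^,\s]+)\s*,')
FIELD = re.compile(r'(\w+)\s*=\s*(?:\{((?:[^{}]|\{[^{}]*\})*)\}|"([^"]*)"|(\d+))')

SAMPLE = """
@article{smith2020,
  author = {Smith, John and Chen, Li},
  title = {Parsing {BibTeX} with regular expressions},
  journal = "Journal of Examples",
  year = 2020
}

@book{chen2016,
  author = {Chen, Li},
  title = {Reference Management},
  year = {2016}
}
"""


def parse_bib_text(text):
    """Turn BibTeX source text into a list of dicts (one per entry)."""
    entries = []
    for match in ENTRY_START.finditer(text):
        # Walk forward, tracking brace depth, to find the entry's closing brace.
        depth, pos = 1, match.end()
        while pos < len(text) and depth > 0:
            if text[pos] == "{":
                depth += 1
            elif text[pos] == "}":
                depth -= 1
            pos += 1
        if depth != 0:
            continue  # unterminated entry: skip it rather than guess
        body = text[match.end():pos - 1]
        entry = {"type": match.group(1).lower(), "key": match.group(2)}
        for name, braced, quoted, bare in FIELD.findall(body):
            value = braced or quoted or bare
            entry[name.lower()] = " ".join(value.split())  # collapse whitespace
        entries.append(entry)
    return entries


def parse_bib_file(path):
    """Step 1: take a file path. Steps 2 and 3: parse it and return entries."""
    return parse_bib_text(Path(path).read_text(encoding="utf-8"))


# Write the sample to a temporary .bib file so the example is runnable as-is.
with tempfile.TemporaryDirectory() as tmp:
    bib_path = Path(tmp) / "refs.bib"
    bib_path.write_text(SAMPLE, encoding="utf-8")
    parsed = parse_bib_file(bib_path.resolve())

print(json.dumps(parsed, indent=2))
```

The sketch handles only one level of nested braces inside a field value and skips entries whose braces never close. A production parser would need to cover more of the format, such as `@string` macros and `#` concatenation.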

## Frequently Asked Questions

**Does it handle LaTeX special characters?**
It extracts the raw field values as-is. The AI can then interpret or clean LaTeX escapes like `\'{e}` into proper Unicode.
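
If you would rather clean the escapes in code than leave it to the AI, a small post-processing step is possible. The sketch below is an assumption-laden example that covers only five common accent commands applied to single ASCII letters; the helper name `latex_to_unicode` is hypothetical.

```python
import re
import unicodedata

# LaTeX accent command -> Unicode combining mark.
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis
    "~": "\u0303",  # tilde
}

# Matches \'{e} as well as the brace-less form \'e.
ACCENT = re.compile(r"""\\(['`^"~])\{?([A-Za-z])\}?""")


def latex_to_unicode(text):
    """Replace simple LaTeX accent escapes with precomposed Unicode letters."""
    def replace(match):
        combined = match.group(2) + COMBINING[match.group(1)]
        return unicodedata.normalize("NFC", combined)
    return ACCENT.sub(replace, text)


print(latex_to_unicode(r"Caf\'{e}"))
print(latex_to_unicode(r'G\"odel'))
```

Forms that wrap the whole escape in braces, such as `{\'e}`, keep their outer braces in this sketch, and commands like `\c{c}` or `\ss` are not handled.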

**How many entries can it handle?**
It caps the output at 200 entries to protect AI context. For larger bibliographies, ask the AI to filter by type or year.

**Can it detect duplicate references?**
The parser extracts all entries. You can then ask the AI: 'Find duplicate titles or DOIs in my bibliography.'
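
The same check can also be done programmatically on the parsed entries. The sketch below groups citation keys by DOI and by a normalized title; the field names `doi` and `title` and the helper names are assumptions, and the title normalization is intentionally crude.

```python
import re
from collections import defaultdict


def normalize_title(title):
    """Lowercase, drop braces and punctuation, and collapse whitespace."""
    unbraced = title.lower().replace("{", "").replace("}", "")
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", unbraced)
    return " ".join(cleaned.split())


def find_duplicates(entries):
    """Group citation keys that share a DOI or a normalized title."""
    groups = defaultdict(list)
    for entry in entries:
        key = entry.get("key", "?")
        doi = entry.get("doi", "").strip().lower()
        title = normalize_title(entry.get("title", ""))
        if doi:
            groups[("doi", doi)].append(key)
        if title:
            groups[("title", title)].append(key)
    return {marker: keys for marker, keys in groups.items() if len(keys) > 1}


# Hypothetical parsed entries with one DOI clash and one title clash.
entries = [
    {"key": "a", "title": "Parsing {BibTeX}", "doi": "10.1/x"},
    {"key": "b", "title": "parsing bibtex."},
    {"key": "c", "title": "Something Else", "doi": "10.1/X"},
]

duplicates = find_duplicates(entries)
print(duplicates)
```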

**What structured data does `parse_bibtex_bibliography` output for academic references?**
It outputs clean, deterministic JSON containing key fields like type, citation key, title, author list, and publication year. This structure allows your AI client to immediately query the data without needing manual parsing.

**If I run `parse_bibtex_bibliography` on a corrupted .bib file, how does it handle the parsing error?**
The regex-based parser is designed to tolerate malformed entries. It processes the valid entries and flags or skips sections that contain non-standard formatting.

**Is `parse_bibtex_bibliography` dependent on external libraries?**
No, it's pure regex parsing. This means it doesn't require additional dependencies or complex setups from your side; it just needs the file path to run.

**After using `parse_bibtex_bibliography`, can I programmatically filter entries by specific fields like author or year?**
Yes. Since the output is clean JSON, you'll get structured access to every field. You can easily write logic in your agent to filter results based on any available key-value pair.

**Does the processing time of `parse_bibtex_bibliography` scale linearly with the size of my bibliography file?**
Generally, yes. The parser makes a deterministic regex pass over the file, so you can expect run time to grow predictably as you add more entries.