# EBI PDBe MCP

> EBI PDBe provides immediate access to 3D protein structures, ligand interactions, and molecular assemblies from the Protein Data Bank in Europe. This MCP lets you analyze complex biological systems, from determining whether a protein forms a dimer to pinpointing binding sites for drug design, without downloading large coordinate files.

## Overview
- **Category:** the-unthinkable
- **Price:** Free
- **Tags:** pdb, protein-structure, 3d-structures, structural-biology, drug-discovery, embl-ebi, crystallography

## Description

This connector lets your AI client function like a structural biology research assistant, with direct access to the PDBe archive of experimentally determined macromolecular structures. You can query a specific entry or search across whole classes of molecules in natural language. You don't need to know the exact PDB ID: ask for 'SARS-CoV-2 spike protein' and the tool returns matching entries. From there, you can drill down into functional details: identifying which small molecule ligands are bound, checking whether a structure is reliable enough for publication, or mapping how different protein chains assemble. You get actionable structural data without writing API calls or downloading large files first. You manage all of this through the Vinkius catalog, which makes it simple to plug into your existing workflow.

## Tools

### get_assemblies
Determines the biological assembly state of a protein to see if it's a monomer, dimer, or larger complex.

### get_binding_sites
Locates specific residues and interactions that form binding pockets for small molecules.

### get_cofactors
Lists essential cofactors, like metal ions or heme groups, necessary for the protein's function.

### get_experiment
Retrieves detailed information about the experimental method used to determine the structure.

### get_ligand_monomers
Gets a list of small molecule ligands bound in the structure, including their formulas and weights.

### get_modified_residues
Shows non-standard amino acids or nucleotides present in the sequence data.

### get_molecules
Retrieves a detailed inventory of all molecular entities, like chains and polymers, within a given structure.

### get_mutated_residues
Highlights specific residues that have been engineered or mutated compared to the native protein sequence.

### get_publications
Finds primary citations and PubMed IDs associated with a specific structural determination.

### get_quality_scores
Retrieves global quality metrics for the structure, giving an immediate assessment of its overall reliability.

### get_related_entries
Discovers alternative conformations or mutant versions of a protein that have been studied previously.

### get_residue_listing
Generates an inventory list of residues, organized by chain and entity, for detailed inspection.

### get_secondary_structure
Reports the protein's secondary structure, listing the helices, strands, and coils and the residues they span.

### get_summary
Provides a quick overview of the PDB entry, including its title, authors, and resolution.

### get_uniprot_mapping
Creates a cross-reference map linking UniProt sequence positions to specific residues in the 3D structure.

### search_structures
Searches the entire database with natural language queries, finding structures by method, organism, or function.

## Prompt Examples

**Prompt:** 
```
Show me the structure summary and quality of the SARS-CoV-2 main protease (6lu7).
```

**Response:** 
```
I've retrieved the structure summary for 6LU7, the SARS-CoV-2 main protease. It was determined by X-ray crystallography at 2.16 Å resolution. The structure contains the protease in complex with an inhibitor, providing key insights for antiviral drug design. The quality metrics show good geometry and reliable electron density.
```

**Prompt:** 
```
What ligands are bound in the hemoglobin structure 4hhb?
```

**Response:** 
```
I've retrieved the ligand data for hemoglobin (4HHB). The structure contains heme (protoporphyrin IX) groups as cofactors bound to each of the four globin chains. The heme groups contain iron atoms essential for oxygen binding. I also found the binding site residues that coordinate the heme cofactors.
```

**Prompt:** 
```
Search for cryo-EM structures of ribosome complexes.
```

**Response:** 
```
I've found multiple ribosome structures determined by cryo-electron microscopy. The search returned entries from various organisms including E. coli, human, and yeast ribosomes at resolutions ranging from 2.0 to 4.0 Å. These include 70S and 80S ribosomes in complex with mRNA, tRNA, and various translation factors.
```

## Capabilities

### Determine Molecular Assembly
Identifies if a protein functions as a monomer, dimer, or higher-order complex by analyzing its quaternary structure.

### Map Binding Sites and Ligands
Pinpoints exact residues where small molecule ligands bind to proteins, which is critical for drug design research.

### Assess Structure Reliability
Retrieves global quality metrics like resolution and R-factors, letting you immediately vet a structure's scientific validity.

### Cross-Reference Protein Sequences
Maps residue numbers between general protein sequence databases (UniProt) and the 3D structural annotations (PDB).
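
The idea behind residue mapping can be sketched as a constant offset within each mapped segment. This is a minimal illustration, not the output format of `get_uniprot_mapping`: the `Segment` class and its field names are hypothetical, and it assumes residues inside a segment are numbered contiguously.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Segment:
    """One mapped stretch of a chain (hypothetical, simplified shape)."""
    chain_id: str
    unp_start: int  # first UniProt position covered by the segment
    unp_end: int    # last UniProt position covered by the segment
    pdb_start: int  # PDB residue number aligned to unp_start


def uniprot_to_pdb(position: int, segments: List[Segment]) -> Optional[Tuple[str, int]]:
    """Map a UniProt position to (chain_id, PDB residue number).

    Assumes contiguous numbering inside each segment, so a constant
    offset applies. Returns None when no segment covers the position.
    """
    for seg in segments:
        if seg.unp_start <= position <= seg.unp_end:
            offset = seg.pdb_start - seg.unp_start
            return seg.chain_id, position + offset
    return None


# Illustrative values only: UniProt positions 20-150 map to PDB residues 1-131.
segments = [Segment(chain_id="A", unp_start=20, unp_end=150, pdb_start=1)]
mapped = uniprot_to_pdb(25, segments)
```

Positions outside every segment (for example, a signal peptide that was not crystallized) return `None`, which is usually the case worth checking before annotating a structure with sequence features.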

### Search Structures by Function
Performs full-text searches across the PDBe archive using natural language queries, finding candidates based on method or organism.

## Use Cases

### Evaluating a Novel Inhibitor
A chemist needs to know if their new molecule can bind to the target protein. They use `search_structures` first, then call `get_binding_sites` on promising candidates to see available pockets; finally, they check `get_ligand_monomers` to confirm the type of small molecules that fit.
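
If you drive the tools from a script rather than a chat prompt, the same three-step workflow might look like the sketch below. The `call_tool` callable is a hypothetical stand-in for whatever your MCP client exposes, and the argument names (`query`, `pdb_id`) and the shape of the search result are assumptions, not the connector's documented schema.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed result.
ToolCaller = Callable[[str, Dict[str, Any]], Any]


def survey_binding_sites(call_tool: ToolCaller, query: str, limit: int = 3) -> List[Dict[str, Any]]:
    """Chain search_structures, get_binding_sites, and get_ligand_monomers.

    Assumes search_structures returns a list of dicts with a "pdb_id" key;
    the real response shape may differ.
    """
    hits = call_tool("search_structures", {"query": query})
    report = []
    for hit in hits[:limit]:
        pdb_id = hit["pdb_id"]
        report.append({
            "pdb_id": pdb_id,
            "binding_sites": call_tool("get_binding_sites", {"pdb_id": pdb_id}),
            "ligands": call_tool("get_ligand_monomers", {"pdb_id": pdb_id}),
        })
    return report
```

Passing the caller in as a parameter keeps the workflow independent of any particular client library and makes it easy to test with a stub.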

### Investigating Protein Evolution
A bioinformatician is comparing two related proteins. They use `get_related_entries` to find alternative conformations and then call `get_mutated_residues` to see exactly how the sequence differs from the wild-type.

### Determining Protein Architecture
A structural biologist is trying to understand if a protein acts alone or in groups. They use `get_assemblies` to confirm its quaternary structure, then call `get_secondary_structure` to map out the specific helices and sheets involved.

## Benefits

- You can instantly assess the structural reliability of any candidate using `get_quality_scores`, saving time spent on manual data vetting. Knowing whether a structure is high-resolution or has poor geometry can change your entire hypothesis.
- Drug development gets an edge when you use `get_binding_sites` to pinpoint exactly where potential ligands fit and interact, guiding rational drug design efforts.
- When validating sequence data, the `get_uniprot_mapping` tool eliminates guesswork by creating a reliable cross-reference between general protein databases and 3D residue coordinates.
- If you're studying enzyme function, checking for essential metal ions or cofactors using `get_cofactors` is critical, letting you understand what makes the protein work in vivo.
- The `search_structures` tool lets you skip manual browsing. You just ask for 'cryo-EM structures of ribosome complexes,' and it pulls up all candidates immediately.

## How It Works

You get deep, specific structural knowledge without touching a database console or worrying about endpoint details.

1. Subscribe to this MCP and connect your preferred AI client. No API key is needed because the underlying PDBe data is public.
2. Ask your agent a structural question, like 'What cofactors are bound in 4hhb?' or 'Show me all complexes related to X.'
3. The MCP executes the query against the official PDBe REST API and returns structured biological data directly to your client for analysis.
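
Under the hood, step 2 becomes a tool call from your client to the server. MCP clients send these as JSON-RPC 2.0 `tools/call` requests; the sketch below builds one by hand to show roughly what goes over the wire. The `pdb_id` argument name is an assumption for illustration, since this page does not list each tool's parameters.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "pdb_id" is an assumed parameter name, used here only for illustration.
payload = build_tool_call(1, "get_cofactors", {"pdb_id": "4hhb"})
```

In practice your AI client builds and sends this message for you; the sketch is only meant to show that a natural language question ends up as a named tool plus a small arguments object.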

## Frequently Asked Questions

**How do I use the get_binding_sites tool?**
You call `get_binding_sites` with a PDB ID to retrieve specific residues and interactions that form binding pockets, which is essential for drug discovery.

**What does search_structures do?**
`search_structures` allows you to query the entire database using natural language—for example, 'cryo-EM structures of ribosome complexes'—to find relevant PDB IDs immediately.

**Should I use get_assemblies or get_molecules?**
`get_assemblies` gives you the high-level context (monomer vs. dimer vs. larger complex), while `get_molecules` provides a detailed list of every molecular entity and chain present in the structure.

**Is it possible to map UniProt data with this MCP?**
Yes, you use `get_uniprot_mapping` to generate a cross-reference table that links sequence positions from general databases (UniProt) directly to the residue numbers in the 3D structure.

**How can I verify the reliability of a protein structure using get_quality_scores?**
It returns global metrics like R-factors and resolution. This lets you assess how reliable the structure is, which structural biologists check first before drawing any conclusions.
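
A first-pass vetting rule on those metrics can be sketched in a few lines. The field names (`resolution`, `r_free`) and the cutoffs are illustrative assumptions, not the output schema of `get_quality_scores` or an agreed standard; tune them to your own project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QualityMetrics:
    """Hypothetical container for global quality metrics."""
    resolution: Optional[float]  # in angstroms; None if the method reports none
    r_free: Optional[float]      # fraction between 0 and 1; None if not reported


def vet_structure(m: QualityMetrics,
                  max_resolution: float = 2.5,
                  max_r_free: float = 0.28) -> str:
    """Return 'pass', 'review', or 'unknown' using the assumed cutoffs."""
    if m.resolution is None or m.r_free is None:
        return "unknown"
    if m.resolution <= max_resolution and m.r_free <= max_r_free:
        return "pass"
    return "review"
```

Returning `"unknown"` instead of failing keeps structures without these metrics (NMR entries, for example) in the candidate list for manual inspection.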

**If I need to identify all distinct components in a complex, should I use get_molecules?**
Yes, this tool returns IDs and types for every molecular entity. You get chain assignments, sequence lengths, weights, and source organisms listed across the whole structure.

**Where can I find alternative forms or related structures of a known protein using get_related_entries?**
This tool finds other PDB entries that are structurally linked to your original query. It’s useful for comparing different conformations, mutants, or complexes.

**How do I find essential metal ions or prosthetic groups using get_cofactors?**
It retrieves annotations for cofactors like heme, NAD+, and various metal ions. This tells you exactly which non-protein chemical components are bound to the structure.

**Do I need an API key?**
No. The PDBe API is completely public and requires no authentication. Enter any placeholder value in the API key field to activate the server immediately.
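
To see what "no authentication" means in practice, here is a minimal sketch of a direct request to the public PDBe REST API, with no key or auth header. The base URL and the entry-summary path are given as commonly documented by PDBe; treat them as assumptions and confirm them against the current PDBe API documentation.

```python
import json
import urllib.request

# Assumed base path of the public PDBe REST API; verify against the PDBe docs.
PDBE_BASE = "https://www.ebi.ac.uk/pdbe/api"


def summary_url(pdb_id: str) -> str:
    """Build the entry-summary URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().lower()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    return f"{PDBE_BASE}/pdb/entry/summary/{pdb_id}"


def fetch_summary(pdb_id: str, timeout: float = 10.0) -> dict:
    """Fetch the summary JSON; no API key or auth header is sent."""
    with urllib.request.urlopen(summary_url(pdb_id), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_summary("6lu7")` needs network access; the MCP server makes equivalent requests on your behalf, so you normally never write this yourself.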

**What types of structures are available?**
The PDBe contains over 200,000 experimentally determined 3D structures of proteins, nucleic acids, and complex assemblies. Structures are determined by X-ray crystallography, cryo-electron microscopy (cryo-EM), NMR spectroscopy, and other methods. This includes enzymes, receptors, antibodies, viral proteins, ribosomes, and drug-target complexes.

**Can I find drug binding sites?**
Yes. Use `get_binding_sites` to retrieve all annotated ligand binding pockets with their constituent residues. Combine with `get_ligand_monomers` to identify the small molecules bound in the structure, and `get_cofactors` for prosthetic groups. This workflow is essential for structure-based drug design and virtual screening target preparation.