# Ensembl MCP

> Ensembl MCP gives you direct access to vast genomic data, letting your AI client pull gene trees, alignments, homologies, and cross-references from the Ensembl database in natural conversation.

## Overview
- **Category:** databases
- **Price:** Free
- **Tags:** genomics, bioinformatics, gene-trees, dna-sequencing, biological-data

## Description

Think of this as bypassing weeks of scripting. Instead of writing custom Python or R code just to compare genes across species, you talk to the MCP. You can ask for anything from finding all orthologs (genes in different species that descend from a single ancestral gene) to checking whether two different identifiers refer to the same thing. It's like having a senior bioinformatician sitting next to you who knows every corner of the Ensembl database and can pull up any data point on request, whether that means calculating linkage disequilibrium or just listing the available species. When you connect this MCP through Vinkius, your AI client handles the API calls; you just ask for what you need. It's biological querying without the boilerplate code.

## Tools

### get_alignment
Pulls full genomic sequence alignments for a defined region.

### get_archive_bulk
Finds the newest version of multiple identifiers at once.

### get_archive_id
Determines the current stable version for a single identifier.

### get_ga4gh_beacon
Provides allele information using the GA4GH beacon service.

### search_ga4gh_variants
Searches for genetic variants using the standardized GA4GH schema.

### get_genetree
Retrieves the gene tree structure for a specific identifier.

### get_homology
Gets information about genes that share common ancestry across different species.

### get_info_assembly
Lists all available chromosome assemblies for a given organism.

### get_info_rest
Shows the current version details of the Ensembl REST API.

### get_info_species
Lists every available species and their associated metadata within the database.

### get_ld
Calculates Linkage Disequilibrium (LD) values for a region of DNA.

### get_lookup_bulk
Performs bulk lookups to find information for many identifiers simultaneously.

### get_lookup_id
Finds the corresponding species and database type for a single identifier.

### get_map_cdna
Translates cDNA sequence coordinates back into genomic coordinates.

### get_map
Converts coordinates from one assembly version to another.

### get_ontology_id
Searches for a specific term using its ontological identifier.

### get_overlap_region
Identifies features that overlap a defined genomic region.

### ping
Checks if the entire data service is currently operational.

### get_sequence_id
Requests a sequence (genomic, cDNA, CDS, or protein) by stable identifier.

### get_sequence_region
Retrieves a segment of genomic DNA by specifying start and end points.

### get_taxonomy_id
Searches for biological classification terms using either an ID or name.

### get_variation
Fetches details about genetic variants, including population data and genotypes.

### get_vep_bulk
Predicts the functional consequences for many DNA regions at once.

### get_vep_hgvs
Determines variant consequences using standardized HGVS nomenclature.

### get_vep_id
Predicts variant effects when given a recognized ID like an rsID.

### get_xrefs_id
Pulls external reference links for any Ensembl identifier.

### get_xrefs_symbol
Looks up a common gene symbol and returns all linked Ensembl objects.

## Prompt Examples

**Prompt:** 
```
What is the latest version of the Ensembl identifier ENSG00000139618?
```

**Response:** 
```
I've checked the archive for ENSG00000139618. The latest version is version 11, which is currently active in the latest assembly.
```

**Prompt:** 
```
Find all orthologues for the human gene ENSG00000139618 in mouse.
```

**Response:** 
```
Searching homologies... I found 1 high-confidence orthologue in Mus musculus: ENSMUSG00000041147 (Brca2). Would you like the alignment details?
```

**Prompt:** 
```
List all species currently available in the Ensembl database.
```

**Response:** 
```
I've retrieved the species list. There are over 300 species available, including Homo sapiens, Mus musculus, Danio rerio, and many others. Do you want to filter by a specific taxon?
```

## Capabilities

### Find gene relationships across life
Retrieve complete gene trees and homology data for specific genes or species.

### Pinpoint sequence variations
Determine the functional consequences of genetic variants using standardized notation like HGVS.

### Map coordinates between versions
Convert genomic or cDNA coordinates when switching between different assembly versions.

### Cross-reference identifiers
Link an Ensembl object to external databases using common symbols like BRCA2.

### Analyze large regions of DNA
Get genomic alignments or calculate linkage disequilibrium across specific DNA sequences.

## Use Cases

### Comparing genes between human and mouse
A researcher needs to find all orthologs for a specific human gene in the mouse genome. They ask their agent, which uses `get_homology`, and immediately get a list of high-confidence matches, including necessary alignment details.

### Mapping coordinates across assembly updates
A data scientist has genomic coordinates from an older dataset but needs to run analysis against the latest build. They use `get_map` to convert the old coordinates to the current assembly for their pipeline.

### Investigating a rare variant
A clinician identifies a rare variant by its rsID (e.g., rs12345). They ask their agent, which uses `get_vep_id`, and the MCP returns a detailed report on the predicted functional impact of that variant.

### Getting a full picture of a gene
A student needs to understand every facet of a gene. They ask their agent, which uses `get_sequence_id` to pull the sequence and `get_xrefs_id` to pull the external links connecting that gene to other databases.

## Benefits

- Stop manually writing scripts for comparative genomics. With `get_homology` and `get_genetree`, you simply ask your agent to find gene trees or orthologs, getting the results instantly without needing custom R code.
- Save time validating data integrity. Instead of guessing if an identifier is current, use `get_archive_id` or `get_lookup_bulk` to reliably check the latest version and species for any ID you encounter.
- Handle complex variant analysis efficiently. Need to know what a mutation *does*? Use `get_vep_hgvs`, which takes standardized HGVS notation and immediately gives you the predicted biological consequence, like an amino acid change.
- Cross-reference data without effort. If you have a gene symbol (e.g., BRCA2), use `get_xrefs_symbol` to pull all linked Ensembl objects across different databases in one go.
- Process large datasets in fewer steps. Tools like `get_vep_bulk` let you predict variant effects for multiple regions in a single request, which saves real time in large-scale analyses.

## How It Works

The bottom line is you get complex genomic reports without writing a single script.

1. First, you connect your preferred AI client to this MCP on the Vinkius Marketplace and supply the necessary API configuration.
2. Next, you simply ask your agent a natural language question, like 'What are the homologies for gene X in species Y?'
3. The MCP translates that request into multiple specific database calls, gathers the data (e.g., getting alignments or checking variant features), and hands back the clean result to your agent.
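
The translation in step 3 can be pictured as a lookup from tool name to REST path. The sketch below is only an illustration: the `ENDPOINTS` table and `build_url` helper are hypothetical, the tool-to-path mapping is inferred from the tool names in this listing, and the paths follow the public Ensembl REST API at `https://rest.ensembl.org`. The MCP's actual internals may differ.

```python
from urllib.parse import quote

# Public Ensembl REST server; the MCP's real backend configuration is an assumption.
BASE_URL = "https://rest.ensembl.org"

# Hypothetical tool-to-endpoint table, inferred from the tool names in this listing.
ENDPOINTS = {
    "ping": "/info/ping",
    "get_archive_id": "/archive/id/{id}",
    "get_lookup_id": "/lookup/id/{id}",
    "get_xrefs_symbol": "/xrefs/symbol/{species}/{symbol}",
}


def build_url(tool: str, **params: str) -> str:
    """Fill the tool's path template with percent-encoded parameters."""
    template = ENDPOINTS[tool]
    encoded = {name: quote(value, safe="") for name, value in params.items()}
    return BASE_URL + template.format(**encoded)


print(build_url("get_xrefs_symbol", species="homo_sapiens", symbol="BRCA2"))
# prints https://rest.ensembl.org/xrefs/symbol/homo_sapiens/BRCA2
```

Keeping the mapping in a plain table makes it easy to see which request a given question would trigger, which helps when you want to reproduce a result outside the agent.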

## Frequently Asked Questions

**How do I check if an identifier is current using get_archive_id?**
Run `get_archive_id` with the specific Ensembl identifier you have. The MCP will return the latest stable version number and confirm its status within the most recent assembly.
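
If you post-process the result yourself, a small helper can turn an archive record into a status line. This is a rough sketch: the field names (`id`, `version`, `is_current`, `release`) are assumed from the public Ensembl REST archive response, the `summarize_archive` function is hypothetical, and the sample values are made up for illustration.

```python
def summarize_archive(record: dict) -> str:
    """Turn one archive record into a one-line status message."""
    # Assumes the flag arrives as "1"/"0" (string or integer form),
    # so compare its string form instead of relying on truthiness.
    current = str(record.get("is_current")) == "1"
    status = "current" if current else "retired"
    return (
        f"{record['id']} version {record['version']} "
        f"is {status} (release {record['release']})"
    )


# Illustrative record only; the version and release numbers are made up.
sample = {"id": "ENSG00000139618", "version": 11, "is_current": "1", "release": "100"}
print(summarize_archive(sample))
# prints ENSG00000139618 version 11 is current (release 100)
```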

**What's the difference between get_homology and get_genetree?**
`get_homology` provides detailed information about related genes (orthologs/paralogs). `get_genetree`, however, outputs the actual tree structure showing how those genes are related evolutionarily.

**Can I find external IDs using get_xrefs_symbol?**
Indirectly. If you provide a common gene symbol (like BRCA2), `get_xrefs_symbol` returns the Ensembl objects linked to it. Pass any of those stable IDs to `get_xrefs_id` to pull cross-references to other databases.

**How do I predict variant effects with get_vep_hgvs?**
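Provide the variant in HGVS notation: a reference sequence accession followed by the change (the placeholder NM_00123:c.123A>G shows the general shape). The MCP uses `get_vep_hgvs` to run a consequence prediction and reports the predicted consequence type, such as a missense or synonymous variant.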
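
HGVS strings contain characters such as `>` that need percent-encoding when placed in a URL. The sketch below assumes the tool wraps the public REST path `/vep/:species/hgvs/:hgvs_notation` and that each result carries a `most_severe_consequence` field. The helper names are hypothetical, and the sample response is a made-up fragment built around the placeholder notation from the answer above.

```python
from urllib.parse import quote

BASE_URL = "https://rest.ensembl.org"


def vep_hgvs_url(species: str, hgvs: str) -> str:
    """Build a VEP-by-HGVS URL, percent-encoding characters such as '>'."""
    return f"{BASE_URL}/vep/{quote(species, safe='')}/hgvs/{quote(hgvs, safe=':')}"


def most_severe(results: list[dict]) -> list[str]:
    """Collect the most severe consequence term from each result that has one."""
    return [
        r["most_severe_consequence"]
        for r in results
        if "most_severe_consequence" in r
    ]


url = vep_hgvs_url("human", "NM_00123:c.123A>G")
# Made-up response fragment, trimmed to the one field used here.
sample = [{"input": "NM_00123:c.123A>G", "most_severe_consequence": "missense_variant"}]
print(url)
print(most_severe(sample))
```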

**What is the purpose of get_lookup_bulk?**
`get_lookup_bulk` lets you check multiple identifiers at once. Instead of running individual lookups for 20 genes, you input all 20 and get a comprehensive report back.
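
Under the hood, a bulk lookup is typically a single POST with a JSON list of identifiers. The following sketch prepares such requests without sending them. It assumes the public REST endpoint `POST /lookup/id` with an `{"ids": [...]}` body; the `bulk_lookup_requests` helper and its `batch_size` default of 50 are hypothetical choices for illustration, not documented limits.

```python
import json
from urllib.request import Request

BASE_URL = "https://rest.ensembl.org"


def chunked(ids: list[str], size: int) -> list[list[str]]:
    """Split the identifier list into batches of at most `size` items."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def bulk_lookup_requests(ids: list[str], batch_size: int = 50) -> list[Request]:
    """Prepare (but do not send) one POST request per batch of identifiers."""
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    return [
        Request(
            f"{BASE_URL}/lookup/id",
            data=json.dumps({"ids": batch}).encode("utf-8"),
            headers=headers,
            method="POST",
        )
        for batch in chunked(ids, batch_size)
    ]


prepared = bulk_lookup_requests(["ENSG00000139618", "ENSMUSG00000041147"])
print(len(prepared), prepared[0].get_method())
# prints 1 POST
```

Each prepared request could then be passed to `urllib.request.urlopen`; batching keeps any single request body small if you have hundreds of identifiers.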

**How do I check if the Ensembl API is running correctly using ping?**
The `ping` tool confirms immediate service availability. It simply sends a health check request to verify that the MCP connection and the underlying Ensembl REST API are online and accepting commands.

**What is the difference between getting coordinates with get_map versus using get_map_cdna?**
Use `get_map` when you need to convert genomic coordinates from one assembly version to another. `get_map_cdna`, by contrast, converts cDNA coordinates into their corresponding genomic locations.
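
The two tools take differently shaped inputs, which a pair of URL builders makes concrete. This sketch assumes they wrap the public REST paths `/map/:species/:asm_one/:region/:asm_two` and `/map/cdna/:id/:region`. The function names are hypothetical, and the coordinates and transcript ID are illustrative values only.

```python
BASE_URL = "https://rest.ensembl.org"


def region(chrom: str, start: int, end: int, strand: int) -> str:
    """Format a genomic region as chrom:start..end:strand."""
    return f"{chrom}:{start}..{end}:{strand}"


def assembly_map_url(species: str, source_asm: str, target_asm: str,
                     chrom: str, start: int, end: int, strand: int = 1) -> str:
    """Genomic coordinates on one assembly, mapped to another assembly."""
    return (
        f"{BASE_URL}/map/{species}/{source_asm}/"
        f"{region(chrom, start, end, strand)}/{target_asm}"
    )


def cdna_map_url(transcript_id: str, start: int, end: int) -> str:
    """Positions within a transcript's cDNA, mapped to genomic coordinates."""
    return f"{BASE_URL}/map/cdna/{transcript_id}/{start}..{end}"


print(assembly_map_url("human", "GRCh37", "GRCh38", "X", 1000000, 1000100))
print(cdna_map_url("ENST00000288602", 100, 300))  # illustrative transcript ID
```

Note that the assembly mapping needs a chromosome and two assembly names, while the cDNA mapping needs only a transcript ID and positions relative to that transcript.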

**How do I get metadata for all available species using get_info_species?**
Run `get_info_species` to list every organism supported by the Ensembl database along with its metadata. This lets you confirm which taxa are available before attempting specific analyses like homology lookups.

**How can I find orthologs for a specific gene across different species?**
Use the `get_homology` tool by providing the species name and the Ensembl gene ID. You can filter by type (e.g., 'orthologues') to see related genes in other organisms.

**Can I retrieve the evolutionary gene tree for a specific identifier?**
Yes! The `get_genetree` tool allows you to fetch the gene tree for any stable Ensembl ID, with options for alignment and sequence types (protein or cdna).

**How do I map a common gene symbol like 'BRCA2' to its Ensembl ID?**
Use the `get_xrefs_symbol` tool. Provide the species (e.g., 'human') and the symbol 'BRCA2' to retrieve all linked Ensembl objects and their stable identifiers.
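
A symbol can be linked to more than one kind of Ensembl object, so it helps to filter the result by type. The sketch below assumes the response is a list of records with `id` and `type` fields, as in the public REST `xrefs/symbol` endpoint. The `gene_ids` helper is hypothetical, and the sample list is illustrative, with a placeholder transcript ID.

```python
def gene_ids(xrefs: list[dict]) -> list[str]:
    """Keep only the stable IDs of records whose type is 'gene'."""
    return [x["id"] for x in xrefs if x.get("type") == "gene"]


# Illustrative response; the transcript ID is a placeholder, not a real record.
sample = [
    {"id": "ENSG00000139618", "type": "gene"},
    {"id": "ENST00000000000", "type": "transcript"},
]
print(gene_ids(sample))
# prints ['ENSG00000139618']
```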