# KEGG MCP

> KEGG MCP connects your AI agent directly to the Kyoto Encyclopedia of Genes and Genomes (KEGG). Query biological data, including metabolic pathways, gene sequences, chemical compounds, and drug interactions, in natural language and without manual database calls.

## Overview
- **Category:** databases
- **Price:** Free
- **Tags:** bioinformatics, genomics, metabolic-pathways, drug-interactions, molecular-biology

## Description

Need to map out a signaling cascade or check how two drugs might clash? This MCP gives your AI agent direct access to KEGG, one of the most widely used biological databases. Instead of writing REST API calls against multiple endpoints, you ask your client a question about genes, pathways, or chemistry. For example, if you're investigating drug candidates, you can check for known adverse drug-drug interactions before running any wet lab work. The tools cover everything from looking up gene identifiers to converting them to and from external databases like UniProt and NCBI.

When your workflow hits a biological data wall (say, you need metabolic pathway details that aren't in the standard literature), your agent can pull the information straight from KEGG. Because Vinkius hosts the full set of tools, connecting KEGG covers your core systems biology lookups from one place.

## Tools

### kegg_conv
Converts a gene or compound identifier between KEGG and external databases like NCBI.

### kegg_ddi
Searches for known adverse drug-drug interactions using chemical names or IDs.

### kegg_find
Queries the database to locate entries based on general keywords or specific chemical data points.

### kegg_get
Retrieves full, detailed records for a specified gene or pathway in various file formats.

### kegg_info
Displays system-level statistics and release information for the KEGG database itself.

### kegg_link
Identifies related biological entries by following cross-references between KEGG databases, for example from a gene to the pathways it belongs to.

### kegg_list
Generates a comprehensive list of available identifiers, such as all known organisms or pathways.
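
For orientation, the tool names above line up with the operations of the public KEGG REST API (`info`, `list`, `find`, `get`, `conv`, `link`, `ddi`). The sketch below shows the kind of URL each tool saves you from writing by hand. The one-to-one mapping and the `rest_url` helper are assumptions made for illustration; how the MCP reaches KEGG internally is not documented here.

```python
from urllib.parse import quote

KEGG_REST_BASE = "https://rest.kegg.jp"

# Assumed one-to-one mapping from MCP tool name to KEGG REST operation.
TOOL_TO_OPERATION = {
    "kegg_info": "info",
    "kegg_list": "list",
    "kegg_find": "find",
    "kegg_get": "get",
    "kegg_conv": "conv",
    "kegg_link": "link",
    "kegg_ddi": "ddi",
}


def rest_url(tool: str, *parts: str) -> str:
    """Build the KEGG REST URL that would correspond to a tool call (hypothetical helper)."""
    operation = TOOL_TO_OPERATION[tool]
    # Keep ':' and '+' literal: KEGG uses them inside identifiers and ID lists.
    path = "/".join(quote(part, safe=":+") for part in parts)
    return f"{KEGG_REST_BASE}/{operation}/{path}"


print(rest_url("kegg_list", "organism"))
print(rest_url("kegg_find", "compound", "glucose"))
print(rest_url("kegg_get", "hsa:10458"))
```

With the MCP connected, the agent chooses the tool and its arguments for you, so none of this URL handling appears in your own workflow.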

## Prompt Examples

**Prompt:** 
```
List all available organisms in the KEGG database.
```

**Response:** 
```
I'll use `kegg_list` with the 'organism' parameter. I've found thousands of entries, including 'hsa' (Homo sapiens), 'mmu' (Mus musculus), and 'eco' (Escherichia coli). Which one would you like to explore?
```

**Prompt:** 
```
Search for compounds with the keyword 'glucose' in the KEGG database.
```

**Response:** 
```
Running `kegg_find` for 'glucose' in the compound database... I found several entries, including C00031 (D-Glucose) and C00221 (beta-D-Glucose). Would you like the full details for C00031?
```

**Prompt:** 
```
Get the full entry details for the human gene hsa:10458.
```

**Response:** 
```
Fetching data using `kegg_get` for 'hsa:10458'... This gene corresponds to BAIAP2 (BAR/IMD domain containing adaptor protein 2). It appears in pathways such as Regulation of actin cytoskeleton (hsa04810).
```

## Capabilities

### Map biological pathways
Find related entries and visualize how genes link up into larger metabolic or signaling networks.

### Analyze drug interactions
Identify specific adverse reactions between drugs, which is critical for pharmacology research.

### Retrieve gene details
Fetch detailed records on genes, proteins, and associated organisms from the KEGG database.

### Search chemical data
Find entries for compounds, glycans, or drugs using keywords or specific formulas.

### Standardize identifiers
Convert a gene ID from one database (like NCBI) into another required format for analysis.

## Use Cases

### Checking a drug pair for interactions
A pharmacologist needs to know whether two drugs interact adversely. They prompt their agent: 'Check for interactions between Drug A and Drug B.' The agent calls `kegg_ddi` and returns the known adverse interaction details, which can save hours of literature review.

### Looking up an unfamiliar gene ID
A researcher comes across a gene ID they don't recognize. They first run `kegg_list` to check that the organism is supported, then call `kegg_get` with the ID to retrieve the associated protein and metabolic pathway data.

### Mapping a novel metabolism
A systems biologist wants to see how an enzyme fits into known pathways. They use `kegg_link` to find related entries, then run `kegg_find` with the compound name to validate its inclusion in the pathway model.

### Cross-referencing datasets
A data scientist has a list of gene IDs from UniProt. They use `kegg_conv` first to standardize those IDs into KEGG format, allowing them to then pass the clean list to `kegg_get` for bulk data retrieval.
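
A minimal sketch of the glue code for this workflow, assuming `kegg_conv` returns KEGG's usual two-column, tab-separated text (external ID, then KEGG ID) and that `kegg_get` inherits the KEGG REST limit of 10 entries per request. Both points are assumptions about the MCP, and `parse_conv` and `batches` are hypothetical helper names. The sample uses NCBI Gene IDs, which coincide with KEGG's human gene numbers; UniProt lines would have the same two-column shape.

```python
from typing import Dict, Iterator, List


def parse_conv(text: str) -> Dict[str, str]:
    """Map each external ID to its KEGG ID from two-column, tab-separated lines."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        external_id, kegg_id = line.split("\t")
        mapping[external_id] = kegg_id
    return mapping


def batches(ids: List[str], size: int = 10) -> Iterator[str]:
    """Yield '+'-joined groups of IDs, with at most `size` IDs per group."""
    for start in range(0, len(ids), size):
        yield "+".join(ids[start:start + size])


# Illustrative input, not a captured response.
sample = "ncbi-geneid:10458\thsa:10458\nncbi-geneid:2182\thsa:2182\n"
kegg_ids = list(parse_conv(sample).values())
print(list(batches(kegg_ids)))  # ['hsa:10458+hsa:2182']
```

Batching keeps each retrieval small, which helps whether or not the server enforces a per-request cap.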

## Benefits

- You stop writing custom scripts to manage cross-database IDs. Use `kegg_conv` to automatically convert identifiers between KEGG, NCBI, and UniProt in a single chat command.
- Drug interaction checks get faster. Instead of consulting multiple pharmacology references, call `kegg_ddi` to look up known adverse interactions for drugs cataloged in KEGG.
- Instead of browsing complex web interfaces, you get structured data on demand. Use `kegg_get` to pull full details about a specific gene or pathway right into your analysis context.
- Build out your knowledge base with precision. Run `kegg_list` when you need an inventory—for example, listing every known organism the database tracks for comparison.
- Map entire biological systems without guessing connections. The `kegg_link` tool finds related entries automatically, helping trace metabolic pathways that might otherwise be missed.

## How It Works

You get structured biological data from KEGG returned directly in your chat interface, with no manual API integration steps.

1. Connect your AI client to the KEGG MCP through Vinkius.
2. Ask your agent a direct question, like 'What are the metabolic pathways linked to human gene X?'
3. The agent uses the appropriate tool internally and returns the structured biological data directly in the conversation.
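
Under the hood, step 3 is an MCP tool invocation. In the Model Context Protocol, a client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a message; the `database` argument name for `kegg_list` is a guess, since the tool's actual input schema is not listed on this page.

```python
import json
from typing import Any, Dict


def tool_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical argument name: `database` is an assumption, not a documented parameter.
print(tool_call_request(1, "kegg_list", {"database": "organism"}))
```

Your AI client normally constructs and sends this message for you, so you only see the question and the returned data.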

## Frequently Asked Questions

**How do I find out what organisms are in KEGG using kegg_list?**
You run `kegg_list` and specify 'organism' as the parameter. This gives you a list of all available biological species tracked by the database, like Homo sapiens or Mus musculus.

**Can I use kegg_ddi to check drug interactions for something not in KEGG?**
No. `kegg_ddi` only covers drugs that are already cataloged in KEGG. If you're unsure whether a compound has an entry, search for it first with `kegg_find`.

**What do I use kegg_conv for?**
`kegg_conv` handles identifier translation. You pass it an ID from one system, and it returns the corresponding ID used by KEGG or by the external database you need.

**Does kegg_get give me all possible data formats?**
Not every format. `kegg_get` retrieves detailed records for a gene or pathway in the structured file formats the tool supports, which your agent can ingest directly.

**How do I check the database status or release version using kegg_info?**
The `kegg_info` tool displays system metadata, including the current database release date and entry statistics. Use it to confirm which KEGG release your results come from.

**What is the purpose of using kegg_link to find related entries?**
`kegg_link` finds associated data points using cross-references across different biological systems. You use it specifically when you need to map relationships, like linking a human gene to its relevant metabolic pathway.

**Can I use kegg_find for chemical structures or just general keywords?**
`kegg_find` is flexible; it accepts multiple inputs. You can search using general keywords, specific formulas, or even molecular masses to pinpoint matching entries across various compound databases.

**Does kegg_get provide raw data formats for genes and proteins?**
Yes, `kegg_get` retrieves comprehensive database entries in structured or flat-file formats. This allows you to pull detailed gene sequences or full protein information directly into your analysis pipeline.

**How can I find all metabolic pathways associated with a specific human gene?**
You can use the `kegg_link` tool. Specify 'pathway' as the target_db and the human gene ID (e.g., 'hsa:10458') as the source_db to retrieve all linked biological pathways.
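
A short sketch of that call and of reading its result, assuming the tool returns KEGG's tab-separated link lines (source entry, then target entry with a `path:` prefix). The helper names are hypothetical and the sample text is illustrative input, not a captured response.

```python
from typing import Dict, List


def link_arguments(gene_id: str) -> Dict[str, str]:
    """Arguments for `kegg_link`, using the parameter names from the answer above."""
    return {"target_db": "pathway", "source_db": gene_id}


def pathway_ids(link_text: str) -> List[str]:
    """Extract pathway IDs from two-column link lines, dropping the 'path:' prefix."""
    ids: List[str] = []
    for line in link_text.splitlines():
        fields = line.split("\t")
        if len(fields) == 2 and fields[1].startswith("path:"):
            ids.append(fields[1][len("path:"):])
    return ids


# Illustrative input only.
sample = "hsa:10458\tpath:hsa04520\nhsa:10458\tpath:hsa04810\n"
print(link_arguments("hsa:10458"))
print(pathway_ids(sample))
```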

**Can I search for chemical compounds using an exact molecular mass?**
Yes! Use the `kegg_find` tool with the database set to 'compound', the mass value as the query, and the option set to 'exact_mass'.
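
As a rough sketch, the arguments for such a search could be assembled as shown below. The key names `database`, `query`, and `option` follow the wording of the answer above but are assumptions about the tool's schema. The range form (`low-high`) mirrors what the KEGG REST API accepts for mass searches; whether this MCP passes ranges through is also an assumption.

```python
from typing import Dict, Optional


def mass_query(low: float, high: Optional[float] = None) -> Dict[str, str]:
    """Build `kegg_find` arguments for a single exact mass or a low-high range."""
    query = f"{low}" if high is None else f"{low}-{high}"
    return {"database": "compound", "query": query, "option": "exact_mass"}


print(mass_query(174.05))
print(mass_query(174.05, 174.15))
```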

**Is it possible to check for interactions between multiple drugs at once?**
Absolutely. Use the `kegg_ddi` tool and provide the drug identifiers separated by a '+' sign (e.g., 'D00564+D00017') to find known adverse interactions.
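
A small sketch for building that '+'-separated string from a list. It covers only the identifier form and checks that each item looks like a KEGG drug ID (the letter D followed by five digits) before joining. The `ddi_query` helper is hypothetical, and the MCP may accept other inputs, such as drug names, that this check would reject.

```python
import re
from typing import Iterable

# KEGG drug identifiers: 'D' followed by five digits, e.g. D00564.
DRUG_ID = re.compile(r"D\d{5}")


def ddi_query(drug_ids: Iterable[str]) -> str:
    """Join KEGG drug IDs with '+', rejecting anything that is not a D-number."""
    ids = list(drug_ids)
    bad = [drug_id for drug_id in ids if not DRUG_ID.fullmatch(drug_id)]
    if bad:
        raise ValueError(f"not KEGG drug identifiers: {bad}")
    return "+".join(ids)


print(ddi_query(["D00564", "D00017"]))  # D00564+D00017
```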