# GenBank/NCBI API MCP

> GenBank/NCBI API gives your agent direct access to genomic data from NCBI's authoritative databases, so you can audit sequences, query protein metadata, and look up biological records. The agent runs searches across nuccore, protein, and other databases and returns UIDs and technical summaries, so you never have to visit a bioinformatics portal.

## Overview
- **Category:** artificial-intelligence
- **Price:** Free
- **Tags:** genomics, bioinformatics, protein-sequences, molecular-biology, data-retrieval, research-automation

## Description

Working with molecular biology data means juggling dozens of specific identifiers, each of which has to be verified against its source, and that checking can eat hours before a project is even scoped. With this MCP, your AI agent handles it conversationally. You ask for what you need, for example: 'What are the metadata and UIDs for human insulin in the protein database?' The agent runs the search against NCBI's databases, retrieves sequence details, checks the status of the API, and helps you determine which databases contain the information you need before you run a search. You get data straight from NCBI without manually navigating a bioinformatics portal. Connect this MCP through Vinkius, and your agent acts like an on-call genomic consultant, structuring raw biological research into actionable intelligence.

## Tools

### check_api_status
Confirms whether the NCBI E-utilities service is currently operational and available for use.

### get_ncbi_summary
Retrieves detailed technical and bibliographic summaries based on a provided NCBI sequence ID.

### list_ncbi_databases
Outputs a list of all the NCBI databases that are available for querying, helping you scope your project.

### search_ncbi_sequences
Searches for biological sequences across specified NCBI databases using a simple query term.

## Prompt Examples

**Prompt:** 
```
Search for 'human insulin' in the 'protein' database using NCBI.
```

**Response:** 
```
I've retrieved several matching records for human insulin! Notable UIDs include [UID 1] and [UID 2]. Would you like the full technical summary or the bibliographic metadata for any of these identifiers?
```
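
For readers curious what a search like this might look like underneath, here is a minimal Python sketch against NCBI's public ESearch endpoint. How this MCP implements `search_ncbi_sequences` is not documented here, so the helper names (`build_esearch_url`, `extract_uids`) and the sample payload are illustrative assumptions; only the ESearch URL, its `db`/`term`/`retmax`/`retmode` parameters, and the `esearchresult.idlist` response field come from E-utilities itself.

```python
import json
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch request URL that asks for a JSON response."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def extract_uids(response_text: str) -> list:
    """Pull the list of UIDs out of an ESearch JSON response body."""
    payload = json.loads(response_text)
    return payload.get("esearchresult", {}).get("idlist", [])


# Illustrative response body with made-up UIDs (not real NCBI records).
sample_response = '{"esearchresult": {"count": "2", "idlist": ["111", "222"]}}'

url = build_esearch_url("protein", "human insulin")
uids = extract_uids(sample_response)
print(url)
print(uids)
```

The sketch only builds the URL and parses a canned response, so it runs offline; fetching the URL with any HTTP client would return a body of the same shape.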

**Prompt:** 
```
Get the summary for NCBI UID '123456' in 'nuccore'.
```

**Response:** 
```
I've retrieved the summary for UID 123456! It is a nucleotide sequence titled [Title]. I can provide the publication date and caption metadata to help you identify it in your workflow.
```
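
A summary lookup of this kind could map onto NCBI's ESummary endpoint. The sketch below is an assumption about the underlying call, not this MCP's actual code: `build_esummary_url` and `summarize_record` are hypothetical helpers, the sample record is placeholder data, and the field names (`caption`, `title`, `createdate`) should be checked against a real response for the database you query.

```python
import json
from urllib.parse import urlencode

ESUMMARY_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def build_esummary_url(db: str, uid: str) -> str:
    """Build an ESummary request URL for one UID, asking for JSON."""
    params = {"db": db, "id": uid, "retmode": "json"}
    return f"{ESUMMARY_URL}?{urlencode(params)}"


def summarize_record(response_text: str, uid: str) -> dict:
    """Return a few fields for one UID, or an empty dict if it is absent."""
    record = json.loads(response_text).get("result", {}).get(uid)
    if record is None:
        return {}
    wanted = ("uid", "caption", "title", "createdate")
    return {key: record.get(key) for key in wanted}


# Placeholder payload: the values are invented for illustration only.
sample_summary = json.dumps({
    "result": {
        "uids": ["123456"],
        "123456": {
            "uid": "123456",
            "caption": "EXAMPLE1",
            "title": "Placeholder title",
            "createdate": "2000/01/01",
        },
    }
})

summary_url = build_esummary_url("nuccore", "123456")
summary = summarize_record(sample_summary, "123456")
print(summary_url)
print(summary)
```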

**Prompt:** 
```
List all available NCBI databases.
```

**Response:** 
```
I've scanned the database catalog from NCBI! There are over 30 biological databases available, including PubMed, Protein, and Nucleotide. I can help you search for specific records in any of these thematic clusters.
```
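
The database catalog is available from NCBI's EInfo endpoint, which returns the database names under `einforesult.dblist` when called with `retmode=json`. The following sketch assumes `list_ncbi_databases` wraps something similar; the helper names and the shortened sample list are illustrative, not taken from this MCP.

```python
import json

EINFO_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?retmode=json"


def extract_databases(response_text: str) -> list:
    """Return the database names from an EInfo JSON response, sorted."""
    payload = json.loads(response_text)
    return sorted(payload.get("einforesult", {}).get("dblist", []))


def has_database(databases: list, name: str) -> bool:
    """Check whether a database name is in the catalog (case-insensitive)."""
    return name.lower() in {db.lower() for db in databases}


# Shortened sample: a real EInfo response lists many more databases.
sample_einfo = '{"einforesult": {"dblist": ["pubmed", "protein", "nuccore"]}}'

databases = extract_databases(sample_einfo)
print(databases)
print(has_database(databases, "Protein"))
```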

## Capabilities

### Check API operational status
Verifies if the NCBI E-utilities service is currently running.

### List available databases
Displays a catalog of all NCBI databases, letting you know where to look for specific types of biological data.

### Search genomic sequences by term
Finds biological sequences across specified databases using simple search terms like 'human insulin'.

### Retrieve sequence metadata summary
Pulls the full technical and bibliographic details for a specific NCBI sequence ID.

## Use Cases

### Scope a new project
A geneticist needs to know if they should look at protein data or nucleotide sequences. They ask their agent, which first runs `list_ncbi_databases` to show the full spectrum of available sources (like 'nuccore' and 'protein'), letting them immediately narrow down their focus.

### Verify a collaborator-provided ID
A researcher receives a sequence ID from a collaborator. Instead of spending time cross-checking, they ask the agent to run `get_ncbi_summary` using that specific identifier, receiving the full title and publication date in seconds.

### Mass search for markers
A bioinformatician needs to find all instances of a specific enzyme across multiple databases. They instruct their agent to run `search_ncbi_sequences` using the enzyme name, which returns multiple UIDs they can then process further.

### Pre-flight system check
A team lead is about to start a large batch audit. They first trigger `check_api_status`. If the API is down, they hold the batch until it recovers, avoiding hours of failed searches and incomplete result sets.
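
The pre-flight pattern can be sketched as a gate in front of the batch. This is a hypothetical illustration: `run_batch`, `is_api_up`, and `search` are made-up names standing in for whatever your agent or script uses to call `check_api_status` and `search_ncbi_sequences`.

```python
from typing import Callable, Iterable


def run_batch(
    terms: Iterable[str],
    is_api_up: Callable[[], bool],
    search: Callable[[str], list],
) -> dict:
    """Run `search` for each term, but only if the status check passes first."""
    if not is_api_up():
        raise RuntimeError("NCBI E-utilities reported as unavailable; batch not started")
    return {term: search(term) for term in terms}


# Stand-ins for the real tool calls, so the sketch runs offline.
def fake_status_up() -> bool:
    return True


def fake_search(term: str) -> list:
    return ["111"]  # made-up UID


results = run_batch(["human insulin"], is_api_up=fake_status_up, search=fake_search)
print(results)
```

Failing before the first search keeps a partial result set from being mistaken for a complete one.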

## Benefits

- See the full range of sources up front. Use `list_ncbi_databases` to list every available database, from 'nuccore' to 'protein', before you write a single search query.
- Skip manual data gathering steps. Instead of copying IDs and pasting them into separate forms, ask your agent for the technical summary using `get_ncbi_summary` and get the full context immediately.
- Keep your workflow running smoothly. Before starting any major audit, use `check_api_status` to confirm that the NCBI service is not down or throttling requests.
- Find needles in haystacks faster. Use `search_ncbi_sequences` to pull matching records for terms like 'human insulin,' bypassing the need to manually build complex search filters.
- Manage data integrity across platforms. Your agent handles cross-referencing UIDs and caption details, ensuring that the metadata you use is always official and verifiable.

## How It Works

The bottom line is you get verified genomic intelligence, delivered via natural language conversation.

1. Connect your AI client to this MCP. You don't need an API key because the underlying service is open and free.
2. Ask your agent to perform initial checks, such as listing all available NCBI databases or verifying the status of the connection.
3. The system executes a multi-step query using these tools and returns structured data—like UIDs or technical summaries—ready for your workflow.
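
Under the Model Context Protocol, a client invokes a tool by sending a JSON-RPC `tools/call` request. The sketch below shows what such a message could look like for `search_ncbi_sequences`. The argument names `database` and `term` are assumptions made for illustration; the real input schema is whatever the server reports through `tools/list`.

```python
import json


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical arguments: check the tool's published schema before relying on them.
request = make_tool_call(
    1,
    "search_ncbi_sequences",
    {"database": "protein", "term": "human insulin"},
)
print(request)
```

In practice your AI client builds and sends these messages for you; the sketch is only meant to show what step 3 amounts to on the wire.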

## Frequently Asked Questions

**How do I list all possible NCBI databases using GenBank/NCBI API?**
Run the `list_ncbi_databases` tool. It returns a comprehensive catalog of every available database, so you know exactly where to look for your data.

**I found an ID; how do I get its full details using GenBank/NCBI API?**
Use the `get_ncbi_summary` tool. You just feed it the specific sequence ID, and it returns the technical title, publication date, and caption metadata.

**Can `search_ncbi_sequences` search more than one database at a time?**
Yes. The `search_ncbi_sequences` tool accepts multiple database targets in its query, allowing you to search for a term across 'nuccore' and 'protein' simultaneously.

**What if I want to check the service before running a large audit?**
Use `check_api_status`. This tool immediately tells you if the NCBI E-utilities service is operational, saving you from hours of failed searches due to temporary outages.

**Do I need an API key or special credentials to use the GenBank/NCBI API?**
No, you don't need any keys. This MCP connects directly to NCBI's E-utilities, which is a free and open service. You just connect your AI agent to the Vinkius marketplace.

**What should I do if my high-volume research workflow encounters rate limits using search_ncbi_sequences?**
The underlying service is robust, but large batches can hit usage limits: NCBI asks clients without an API key to stay at or below 3 E-utilities requests per second. For stable data retrieval, break your queries into smaller chunks and pace them, or schedule automated checks with `check_api_status` to plan around potential throttling.
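
One way to chunk and pace a large batch is sketched below. The helper names are hypothetical, and the default pause of 0.34 seconds is simply derived from NCBI's guideline of 3 requests per second without an API key; adjust it to whatever limit applies to your setup.

```python
import time
from typing import Callable, Iterable, Iterator, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[list]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_throttled(
    chunks: Iterable[list],
    handler: Callable[[list], object],
    min_interval: float = 0.34,
    sleep: Callable[[float], object] = time.sleep,
) -> list:
    """Call `handler` on each chunk, pausing `min_interval` seconds between calls."""
    results = []
    for index, chunk in enumerate(chunks):
        if index > 0:
            sleep(min_interval)  # pause before every call except the first
        results.append(handler(chunk))
    return results


# Offline demo: made-up UIDs, and the pauses are recorded instead of slept.
uid_batch = ["1", "2", "3", "4", "5"]
pauses = []
batches = run_throttled(
    chunked(uid_batch, 2),
    handler=lambda chunk: ",".join(chunk),
    sleep=pauses.append,
)
print(batches)
print(pauses)
```

Passing `sleep` as a parameter keeps the pacing logic testable without real delays; leave it at the default `time.sleep` for actual runs.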

**How can I refine results after performing a broad search with search_ncbi_sequences?**
Include specific filters directly in the query term. Instead of a broad term such as 'insulin', add an organism, a gene or protein name, or another functional keyword, and name the target database explicitly (for example 'protein' rather than 'nuccore').
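
NCBI's Entrez query syntax supports field tags such as `[Organism]` and `[Title]`, combined with Boolean operators. The sketch below builds such a term; it assumes that `search_ncbi_sequences` passes the query term through to NCBI unchanged, which this listing does not confirm, and the helper names are hypothetical.

```python
def field_clause(value: str, field: str) -> str:
    """Format one clause as value[Field], quoting multi-word values."""
    quoted = f'"{value}"' if " " in value else value
    return f"{quoted}[{field}]"


def build_query(clauses: list) -> str:
    """Join (value, field) pairs into a single AND-ed Entrez query term."""
    return " AND ".join(field_clause(value, field) for value, field in clauses)


refined_term = build_query([("insulin", "Title"), ("Homo sapiens", "Organism")])
print(refined_term)
```

The resulting string can be supplied as the search term in place of a bare keyword.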

**Is the GenBank/NCBI API compatible with different AI clients and development environments?**
Yes, this MCP is designed to connect through any platform that supports the Model Context Protocol. You can use it in Claude, Cursor, Windsurf, or any other compliant agent.