PubChem MCP. Structured Molecular Data Retrieval, Period.

PubChem connects your AI agent to the world's largest open chemistry database, containing over 116 million compound records. This server lets you search for chemicals using common names, IUPAC nomenclature, or molecular formulas; it also retrieves deep data like SMILES strings, molecular weight (MW), and XLogP scores directly into your workflow.

What your AI agents can do

get_pubchem_compound

Retrieves full molecular data (formula, weight, SMILES, and more) for a specific PubChem Compound ID (CID).

search_pubchem

Searches for chemical compounds using common names or synonyms and returns key identifiers like MW and XLogP.

search_pubchem_formula

Finds all matching compounds when given a specific molecular formula (e.g., C8H10N4O2).

Search by Common Name

Find chemical compounds using their common names or synonyms.

Search by Molecular Formula

Identify all known compounds that match a specific molecular formula (e.g., C8H10N4O2).

Retrieve Full Compound Data

Pull detailed chemical information, including SMILES notation and physicochemical properties, using the PubChem Compound ID.

Supported MCP Clients

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

+ other MCP clients

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with PubChem, then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 4,700+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

What you can do with this MCP connector

This server connects your AI client directly to PubChem, the largest open chemistry database, with over 116 million compound records. You don't need an API key to get at this data; you just use the tools listed here.

How it works: your agent calls one of these functions and gets structured molecular data back, ready for your workflow. No wading through web pages; you get what you need directly.

When you need to find a compound by its common name or any known synonym, use the search_pubchem tool. It takes a text identifier, like 'aspirin' or an IUPAC name, and returns key data points for matching compounds, including molecular weight (MW) and XLogP.

If a name won't cut it, use the search_pubchem_formula tool. Give it a molecular formula, like C8H10N4O2, and the server identifies all known compounds with that exact composition. It's how you narrow the haystack when all you know is the atom counts.
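To make "exact composition" concrete, here is a minimal Python sketch that splits a flat formula string into per-element atom counts, which is the information a formula search matches on. The `parse_formula` helper is hypothetical and not part of the server; it only handles simple formulas without parentheses, charges, or isotopes, and it does not check that the element symbols are real.

```python
import re
from collections import Counter

# Hypothetical helper, not part of the MCP server: splits a flat molecular
# formula such as "C8H10N4O2" into per-element atom counts.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")

def parse_formula(formula: str) -> dict[str, int]:
    counts = Counter()
    pos = 0
    while pos < len(formula):
        match = _TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}: {formula!r}")
        element, digits = match.groups()
        # A missing count means a single atom, as in the "O" of "H2O".
        counts[element] += int(digits) if digits else 1
        pos = match.end()
    if not counts:
        raise ValueError("empty formula")
    return dict(counts)

print(parse_formula("C8H10N4O2"))
```

A check like this can catch a mistyped formula before the agent spends a tool call on it.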

When you already have a PubChem Compound ID (CID), you don't want to waste time searching; you want all the details on that molecule. That's where get_pubchem_compound comes in. You feed it the CID, and it pulls back the deep molecular data: the molecular formula, the molecular weight, the SMILES notation (the structure string), and more physicochemical properties.

It's a complete profile for that compound.
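Under the Model Context Protocol, a client invokes a tool by sending a JSON-RPC `tools/call` request. The sketch below builds one such request per tool. The tool names come from this page; the argument keys (`name`, `formula`, `cid`) are assumptions, since the server's input schemas are not shown here. A real client should read them from the server's `tools/list` response.

```python
import itertools
import json

_ids = itertools.count(1)

def tool_call(tool: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC request for the given tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

# Argument keys below are assumed for illustration only.
example_requests = [
    tool_call("search_pubchem", {"name": "aspirin"}),
    tool_call("search_pubchem_formula", {"formula": "C8H10N4O2"}),
    tool_call("get_pubchem_compound", {"cid": 2519}),
]
for request in example_requests:
    print(json.dumps(request))
```

In practice your MCP client library builds these messages for you; the point is only to show how little each call needs.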

If your workflow needs to look compounds up in different ways, you don't need multiple integrations or complex database lookups. You just use the specialized tools here. Know the common name? Use search_pubchem and get MW and XLogP right away. Want to check a formula against thousands of possibilities? Run search_pubchem_formula.

Got a specific ID you're working from? get_pubchem_compound delivers the complete molecular profile, including the SMILES string. These tools act as direct pipelines to PubChem's data, so your agent skips the manual lookup steps.

How PubChem MCP Works

1. Start by directing your agent to use search_pubchem when you know a compound's common name or synonym.
2. If naming fails, use search_pubchem_formula with the molecular formula (e.g., C9H8O4) to narrow down possibilities.
3. Once an ID is confirmed, call get_pubchem_compound using that CID to pull every available data point.

The bottom line is: you use a search tool first to get candidate IDs, then pass those IDs to the detail retrieval tool for the final payload.
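The search-then-detail flow above can be sketched as one small function. Here `call_tool` stands in for whatever your MCP client uses to invoke a tool, and the argument keys and result shapes (a list of records with a `cid` key) are assumptions made for illustration. The fake client exists only so the sketch runs offline.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict], Any]

def resolve_compound(call_tool: ToolCaller, name: str = "", formula: str = "") -> dict:
    """Search by name first, fall back to formula, then fetch full details."""
    candidates = call_tool("search_pubchem", {"name": name}) if name else []
    if not candidates and formula:
        candidates = call_tool("search_pubchem_formula", {"formula": formula})
    if not candidates:
        raise LookupError("no candidate compounds found")
    # Takes the first candidate; a real workflow would confirm the match first.
    cid = candidates[0]["cid"]
    return call_tool("get_pubchem_compound", {"cid": cid})

# Offline stand-in for an MCP client, with assumed argument and result shapes.
def fake_call_tool(tool: str, arguments: dict) -> Any:
    if tool == "search_pubchem":
        return [{"cid": 2244}] if arguments["name"].lower() == "aspirin" else []
    if tool == "search_pubchem_formula":
        return [{"cid": 2244}] if arguments["formula"] == "C9H8O4" else []
    if tool == "get_pubchem_compound":
        return {"cid": arguments["cid"], "molecular_formula": "C9H8O4"}
    raise ValueError(f"unknown tool: {tool}")

print(resolve_compound(fake_call_tool, name="aspirin"))
```

A real formula search can return many candidates, so the "take the first one" step is the part most worth replacing with an explicit confirmation.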

Who Is PubChem MCP For?

Medicinal chemists and biochemistry researchers need this. They're tired of hopping between Google Scholar, textbook diagrams, and separate database APIs just to confirm a compound's properties. This server lets them ask one question—like 'What are the drug-like characteristics of Aspirin?'—and get structured answers that plug straight into their code or reports.

Medicinal Chemist

Retrieves molecular properties and potential drug scaffolds for lead compound analysis.

Biochemistry Researcher

Identifies compounds based on formulas and maps out metabolite databases quickly.

Pharmacy Student

Looks up drug structures, formulas, and physicochemical data for coursework reports without manual lookups.

What Changes When You Connect

You get the full molecular picture in one go. Instead of searching for a compound name and then having to find its SMILES string somewhere else, get_pubchem_compound delivers everything—structure, weight, formula—in a single call.
Need to verify a structure by formula? Use search_pubchem_formula. You simply input the molecular recipe (like C9H8O4), and it returns all matching candidates. No manual filtering required.
Stop relying on vague text searches. The search_pubchem tool handles common names and synonyms, immediately giving you key metrics like XLogP and hydrogen bond counts for fast screening.
The data is structured for code. When your agent runs a query, the output isn't just text; it’s organized molecular records you can pass directly to other scripts or databases.
It works with real-world drug knowledge. The records come from PubChem and use standard identifiers such as IUPAC names, SMILES, and InChI, letting you focus on chemistry, not data hygiene.

Real-World Use Cases

Validating a Drug Candidate's Structure

A chemist suspects a new lead compound has the formula C8H10N4O2. They run search_pubchem_formula to confirm candidates, identifying Caffeine (CID 2519). Next, they use get_pubchem_compound on CID 2519 to pull its full data payload, including MW and SMILES, for immediate comparison against known benchmarks.

Comparing Known Compounds

You need to compare the properties of Aspirin vs. Caffeine. You run search_pubchem twice—once for 'Aspirin' and once for 'Caffeine'. The agent retrieves both sets of data, allowing you to see key differences in XLogP or H-bond counts side-by-side without leaving your coding environment.
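Once both records are back, the side-by-side view is a few lines of Python. This is a sketch only: the field names (`mw`, `xlogp`, `hbd`, `hba`) are assumed, and the numbers are illustrative sample values that should be checked against what the tools actually return.

```python
# Illustrative records; field names are assumed and values should be checked
# against live tool output.
aspirin = {"name": "Aspirin", "mw": 180.16, "xlogp": 1.2, "hbd": 1, "hba": 4}
caffeine = {"name": "Caffeine", "mw": 194.19, "xlogp": -0.1, "hbd": 0, "hba": 3}

def compare(a: dict, b: dict, fields: tuple[str, ...] = ("mw", "xlogp", "hbd", "hba")) -> list[tuple]:
    """Return (field, a_value, b_value, b_minus_a) rows for numeric fields."""
    return [(f, a[f], b[f], round(b[f] - a[f], 2)) for f in fields]

rows = compare(aspirin, caffeine)
print(f"{'field':<8}{aspirin['name']:>10}{caffeine['name']:>10}{'delta':>8}")
for field, left, right, delta in rows:
    print(f"{field:<8}{left:>10}{right:>10}{delta:>8}")
```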

Handling Ambiguous Inputs

You only know the compound by a common name, like 'Glucose'. You use search_pubchem to query it. This returns initial data and IDs. If you need more than just the basic properties, you take those IDs and feed them into get_pubchem_compound for maximum detail.

Filtering a Large Library by Composition

A researcher is building a library of possible compounds. Instead of manually checking thousands of names, they use search_pubchem_formula with the target formula (e.g., C4H11N5) to pull every matching structure and then fetch their specific properties.

The Tradeoffs

Searching by vague text

Asking the agent, 'Tell me about this drug.' This is too general. It forces the agent to guess which compound you mean and risks pulling irrelevant data.

→ Always start with a specific tool call. If you know the common name, use search_pubchem('drug name'). If you have an ID, go straight to get_pubchem_compound(CID).

Skipping formula validation

Assuming a compound exists just because it's mentioned in text. You might waste time pulling data for the wrong structure.

→ When dealing with structural chemistry, validate first. Use search_pubchem_formula(CXYZ...) to confirm that the formula matches known compounds before requesting details.

Over-relying on a single search tool

Using only search_pubchem and missing crucial data fields like the full SMILES string or molecular weight because the basic search result was too minimal.

→ Use search_pubchem to get candidate IDs, then always follow up with get_pubchem_compound(CID) to guarantee you pull every available property.

When It Fits, When It Doesn't

Use this server if your work requires precise molecular data—specifically MW, SMILES notation, or the ability to query by chemical formula. You need it when drug discovery, synthesis planning, or biochemistry research are involved.

Don't use it if you simply need a general definition of a compound; for that, Wikipedia is fine. Also, don't rely on it if your primary input is an image—the tools require text identifiers (names, formulas, CIDs). If you only have the structure visually, you must first use another service to generate the SMILES or formula before querying these tools.

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by PubChem. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: managed infra
V8 Isolated: sandboxed per request
Zero-Trust Proxy: no stored credentials
DLP Enforced: policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This server provides 3 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.

Available Capabilities

get_pubchem_compound, search_pubchem, search_pubchem_formula

Manually checking molecular properties across multiple databases sucks.

Today, if you're comparing compounds—say, looking up three related metabolites—you manually jump between PubChem’s website, SciFinder, and your internal database. You copy the name into one place, then switch tabs to find the SMILES string in another. It takes time, and chances are, you miss a key data point like XLogP.

With this MCP server, you just prompt your agent: 'Compare the properties of Aspirin, Caffeine, and Glucose.' The server uses `search_pubchem` and `get_pubchem_compound` in sequence. You get structured JSON output that organizes all three compounds' MW, H-bond counts, and SMILES strings instantly.

The PubChem MCP Server: Get molecular data you need, when you need it.

You no longer have to remember which database holds the full set of physicochemical properties. The server handles this complexity for you. You tell your agent what you want—'Find all compounds matching C9H8O4'—and it runs `search_pubchem_formula` and delivers a precise, actionable list.

This means your workflow stays inside your coding environment. No more copy-pasting data between websites. The molecular properties are available in code variables the moment the agent finishes its task.

Common Questions About PubChem MCP

How do I find a compound by name using PubChem MCP Server?

You use search_pubchem. Just pass the common or IUPAC name you're looking for. It returns initial data and key identifiers like MW, which is usually enough to confirm what you're working with.

What if I only have a molecular formula? Can PubChem MCP Server help?

Yes, use search_pubchem_formula. You pass the exact formula (e.g., C8H10N4O2), and it finds all known compounds that match that composition.

What is the best tool to get deep data for a specific compound?

get_pubchem_compound is your go-to. You must feed it a PubChem Compound ID (CID) first. This tool pulls the deepest set of properties, including SMILES and InChI.

Can I use `search_pubchem` for more than one compound?

You can name several compounds in one prompt, but search_pubchem is typically executed once per name. The agent runs the searches sequentially and gathers the individual results before processing them together.

Do I need an API key to use the `search_pubchem` tool?

No, you don't need an API key for any of the tools. Just connect your AI client and start running searches immediately. The server handles all authentication on Vinkius.

What happens if I use `get_pubchem_compound` with a non-existent CID?

The tool will return an error message indicating that the specified PubChem Compound ID (CID) could not be found. You'll need to verify the ID or try searching by name instead.
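The MCP specification lets a tool report a failure inside its result by setting `isError` to true alongside a text message. Whether this server reports a missing CID that way, or as a JSON-RPC error instead, is not stated here, and the message text below is hypothetical. With that caveat, a minimal sketch of unwrapping a result and falling back to a name search:

```python
def unwrap_tool_result(result: dict) -> str:
    """Return the text of an MCP tool result, raising if it is flagged as an error."""
    text = "\n".join(
        item["text"] for item in result.get("content", []) if item.get("type") == "text"
    )
    if result.get("isError"):
        raise LookupError(text or "tool call failed")
    return text

# Hypothetical error result for a CID that does not exist.
missing = {"content": [{"type": "text", "text": "CID not found"}], "isError": True}
try:
    unwrap_tool_result(missing)
except LookupError as exc:
    print(f"lookup failed: {exc}; fall back to search_pubchem by name")
```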

Does `search_pubchem` handle ambiguous common names?

Yes, it searches across 116M+ compounds and handles common names like 'aspirin' or 'caffeine'. If a name matches more than one record, check the returned identifiers and confirm the CID you want before pulling full details.

Can I use `search_pubchem_formula` to find organic molecules outside of drug discovery?

The tool finds compounds by any valid molecular formula, making it useful for general biochemistry research. It covers both pharmaceutical leads and foundational biological metabolites.

Do I need an API key to use PubChem?

No. PubChem PUG REST is completely free and open without any authentication. The only limitation is a rate limit of 5 requests per second and 400 requests per minute, which is more than sufficient for conversational AI usage.
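If you script many lookups yourself, a client-side throttle keeps you under both limits. The sketch below is one way to do it with two sliding windows. The class and function names are hypothetical, and whether the hosted server already throttles on your behalf is not stated here.

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` calls in any `window`-second span."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit, self.window, self.clock = limit, window, clock
        self.stamps = deque()

    def wait_time(self) -> float:
        """Seconds to wait before another call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            return 0.0
        return self.window - (now - self.stamps[0])

    def record(self) -> None:
        self.stamps.append(self.clock())

def acquire(limiters, sleep=time.sleep) -> None:
    """Block until every limiter allows a call, then record it on all of them."""
    while True:
        delay = max(limiter.wait_time() for limiter in limiters)
        if delay <= 0:
            break
        sleep(delay)
    for limiter in limiters:
        limiter.record()

# Limits quoted above: 5 requests per second and 400 requests per minute.
pubchem_limits = [SlidingWindowLimiter(5, 1.0), SlidingWindowLimiter(400, 60.0)]
```

Call `acquire(pubchem_limits)` before each request; the clock and sleep functions are injectable so the limiter can be tested without real delays.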

What molecular properties are returned for each compound?

Each compound includes: CID, IUPAC name, molecular formula, molecular weight, canonical SMILES, InChI identifier, XLogP (lipophilicity), hydrogen bond donor count, hydrogen bond acceptor count, and molecular complexity score. Molecular weight, XLogP, and the two hydrogen bond counts are the four inputs to Lipinski's Rule of Five for drug-likeness assessment.
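As a rough illustration, the Rule of Five check can be written directly against those four fields. This sketch uses the commonly stated thresholds and treats XLogP as a stand-in for logP, which is an approximation; the function name and parameter names are hypothetical, and the aspirin values are illustrative.

```python
def lipinski_violations(mw: float, xlogp: float, hbd: int, hba: int) -> list[str]:
    """List which of Lipinski's four criteria a compound violates."""
    checks = [
        ("molecular weight > 500", mw > 500),
        ("logP > 5", xlogp > 5),
        ("H-bond donors > 5", hbd > 5),
        ("H-bond acceptors > 10", hba > 10),
    ]
    return [label for label, violated in checks if violated]

# Illustrative aspirin values; confirm against live tool output.
print(lipinski_violations(mw=180.16, xlogp=1.2, hbd=1, hba=4))
```

The rule is usually read as a screen rather than a verdict: a compound with more than one violation is flagged as less likely to be orally bioavailable.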

Can I search by molecular formula instead of name?

Yes. Use search_pubchem_formula with standard notation (e.g., C8H10N4O2 for caffeine, C9H8O4 for aspirin, H2O for water). PubChem will return all compounds matching that exact formula with their names and properties.

Use it with your favorite AI tools

Connect this server to Cursor, Claude, VS Code, and more.

OpenAI Agents SDK (Python)
Google ADK (Python)
Pydantic AI (Python)
Vercel AI SDK (TypeScript)