# Pinecone MCP

> Pinecone MCP Server gives your AI agent full control over your vector databases. Use this server to query embeddings, check index health, list collections, or delete vectors—all via natural language chat. It lets you manage complex knowledge graphs and run semantic searches without writing boilerplate code or leaving your IDE.

## Overview
- **Category:** loved-by-devs
- **Price:** Free
- **Tags:** semantic-search, vector-embeddings, knowledge-graph, high-performance, ai-infrastructure

## Description

Listen up. This Pinecone MCP Server gives your agent full control over your vector databases. You're talking about querying embeddings, checking index health, listing collections, or deleting vectors, all done via natural language chat. It lets you manage complex knowledge graphs and run deep semantic searches without writing boilerplate code or jumping out of your IDE.

When you connect this server, you gain a suite of tools that let your AI client interact directly with Pinecone's operational layer. You don't just ask questions; you make database changes. Here’s what it lets you do:

**Discovery & Mapping:**
You can start by listing every single active vector index in your account using the `list_indexes` tool. This shows you a quick rundown of all the knowledge bases you've set up. If you need to see historical groupings, you run `list_collections`, which returns names of snapshot collections holding grouped versions of your data over time. For any specific index you find, you can check its exact configuration—its schema, dimensions, and metadata requirements—by calling `describe_index`. This means you know precisely what structure the data in that index expects before sending anything to it.
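If you're curious what that discovery pass looks like on the wire, here's a minimal sketch. MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request, so the snippet just builds those requests in order. The `index_name` argument name and the `tool_call` helper are assumptions made for illustration; check the server's tool schema for the real parameter names.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC requests each need a unique id


def tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Build an MCP `tools/call` request (JSON-RPC 2.0) for one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Discovery pass: enumerate indexes, enumerate collections, inspect one index.
# "index_name" is an assumed argument name; "document-embeddings" is a sample index.
discovery = [
    tool_call("list_indexes"),
    tool_call("list_collections"),
    tool_call("describe_index", {"index_name": "document-embeddings"}),
]

for request in discovery:
    print(json.dumps(request))
```

In practice your AI client builds and sends these requests for you; the sketch only shows the shape of what goes out.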

**Retrieval Operations:**
Want to search for something? You use `query_vectors`. By passing an array of query embeddings, this tool finds the most semantically similar vectors and pulls back their associated metadata. It's not just a keyword match; it’s finding concepts that mean the same thing. If you already know the unique IDs of the records you need—maybe they came from another process—you skip the similarity search and use `fetch_vectors` to grab those specific vectors directly from an index.
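To make the difference between the two concrete, here's a tiny in-memory sketch. It is not how Pinecone works internally; it only illustrates similarity ranking versus a direct ID lookup, assuming a cosine metric. The records, IDs, and helper names are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, records, k=2):
    """Rank records by cosine similarity to the query, highest first."""
    scored = [(cosine_similarity(query, r["values"]), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {"id": r["id"], "score": score, "metadata": r["metadata"]}
        for score, r in scored[:k]
    ]


# Hypothetical 3-dimensional records; real embeddings are far larger.
records = [
    {"id": "doc-1", "values": [1.0, 0.0, 0.0], "metadata": {"topic": "billing"}},
    {"id": "doc-2", "values": [0.9, 0.1, 0.0], "metadata": {"topic": "invoices"}},
    {"id": "doc-3", "values": [0.0, 0.0, 1.0], "metadata": {"topic": "hiking"}},
]

# Similarity search (what `query_vectors` does conceptually).
matches = top_k([1.0, 0.0, 0.0], records, k=2)
print([m["id"] for m in matches])

# Direct lookup (what `fetch_vectors` does conceptually): no ranking involved.
by_id = {r["id"]: r for r in records}
print(by_id["doc-3"]["metadata"])
```

The query lands on `doc-1` and `doc-2` because their vectors point in nearly the same direction, while the lookup for `doc-3` never computes a score at all.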

**Monitoring & Auditing:**
You gotta keep track of your resources, right? To check how much capacity you're using or if you're running low on space, call `get_index_stats`. This pulls real-time usage metrics for any given index, giving you the vector count and pod capacity limits. This is crucial for knowing when you need to scale up your environment before things break.
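Here's a rough sketch of turning those stats into a scale-up warning. The field names (`total_vector_count`, `index_fullness`) are modeled on Pinecone's index stats response, but whether this server returns exactly that shape is an assumption, and the 80% threshold is arbitrary.

```python
def capacity_report(stats: dict, warn_at: float = 0.8) -> str:
    """Summarize index usage and flag indexes that are close to full."""
    fullness = stats["index_fullness"]      # fraction between 0.0 and 1.0
    vectors = stats["total_vector_count"]
    status = "scale up soon" if fullness >= warn_at else "ok"
    return f"{vectors:,} vectors, {fullness:.0%} full: {status}"


# Sample payload with assumed field names and illustrative values.
sample_stats = {"dimension": 1536, "index_fullness": 0.9, "total_vector_count": 45920}
print(capacity_report(sample_stats))  # prints "45,920 vectors, 90% full: scale up soon"
```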

**Maintenance & Cleanup:**
Sometimes you gotta clean house. If you find old or irrelevant data records cluttering an index, `delete_vectors` lets you perform surgical cleanups. You must confirm the vector ID and the collection name first; it’s a controlled deletion process so you don't mess up anything important.
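One way to picture that confirmation step is a guard that refuses to build a delete request unless the target is fully spelled out. This is a sketch of the idea, not the server's actual code; the helper and the argument names (`index_name`, `collection`, `ids`) are hypothetical.

```python
def build_delete_arguments(index_name, collection, ids, confirmed=False):
    """Return `delete_vectors` arguments only when the target is explicit and confirmed."""
    ids = [vector_id for vector_id in ids if vector_id]  # drop empty IDs
    if not index_name or not collection:
        raise ValueError("index and collection must both be named")
    if not ids:
        raise ValueError("at least one vector ID is required")
    if not confirmed:
        raise PermissionError("deletion was not confirmed")
    # Argument names are assumptions; check the server's tool schema.
    return {"index_name": index_name, "collection": collection, "ids": ids}


args = build_delete_arguments(
    "document-embeddings",
    "knowledge-base-staging",
    ["vec-001", "vec-002"],
    confirmed=True,
)
print(args)
```

Failing closed on a missing ID or a missing confirmation is the design choice here: deletes can't be undone, so an ambiguous request should raise instead of guessing.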

Essentially, your agent doesn't just read from Pinecone; it operates within it. It maps out the structure with `list_indexes` and `list_collections`. It validates the schema using `describe_index`. It executes complex searches with `query_vectors` or retrieves known data points via `fetch_vectors`. It tracks usage limits by pulling stats with `get_index_stats`. And it keeps things tidy by running cleanup jobs with `delete_vectors`. You get full, structured access to your entire vector database environment.

## Tools

### delete_vectors
Deletes specified vectors from an index after confirming the ID and collection name.

### describe_index
Retrieves configuration details, like dimensions, for a named vector index.

### fetch_vectors
Gets specific vectors from an index when you know their unique IDs.

### get_index_stats
Returns usage statistics, including vector count and pod capacity, for a specified index.

### list_collections
Lists all snapshot collections stored within your Pinecone environment.

### list_indexes
Retrieves the names of every active vector index in your account.

### query_vectors
Searches for and returns the most similar vectors and their metadata based on a query embedding.

## Prompt Examples

**Prompt:** 
```
Check the vector count stats for the index named `document-embeddings`.
```

**Response:** 
```
Index `document-embeddings` currently holds 45,920 vector records. Its dimension is fixed at 1536 (typical of OpenAI embedding models), and its pods are 90% full.
```

**Prompt:** 
```
Delete all vectors in the 'auth-abc123' namespace.
```

**Response:** 
```
Executed `delete_vectors` successfully. All vectors in the 'auth-abc123' namespace have been removed from the index.
```

**Prompt:** 
```
List all existing collections created in my Pinecone environment.
```

**Response:** 
```
You have 2 collections stored in your Pinecone environment: `backup-q1-2026` and `knowledge-base-staging`.
```

## Capabilities

### List Indexes
Shows all vector indexes currently set up in your Pinecone environment.

### Describe Index Schema
Retrieves the full configuration details—the schema, dimensions, and metadata requirements—for a specific index.

### Query Vectors by Similarity
Finds the most semantically similar vectors and their associated data by passing an array of query embeddings.

### Fetch Specific Vectors
Retrieves known, specific vectors from an index when you already have their unique IDs.

### Get Index Statistics
Pulls real-time usage metrics, including vector count and pod capacity limits, for any given index.

### List Collections
Lists all snapshot collections that hold grouped versions of your data over time.

### Delete Vectors
Removes specific vectors from an index, allowing you to clean up old or irrelevant data records.

## Use Cases

### Debugging Context Relevance (ML Engineer)
A developer suspects chunks from Index A aren't relevant enough. Instead of writing a test script, they prompt their agent: 'Run `query_vectors` on Index A with this embedding and tell me the top 5 results.' The agent executes the query and returns the data structure, letting the engineer pinpoint the failure point instantly.

### Auditing Storage Capacity (Data Custodian)
A data custodian needs to know whether a specific index is hitting its capacity limit before migrating more data. They ask the agent: 'Check `get_index_stats` for the production index.' The agent runs the tool and reports the vector count and pod utilization percentage, so the team can scale up before an outage.

### Building Multi-Tenant Agents (Agent Developer)
An agent needs to process data across several client namespaces. First, it uses `list_indexes` to find all available indexes, then runs `describe_index` on each one to validate the expected dimensions and schema before attempting any reads.

### Cleaning Up Old Records (Data Architect)
After a project phase ends, an index accumulates stale vectors. The architect uses `list_indexes` to confirm the correct target, then calls `delete_vectors` targeting specific IDs or namespaces, ensuring clean data retention and reducing costs.

## Benefits

- **Instant Schema Validation:** You don't have to guess what an index expects. Running `describe_index` gives you the exact configuration details before your agent tries to query it, saving hours of debugging time.
- **Conversational Debugging:** Forget writing boilerplate Python test scripts just to inspect an index. Asking your agent a simple question like 'What are the stats for X?' runs `get_index_stats` and gives you the answer immediately in chat.
- **Data Governance at Scale:** Need to clean up old vectors? Use `delete_vectors`. You can manage data lifecycle—deleting records belonging to specific IDs or namespaces—without leaving your conversational flow.
- **Structured Discovery:** Mapping your entire knowledge graph is easy. Run `list_indexes` first, then use `list_collections` to see all historical snapshots, keeping your data lineage clear.
- **High-Speed Retrieval:** When you need the best context, `query_vectors` handles the heavy lifting of finding semantically similar vectors and returning their metadata, so you can see what a RAG pipeline would retrieve without writing query code.

## How It Works

The bottom line: you skip writing client code and talk to your vector database through chat.

1. First, supply your Pinecone API key to the MCP Server.
2. Next, prompt your AI client with a request—like 'What's the vector count for X index?' or 'Find things similar to Y.'
3. The agent runs the necessary tool (e.g., `get_index_stats` or `query_vectors`) and returns the structured data directly into the chat thread.
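For step 3, here's a minimal sketch of how a client might unpack what comes back. An MCP `tools/call` result carries a `content` list and an `isError` flag; the assumption here is that this server puts a JSON string in a text part, which may not match its real output. The `extract_text` helper and the sample payload are illustrative only.

```python
import json


def extract_text(response: dict) -> str:
    """Join the text parts of an MCP `tools/call` response, raising on errors."""
    if "error" in response:  # protocol-level JSON-RPC error
        raise RuntimeError(response["error"].get("message", "JSON-RPC error"))
    result = response["result"]
    text = "\n".join(
        part["text"]
        for part in result.get("content", [])
        if part.get("type") == "text"
    )
    if result.get("isError"):  # the tool itself reported a failure
        raise RuntimeError(text or "tool reported an error")
    return text


# Illustrative response; the JSON-in-text payload is an assumption.
sample_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": json.dumps({"total_vector_count": 45920})}],
        "isError": False,
    },
}

stats = json.loads(extract_text(sample_response))
print(stats["total_vector_count"])  # prints 45920
```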

## Frequently Asked Questions

**How do I check which indexes are available with list_indexes?**
You simply ask your agent to run `list_indexes`. It returns a plain list of all the vector index names you've created in your Pinecone environment. This is always step one for discovery.

**What does describe_index do if I don't know my schema?**
Running `describe_index` pulls the full configuration details, including the required mathematical dimension and metadata structure. It tells you exactly what your index expects so subsequent queries won't fail.

**Can query_vectors find data if I don't know the ID?**
Yes. `query_vectors` is designed for similarity search. You provide an embedding, and it finds the top N most similar vectors based on mathematical distance, regardless of whether you knew their IDs.

**How do I clean up old data using delete_vectors?**
You must specify three things: the index name, a collection, and the specific vector ID(s). The agent uses `delete_vectors` to target only what you explicitly tell it to remove.

**How do I check the usage capacity or health metrics using `get_index_stats`?**
It returns real-time statistics on vector counts and pod utilization. This shows you if your index is nearing its capacity limit, so you can proactively manage storage before a service failure.

**What does `list_collections` show me about my stored backups?**
It lists the names of your saved snapshots (collections). You use the list to audit which backups exist or to pick the snapshot you want to restore from, which is crucial for safe testing or compliance.

**If I know the exact vector ID, how do I use `fetch_vectors`?**
It pulls the precise metadata and embedding data associated with that single ID. This method bypasses similarity searches, offering faster retrieval when you need a specific record.

**How do I verify the required vector dimension size using `describe_index`?**
This tool provides the index's configured mathematical dimension. Checking this confirms compatibility with your embedding model and prevents data ingestion errors during setup.
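As a quick illustration of that compatibility check, the sketch below compares an embedding's length against the dimension from an index description. The `dimension` key and the `check_dimension` helper are assumptions; adapt them to whatever `describe_index` actually returns.

```python
def check_dimension(embedding, index_description):
    """Raise if the embedding length does not match the index's dimension."""
    expected = index_description["dimension"]  # assumed key name
    actual = len(embedding)
    if actual != expected:
        raise ValueError(
            f"embedding has {actual} dimensions, index expects {expected}"
        )
    return True


description = {"name": "document-embeddings", "dimension": 1536}

# A correctly sized vector passes.
print(check_dimension([0.0] * 1536, description))  # prints True

# A vector from a different model is rejected before it reaches the index.
try:
    check_dimension([0.0] * 768, description)
except ValueError as exc:
    print(exc)  # prints "embedding has 768 dimensions, index expects 1536"
```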

**Can the AI execute raw vector similarity searches?**
Yes, absolutely. Once you supply the raw embedding (normally a float array generated beforehand by an embedding model), the LLM can pass it to the `query_vectors` tool. Pinecone processes the query and returns the top-K closest vector matches along with their metadata.

**How do I check my remaining vector storage capacity?**
It's extremely simple. Just ask the connected AI agent to 'Get the index stats'. It calls `get_index_stats` against the specified index and returns the total vector count and pod capacity to your chat window.

**Is it safe to delete vectors dynamically using the chat terminal?**
Yes, but with standard precautions. The `delete_vectors` tool operates exactly as the official SDK does. As long as you name the index, the scope, and the vector IDs explicitly in your prompt, the tool removes only what you specify.