# Elasticsearch Vector MCP

> Elasticsearch Vector gives your agent full control over semantic discovery and vector search within Elasticsearch. You can perform raw K-Nearest Neighbors (kNN) computations against multi-dimensional embedding arrays, manage complex index mappings, and ingest large volumes of embedding documents directly from any AI client.

## Overview
- **Category:** brain-trust
- **Price:** Free
- **Tags:** vector-search, knn-search, embeddings, semantic-search, indexing, ai-infrastructure

## Description

This MCP connects your workflow to an Elasticsearch cluster, giving you direct control over vector search and semantic data. Instead of relying on basic keyword matching, you can rank documents by semantic similarity across large datasets using dense vector embeddings. You can also manage the underlying structure: creating new indices, checking mappings, and cleaning up old records by UUID. When your agent needs contextually related information for embedded text or images, the server runs the similarity calculations for it. This makes turning data into actionable knowledge much more direct. You can connect through Vinkius, giving any MCP-compatible client immediate access to these vector capabilities.

## Tools

### create_index
Builds a new index specifically for storing dense vector data.

### delete_document
Removes a specific document from an index using its unique ID.

### get_index
Retrieves detailed information and mappings for a single, specified index.

### index_document
Adds a new document or updates an existing one, storing its embedding vector in the index.

### list_indexes
Lists every available index within the cluster, helping you see what data stores are active.

### search
Performs a dense vector kNN search to find documents most similar to your input vector.

## Prompt Examples

**Prompt:** 
```
Perform a kNN search in index 'product-embeddings' with vector [0.1, 0.2, ...]
```

**Response:** 
```
Searching 'product-embeddings'... I found the top 5 most similar documents. Result #1: 'Leather Backpack' (Similarity: 0.98). Result #2: 'Canvas Tote' (Similarity: 0.92). Would you like the full metadata for these results?
```

**Prompt:** 
```
Create a new vector index 'image-features' with 512 dimensions
```

**Response:** 
```
Index created! 'image-features' is now initialized with 512 dimensions and is ready for document ingestion. I can now help you index your first embedding document.
```

**Prompt:** 
```
List all vector indexes in my cluster
```

**Response:** 
```
Retrieving indexes... I found 3 vector-enabled indexes: 'product-embeddings' (1536 dims), 'image-features' (512 dims), and 'text-semantic-v1' (768 dims). Which one would you like to inspect?
```

## Capabilities

### Run kNN Similarity Searches
Find documents that are semantically closest to a given vector by performing raw K-Nearest Neighbors calculations.
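
To make the idea of "closest to a given vector" concrete, here is a minimal sketch of an exhaustive kNN scan using cosine similarity. It is only an illustration of the technique: the toy documents and the function names are invented for this example, and Elasticsearch itself typically answers kNN queries with an approximate index rather than a full scan like this one.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def knn(query, docs, k):
    """Score every document against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 2-dimensional "embeddings" (hypothetical data, for illustration only).
docs = {
    "backpack": [1.0, 0.0],
    "tote": [0.8, 0.6],
    "lamp": [0.0, 1.0],
}

top = knn([1.0, 0.1], docs, k=2)
for doc_id, score in top:
    print(f"{doc_id}: {score:.3f}")
```

The scan is linear in the number of documents, which is why production systems trade a little accuracy for an index structure that avoids comparing the query against every vector.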

### Manage Vector Index Structures
Create, list, and check the metadata of specific indices designed to store high-dimensional embedding vectors.
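
For orientation, the sketch below builds the kind of request body Elasticsearch expects when an index with a `dense_vector` field is created. The field name `embedding`, the extra `title` field, the helper name, and the choice of cosine similarity are all assumptions made for illustration; the parameters that the `create_index` tool itself accepts are not documented here.

```python
import json

def build_vector_index_body(dims, field="embedding", similarity="cosine"):
    """Return an Elasticsearch create-index body with one dense_vector field.

    `field` and `similarity` defaults are illustrative assumptions.
    """
    if not isinstance(dims, int) or dims <= 0:
        raise ValueError("dims must be a positive integer")
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": similarity,
                },
                # Hypothetical metadata field stored alongside the vector.
                "title": {"type": "text"},
            }
        }
    }

body = build_vector_index_body(512)
print(json.dumps(body, indent=2))
```

The `dims` value fixes the vector length for the lifetime of the field, so it should match the output size of whichever embedding model produces the vectors.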

### Ingest Embedding Documents
Add new data by writing documents, together with their dense vector payloads, to the index.

### Clean Up Records
Permanently delete documents from indexes using specific UUID identifiers.

### Verify Index Schema
Retrieve detailed mapping rules and dimensional constraints for an index to ensure data readiness.

## Use Cases

### Retrieving context for an LLM agent
An ML engineer needs to ground a new chatbot on proprietary documents. The agent runs `list_indexes` to find the correct corpus, then uses `search` with a query vector to pull only the most relevant text snippets, avoiding irrelevant noise.

### Updating product metadata
A software developer uploads new product images and embeddings. They run `create_index` for the image features, then call `index_document` for each product so that every record carries its new vector data.

### Auditing sensitive data
An Ops Team member needs to remove a user's profile entirely. They use `get_index` first to confirm the schema, then call `delete_document` using the target UUID to ensure complete and irreversible removal.

### Testing new embedding models
A Data Scientist wants to test a brand-new vector model. They use `create_index` for the test data, then run multiple `search` calls with different vectors to compare performance against the existing production index.

## Benefits

- Find out exactly what's wrong with your search setup. Using the `get_index` tool lets you analyze dimensional constraints and index mappings before running a query.
- Stop wasting time on manual data cleanup. The `delete_document` tool lets you permanently remove records from an index by supplying their exact UUIDs.
- Quickly see all available storage namespaces in your cluster. Running `list_indexes` gives you an instant inventory of every vector-enabled index.
- Ingest new knowledge without writing bulk scripts. The `index_document` tool lets your agent attach a dense vector payload to persist data instantly.
- Perform deep similarity checks with the `search` tool, running raw kNN computations that go far beyond simple keyword matches.

## How It Works

The bottom line is you get programmatic access to industrial-strength vector search without writing a single query language command.

1. Subscribe to this MCP, then input your Elasticsearch Host URL and API Key. These credentials control the connection.
2. Your agent sends a command, for example 'Search for similar items in index X.' The tool handles the vector math.
3. The system returns a list of results, including similarity scores and metadata, that your agent uses to continue the conversation.
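
Under the hood, an MCP client turns a request like the one in step 2 into a `tools/call` message. The sketch below shows roughly what that message could look like. The JSON-RPC envelope follows the MCP convention, but the argument names (`index`, `vector`, `k`) are hypothetical, since the tool's input schema is not listed here.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC ids

def build_tool_call(tool_name, arguments):
    """Wrap a tool invocation in an MCP-style JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Argument names below are assumptions for illustration only.
request = build_tool_call(
    "search",
    {"index": "product-embeddings", "vector": [0.1, 0.2, 0.3], "k": 5},
)
print(json.dumps(request, indent=2))
```

In practice the AI client builds and sends this message for you; the sketch is only meant to show where the tool name and its arguments end up.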

## Frequently Asked Questions

**How does the `search` tool perform vector lookups?**
The `search` tool executes raw kNN computations against your specified index. It takes a dense vector as input and returns documents with high similarity scores, helping you find semantically related data.
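
As a rough picture of what such a lookup looks like at the Elasticsearch level, the sketch below assembles a top-level `knn` search body. Whether the `search` tool sends exactly this body is an assumption; the field name `embedding`, the helper name, and the default `num_candidates` heuristic are illustrative choices.

```python
def build_knn_query(field, query_vector, k=5, num_candidates=None):
    """Build an Elasticsearch search body containing a kNN clause.

    num_candidates is how many nearest candidates each shard considers
    before the best k are returned; it must be at least k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if num_candidates is None:
        num_candidates = max(k * 10, 100)  # illustrative default, not a rule
    if num_candidates < k:
        raise ValueError("num_candidates must be >= k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        }
    }

query = build_knn_query("embedding", [0.1, 0.2, 0.3], k=5)
print(query)
```

A larger `num_candidates` generally improves recall at the cost of slower queries, so it is worth tuning against your own data.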

**What is the difference between `index_document` and `create_index`?**
`create_index` builds the empty container, that is, the index itself. You must run this first. Then you use `index_document` to fill that container by adding actual embedding payloads.

**I need to remove data; should I use `delete_document`?**
Yes, if you know the exact UUID of the document you want gone, `delete_document` performs an immediate and hard removal from the index. This is irreversible.

**How do I check which indexes are available for search?**
You call `list_indexes`. This provides a complete list of all vector-enabled storage namespaces currently managed by your cluster, letting you know what data sources exist.

**When calling `get_index`, how can I verify the dimensional constraints and schema rules for my vector data?**
The tool reports the index's mapping structure. It lists specific required dimensions, ensuring your embeddings adhere to the exact numeric format before you try to write them.
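
A client-side check along these lines can catch a dimension mismatch before a write is attempted. The sketch assumes the mapping is available in the shape Elasticsearch's get-mapping API returns (index name, then `mappings`, then `properties`); the exact shape of the `get_index` tool's output is not documented here, and the helper names are hypothetical.

```python
def dense_vector_dims(mapping_response, index, field):
    """Read the declared dimension count of a dense_vector field."""
    properties = mapping_response[index]["mappings"]["properties"]
    spec = properties[field]
    if spec.get("type") != "dense_vector":
        raise TypeError(f"{field!r} is not a dense_vector field")
    return spec["dims"]

def validate_vector(vector, expected_dims):
    """Raise ValueError if the vector does not fit the mapping."""
    if len(vector) != expected_dims:
        raise ValueError(
            f"expected {expected_dims} dimensions, got {len(vector)}"
        )
    for value in vector:
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("vector must contain only numbers")

# Example mapping (assumed shape, illustrative values).
mapping = {
    "image-features": {
        "mappings": {
            "properties": {
                "embedding": {"type": "dense_vector", "dims": 512}
            }
        }
    }
}

dims = dense_vector_dims(mapping, "image-features", "embedding")
validate_vector([0.0] * dims, dims)  # passes silently
```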

**If I need to process hundreds of documents, is there an efficient way to use `index_document`?**
Yes. `index_document` handles single writes, so for large batches it is faster to group the writes into bulk operations, which send many embedding payloads in a single request.
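
For reference, this is a sketch of the newline-delimited body that Elasticsearch's bulk API accepts: one action line followed by one document line per record. Whether this MCP exposes a bulk path is not stated here, so treat the helper and the sample documents as hypothetical.

```python
import json

def build_bulk_body(index, docs):
    """Serialize (doc_id, source) pairs into a bulk NDJSON payload."""
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch where to index the next document.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Source line: the document itself, including its vector.
        lines.append(json.dumps(source))
    if not lines:
        return ""
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"

# Hypothetical documents with tiny vectors, for illustration only.
docs = [
    ("doc-1", {"title": "Leather Backpack", "embedding": [0.1, 0.2, 0.3]}),
    ("doc-2", {"title": "Canvas Tote", "embedding": [0.2, 0.1, 0.4]}),
]

payload = build_bulk_body("product-embeddings", docs)
print(payload)
```

Batching this way replaces one network round trip per document with one per batch, which is usually where the speedup comes from.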

**What authentication credentials do I need when using tools like `list_indexes`?**
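You must provide your Elasticsearch Host URL and a valid API Key. API keys are generated within Kibana under the Stack Management security settings. The key's privileges determine what your agent can do: listing and searching need read access, while creating indices and writing or deleting documents need the corresponding write privileges.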
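
If you ever need to test the same credentials outside the MCP, Elasticsearch expects API keys in an `Authorization: ApiKey ...` header, where the value is the Base64 encoding of the key id and secret joined by a colon. The sketch below shows that encoding; the id and secret are placeholder values, and how this MCP stores or forwards the key internally is not documented here.

```python
import base64

def api_key_header(key_id, api_key):
    """Build the Authorization header Elasticsearch expects for API keys."""
    raw = f"{key_id}:{api_key}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"ApiKey {token}"}

# Placeholder credentials, not real values.
headers = api_key_header("example-id", "example-secret")
print(headers)
```

If the key you copied is already shown in encoded form, it can likely be placed after `ApiKey ` as-is, without encoding it a second time.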

**If I attempt to use `delete_document` with an incorrect UUID, what does that mean for my workflow?**
The operation will fail gracefully and return a specific error status. This allows your AI client to catch the failure immediately and continue processing without interruption or system crash.
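
One way a client might classify the outcome is sketched below. It assumes the error surfaces in the shape Elasticsearch's own delete API uses (HTTP 200 with `"result": "deleted"`, or HTTP 404 with `"result": "not_found"`); the exact error format returned by the `delete_document` tool is an assumption, and the function name is hypothetical.

```python
def interpret_delete(status_code, body):
    """Map a delete response to a simple outcome string."""
    result = body.get("result")
    if status_code == 200 and result == "deleted":
        return "deleted"
    if status_code == 404 and result == "not_found":
        # Wrong or already-removed id: nothing was deleted.
        return "not_found"
    raise RuntimeError(f"unexpected delete response: {status_code} {body}")

# Simulated response for an id that does not exist (illustrative values).
outcome = interpret_delete(
    404,
    {"_index": "product-embeddings", "_id": "hypothetical-id", "result": "not_found"},
)
print(outcome)
```

Treating "not found" as a distinct, non-fatal outcome lets the agent report the problem and move on to the next record.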