# Vertex AI Vector Search MCP

> Vertex AI Vector Search brings Google's large-scale vector matching directly into your agent. You can search billions of semantic embeddings and inspect index endpoints without leaving your chat window or IDE. This MCP lets you find related data by meaning, not just keywords, giving your LLM context ranked by the distance between embeddings.

## Overview
- **Category:** ai-frontier
- **Price:** Free
- **Tags:** vector-search, embeddings, nearest-neighbor, semantic-matching, cloud-infrastructure, low-latency

## Description

Need to pull information that goes beyond simple keyword matching? Vertex AI Vector Search connects Google Cloud's large-scale vector database directly to your agent. Instead of digging through console dashboards or writing complex API calls, you prompt your client and it handles the search. It takes a query, whether it’s a float array or text, and finds the most semantically similar data points across billions of records with low latency. You can also check on your infrastructure on the fly: ask to list all active vector indexes or check if an index is properly exposed for production traffic. This ability to inspect and search massive datasets conversationally makes it indispensable, especially when connected through the Vinkius catalog.

## Tools

### get_index_details
Retrieves the specific metadata and configuration settings for a single vector index.

### list_deployed_indexes
Lists all vector indexes that have been successfully deployed to an active search endpoint.

### list_index_endpoints
Retrieves a list of every index endpoint configured within the current project.

### list_vector_indexes
Lists all vector indexes that exist in the entire Google Cloud project scope.

### list_vector_operations
Shows a timeline of long-running tasks, like index builds or updates, currently running in the cloud queue.

### search_nearest_neighbors
Runs a vector similarity search given an endpoint ID, an index ID, and a query vector array.

## Prompt Examples

**Prompt:** 
```
List all our active vector indexes on the current GCP project.
```

**Response:** 
```
I've scanned your infrastructure on Vertex AI. You currently have 2 core vector indexes active:
- `product_catalog_dense` (ID: 4858..., Dimensions: 768)
- `document_vault_prod` (ID: 3959..., Dimensions: 1536)

Would you like me to inspect the endpoints to verify which of these are currently exposed for search?
```

**Prompt:** 
```
Check for any long-running vector deployment operations that have not completed yet.
```

**Response:** 
```
I've queried the Vertex AI Operations log.

There is currently `1` pending operation:
- **Operation ID**: `74803923`
- **Target**: Deploying index `document_vault_prod` to endpoint `112xyz`.
- **Status**: `RUNNING` (Approx 45% completion, started 22 minutes ago).

No errors or stalling reported on your cluster. I can poll this periodically if you wish.
```

**Prompt:** 
```
Find the 3 nearest neighbors on endpoint '39xl', index ID 'dep_30', using vector [-0.2, 0.5, 0.0].
```

**Response:** 
```
I've passed the floating-point vector directly to the Vertex instance. The semantic similarities fetched from `dep_30` are:
1. Base ID `prod-281x` (Distance: 0.12)
2. Base ID `prod-994y` (Distance: 0.17)
3. Base ID `prod-110m` (Distance: 0.31)

This indicates that `prod-281x` sits closest to your query in the embedding space.
```

## Capabilities

### Find nearest neighbors
Execute a vector similarity search using a query array against specific index endpoints to locate highly related data IDs.

### List all indexes
View every vector index defined within your entire Google Cloud project for an overview of available datasets.

### Check index configuration
Retrieve detailed metadata and current setup information for any single, specific vector index.

### Track deployment jobs
Monitor the status of multi-terabyte index builds or updates by listing long-running operational tasks in your cloud queue.

### Verify active endpoints
List all network endpoints that expose indexed data, confirming which indexes are currently ready to receive production search traffic.

## Use Cases

### Finding Context in a Massive Document Vault
A data scientist needs to check if their new document chunk is semantically similar to existing records. They simply prompt their agent: 'Find the top 5 matches for this vector.' The agent uses search_nearest_neighbors, returning relevant IDs and proximity scores instantly.

### Checking Infrastructure Health Before Launch
An MLOps engineer is deploying a new product catalog index. Instead of navigating the console, they ask their agent to run list_vector_operations. The agent replies with the status and ETA, confirming the deployment job is still active.

### Verifying Database Readiness
A backend architect needs to know if a specific index is ready for production traffic across multiple regions. They use list_index_endpoints, which confirms exactly which deployed index iteration is currently exposed and accepting requests.

### Inventorying All Available Data Sources
A developer starts a new project and needs to know what vector databases exist in the cloud. They run list_vector_indexes, immediately getting an overview of all available core indexes they can point their agent toward.

## Benefits

- Stop manually checking console logs. You can use list_vector_operations to monitor multi-terabyte index build progress right through chat.
- Instantly test your data structures. The search_nearest_neighbors tool lets you send experimental float arrays to production endpoints without writing boilerplate code.
- Know what's live and ready for traffic. Use list_index_endpoints to confirm which underlying deployed index versions are actually receiving requests.
- Get a full picture of your data assets. Running list_vector_indexes gives you an immediate inventory of every vector index in the entire project.
- Avoid configuration surprises. Calling get_index_details provides all the necessary metadata and setup info for any specific index, confirming its dimensionality and current state.
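
As a rough illustration of the dimensionality check mentioned in the last point, the sketch below compares a query vector against index metadata before a search is attempted. The `dimensions` field name and the shape of the metadata dictionary are assumptions made for this example; the actual output of get_index_details may be structured differently.

```python
import math

def check_query_vector(query_vector, index_details):
    """Return a list of problems found; an empty list means the vector looks usable."""
    problems = []
    expected = index_details["dimensions"]  # hypothetical field name
    if len(query_vector) != expected:
        problems.append(f"expected {expected} dimensions, got {len(query_vector)}")
    is_number = lambda x: isinstance(x, (int, float)) and not isinstance(x, bool)
    if not all(is_number(x) for x in query_vector):
        problems.append("every component must be a number")
    elif any(math.isnan(x) or math.isinf(x) for x in query_vector):
        problems.append("components must be finite")
    return problems

# Hypothetical metadata, loosely modeled on the 768-dimension index shown earlier.
index_details = {"display_name": "product_catalog_dense", "dimensions": 768}

print(check_query_vector([0.1] * 768, index_details))
print(check_query_vector([-0.2, 0.5, 0.0], index_details))
```

Running a check like this before searching can catch a mismatch between the embedding model and the index early, rather than relying on an error from the endpoint.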

## How It Works

The bottom line is you get deep contextual knowledge from massive datasets without ever writing boilerplate search code.

1. First, you point your agent toward the specific index or endpoint ID and provide the query vector (a JSON array of floating-point numbers).
2. Next, the MCP executes a nearest neighbor lookup against the target Google Cloud resource, comparing the query vector to millions of stored embeddings.
3. Finally, your agent receives a list of top matching data IDs along with their calculated distance scores, showing how semantically close they are.
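
To make the distance scores in the last step more concrete, here is a minimal brute-force sketch of a nearest neighbor lookup using cosine distance. It is a toy illustration only, not how the managed service is implemented: it scans every stored vector, the embeddings and IDs are invented, and the distance measure a real index uses depends on its configuration.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)

def nearest_neighbors(query, stored, k):
    """Rank every stored embedding by distance to the query and keep the k closest."""
    scored = [(item_id, cosine_distance(query, vec)) for item_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]

# Hypothetical three-dimensional embeddings keyed by data ID.
stored_embeddings = {
    "doc-a": [-0.2, 0.5, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [-0.1, 0.4, 0.1],
}

results = nearest_neighbors([-0.2, 0.5, 0.0], stored_embeddings, k=2)
for item_id, distance in results:
    print(f"{item_id}: {distance:.4f}")
```

A smaller distance means the stored item points in nearly the same direction as the query, which is why the closest IDs are listed first.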

## Frequently Asked Questions

**How do I check if an index is ready to be searched with Vertex AI Vector Search?**
Start with list_index_endpoints. This tool lists the active network endpoints and shows which deployed indexes are currently serving production traffic and can accept search queries.

**What is the difference between listing indexes and checking endpoint details with Vertex AI Vector Search?**
list_vector_indexes gives you a list of all existing datasets in the project. list_index_endpoints, however, tells you which of those indexes are currently set up to be used for live search traffic.
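
The distinction can be sketched as a simple cross-reference between the two results. The dictionaries below are hypothetical stand-ins for the tool output; field names such as `deployed_indexes` and `index_id` are assumptions for illustration, and the real responses may be shaped differently.

```python
def split_by_deployment(indexes, endpoints):
    """Return (deployed, not_deployed) lists of index IDs.

    An index counts as deployed when at least one endpoint lists it.
    """
    deployed_ids = {
        deployed["index_id"]
        for endpoint in endpoints
        for deployed in endpoint.get("deployed_indexes", [])
    }
    deployed = [idx["id"] for idx in indexes if idx["id"] in deployed_ids]
    not_deployed = [idx["id"] for idx in indexes if idx["id"] not in deployed_ids]
    return deployed, not_deployed

# Hypothetical tool results; real responses may use different field names.
indexes = [
    {"id": "idx-catalog", "display_name": "product_catalog_dense"},
    {"id": "idx-vault", "display_name": "document_vault_prod"},
]
endpoints = [
    {"id": "ep-1", "deployed_indexes": [{"id": "dep_30", "index_id": "idx-catalog"}]},
    {"id": "ep-2", "deployed_indexes": []},
]

deployed, not_deployed = split_by_deployment(indexes, endpoints)
print("searchable:", deployed)
print("not yet deployed:", not_deployed)
```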

**Can I monitor index build progress using Vertex AI Vector Search?**
Yes, use list_vector_operations. This tool lets your agent query the cloud queue and review the timeline of long-running tasks, so you know whether a multi-terabyte build is still running or has failed.
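
As a hedged sketch of how such a result could be triaged, the function below groups operations into running, failed, and succeeded. The `done` and `error` fields follow the general shape of Google Cloud long-running operation resources, but whether this MCP returns them under those names is an assumption, and the sample operations are invented.

```python
def summarize_operations(operations):
    """Group operation names into running, failed, and succeeded buckets."""
    summary = {"running": [], "failed": [], "succeeded": []}
    for op in operations:
        if not op.get("done", False):
            summary["running"].append(op["name"])  # still in progress
        elif "error" in op:
            summary["failed"].append(op["name"])  # finished with an error
        else:
            summary["succeeded"].append(op["name"])  # finished cleanly
    return summary

# Hypothetical operations list; the names and payloads are made up.
operations = [
    {"name": "operations/op-1", "done": False},
    {"name": "operations/op-2", "done": True, "error": {"code": 13, "message": "internal error"}},
    {"name": "operations/op-3", "done": True, "response": {}},
]

summary = summarize_operations(operations)
for state, names in summary.items():
    print(state, names)
```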

**Do I need to manually write API calls for semantic search with Vertex AI Vector Search?**
No. Your agent handles the complex JSON formatting and endpoint calling. You just provide the query vector, and the MCP executes the search_nearest_neighbors call.
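
For readers curious about what the agent sends on their behalf, the sketch below builds an MCP `tools/call` request in JSON-RPC 2.0 form. The envelope (`jsonrpc`, `id`, `method`, `params.name`, `params.arguments`) follows the Model Context Protocol, but the argument names inside `arguments` are hypothetical; the server's actual input schema may name them differently.

```python
import json

def build_search_request(request_id, endpoint_id, index_id, query_vector, neighbor_count):
    """Build an MCP `tools/call` request for search_nearest_neighbors.

    The keys inside "arguments" are hypothetical placeholders.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "search_nearest_neighbors",
            "arguments": {
                "endpoint_id": endpoint_id,
                "index_id": index_id,
                "query_vector": [float(x) for x in query_vector],
                "neighbor_count": neighbor_count,
            },
        },
    }

# Values taken from the prompt example earlier in this document.
request = build_search_request(1, "39xl", "dep_30", [-0.2, 0.5, 0.0], 3)
print(json.dumps(request, indent=2))
```

In normal use the client constructs this message for you; the sketch only shows the kind of structure the conversation is translated into.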

**What should I do if my index configuration seems wrong in Vertex AI Vector Search?**
Run get_index_details for that specific index. This retrieves all metadata and configuration details, allowing you to verify its dimensionality or operational settings without guessing.