# LanceDB MCP

> LanceDB MCP gives your agent full control over serverless vector storage and RAG infrastructure. You can perform high-accuracy similarity searches, provision new columnar tables with precise schemas, and ingest multi-modal embeddings—all through natural conversation. It’s how you manage complex vectorized data without writing manual Python scripts.

## Overview
- **Category:** loved-by-devs
- **Price:** Free
- **Tags:** vector-search, embeddings, rag, serverless-db, multi-modal, data-indexing

## Description

Managing structured vectors used for Retrieval-Augmented Generation (RAG) is usually a pain point. You're constantly scripting schema checks, running different similarity lookups, and manually managing which embeddings belong where. This MCP lets your agent handle all of that from natural conversation.

Instead of opening a local client or writing boilerplate Python code just to see what tables exist, you ask your agent. It connects to the database endpoint, reads metadata, sets up schemas with specific Apache Arrow types, and keeps the underlying index current as you feed it new data. Because Vinkius hosts this MCP, you connect once from any compatible client in the catalog, which gives your workflow a central point for all vector operations.

## Tools

### create_table
Builds a new LanceDB table structure with a defined schema.

### delete_table
Permanently removes an entire vector table from the database.

### get_table
Retrieves the precise schema and metadata for a specific existing table.

### insert_rows
Adds structured row payloads and vectors to a table, updating the ANN index in real time.

### list_tables
Lists all vectorized tables that currently reside within your LanceDB instance.

### vector_search
Runs an optimized K-Nearest Neighbor (KNN) search to find semantically related data.

## Prompt Examples

**Prompt:** 
```
List all active tables in my LanceDB instance
```

**Response:** 
```
I've retrieved your vector tables. Active tables include 'customer_docs', 'product_embeddings', 'support_kb', and 'user_logs'. Which one would you like to inspect the schema for?
```

**Prompt:** 
```
Perform a vector search in 'product_embeddings' for this vector: [0.1, 0.2, ...]
```

**Response:** 
```
Executing similarity search… I've found the 5 most relevant products. Highlights include 'Cloud-Native Sneakers' (Distance: 0.12), 'Edge Performance Runner' (Distance: 0.15), and 'Vector Pro Trainer' (Distance: 0.18). All results are now available in your workspace.
```

**Prompt:** 
```
Show me the schema for the 'support_kb' table
```

**Response:** 
```
Schema for 'support_kb': The table has 4 columns: 'id' (String), 'text' (String), 'vector' (Float32, Dimensions: 1536), and 'metadata' (JSON). It is currently using an IVF-PQ index for optimized ANN lookups.
```

## Capabilities

### Execute Vector Similarity Search
Find semantically related rows by running highly-optimized K-Nearest Neighbor lookups against existing embeddings.

### List and Inspect Tables
See every vectorized table in the database and retrieve its exact schema metadata, including vector dimensions.

### Create New Vector Schemas
Provision an entirely new columnar table, defining a precise Apache Arrow schema for your multi-modal AI data.

### Ingest Structured Payloads
Insert new structured rows and their corresponding vectors into existing tables, updating the underlying ANN index automatically.

### Clean Up Data Storage
Irreversibly delete entire vector tables to maintain a clean, optimized database environment.

## Use Cases

### Analyzing old customer support documentation
A developer needs to find all articles related to 'API rate limits' across three different knowledge bases. Instead of running three separate search scripts, they ask their agent: 'Perform a vector search on the support_kb table for API rate limits.' The agent executes `vector_search` and returns the top results.

### Building a new product catalog feature
An architect needs to store embeddings for newly digitized product manuals. They first tell their agent: 'Create a table named product_manuals with string IDs, text content, and vector fields.' The agent runs `create_table`, setting up the necessary structure before any data is inserted.

### Cleaning up stale experimental data
A data engineer finishes a test using temporary embeddings. They tell their agent: 'I'm done with the old experiment table, delete it.' The agent runs `delete_table`, so the stale data no longer takes up storage or clutters the table list.

### Inspecting a database before deployment
A team lead needs to verify the schema of an existing table called 'user_profiles'. They ask their agent, which runs `get_table`, providing immediate confirmation of the column types and vector dimensions without needing to query the API directly.

## Benefits

- Avoid manual scripting for retrieval. Use `vector_search` to get relevant, semantically related results instantly, no boilerplate needed.
- Define your data structure upfront. Use `create_table` to provision new vector tables with specific Apache Arrow schemas before any data goes in.
- Keep your data clean and optimized. When a table is retired, use `delete_table` to remove it permanently and avoid clutter.
- Track everything easily. Run `list_tables` to see which tables exist, and `get_table` to see the exact schema and metadata of any one of them.
- Process data in bulk. Use `insert_rows` when you have a batch of records ready; the tool updates the underlying index automatically.

## How It Works

In short, you manage vector storage with natural-language requests instead of writing client-side database queries.

1. Subscribe to this MCP on Vinkius and provide your LanceDB API URL, API Key, and Database Name.
2. Your agent connects using those credentials. It validates the connection and maps out all available vector tables.
3. You instruct your agent what you want—like 'search for documents about quantum physics' or 'create a new table for product metadata'—and it executes the necessary operations.
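
Behind step 3, an MCP client sends each tool invocation as a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `vector_search`. The argument names `table`, `vector`, and `k` are assumptions for illustration, since this page does not list the tool's exact input schema.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id per call


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical argument names; check the tool's published input schema.
request = build_tool_call(
    "vector_search",
    {"table": "support_kb", "vector": [0.1, 0.2, 0.3], "k": 5},
)
print(request)
```

In normal use your agent's MCP client produces these messages for you; the sketch only shows what a natural-language request is translated into.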

## Frequently Asked Questions

**Can I perform a semantic similarity search using my agent?**
Yes. Use the `vector_search` tool by providing the target table name and a JSON array of floating-point numbers representing your query embedding. Your agent will return the k-nearest rows from LanceDB based on semantic similarity.
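
To make "k-nearest rows" concrete, here is a brute-force sketch in plain Python. It is a conceptual illustration only: LanceDB answers the same question with an optimized index rather than a full scan, and the row layout, the function names, and the choice of Euclidean distance are all assumptions made for this example.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_search(rows: list[dict], query: list[float], k: int) -> list[dict]:
    """Return the k rows closest to `query`, nearest first."""
    scored = [
        {**row, "distance": l2_distance(row["vector"], query)} for row in rows
    ]
    return sorted(scored, key=lambda r: r["distance"])[:k]


# Toy two-dimensional data; real embeddings have hundreds of dimensions.
rows = [
    {"id": "a", "vector": [0.0, 0.0]},
    {"id": "b", "vector": [1.0, 1.0]},
    {"id": "c", "vector": [0.2, 0.1]},
]
for hit in knn_search(rows, query=[0.1, 0.1], k=2):
    print(hit["id"], round(hit["distance"], 3))
```

A smaller distance means a closer match, which is why the results in the prompt examples are listed in ascending order of distance.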

**How do I create a new table with a specific Apache Arrow schema?**
The `create_table` tool allows your agent to initialize a new columnar vector table. You just need to provide the desired table name and a valid Apache Arrow schema mapping in JSON format that defines the vector dimensions and scalar fields.
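
As a rough sketch, a schema mapping for the `support_kb` table from the prompt examples might look like the following. The key names (`fields`, `type`, `dimension`) and the type strings are illustrative assumptions, not the tool's documented format; the only firm point is that an embedding column is conventionally modeled in Arrow as a fixed-size list of `float32` values.

```python
import json

# Hypothetical schema mapping: key names and type strings are assumptions.
support_kb_schema = {
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "text", "type": "string"},
        {"name": "vector", "type": "fixed_size_list<float32>", "dimension": 1536},
        {"name": "metadata", "type": "string"},  # assumption: JSON kept as text
    ]
}


def vector_dimension(schema: dict, column: str = "vector") -> int:
    """Look up the declared dimension of a vector column."""
    for field in schema["fields"]:
        if field["name"] == column:
            return field["dimension"]
    raise KeyError(f"no column named {column!r}")


print(json.dumps({"table": "support_kb", "schema": support_kb_schema}, indent=2))
print(vector_dimension(support_kb_schema))
```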

**Can my agent insert new embeddings directly into a LanceDB table?**
Absolutely. Use the `insert_rows` tool to persist new data rows containing native embeddings and arbitrary JSON metadata. Your agent will handle the payload delivery, and LanceDB will automatically update its ANN index.
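
Because every vector in a table must match the dimension declared in its schema, it can help to check a batch before handing it to your agent. The helper below is a minimal sketch of that check. `validate_rows`, the payload keys `table` and `rows`, and the sample rows are all made up for illustration.

```python
import json


def validate_rows(rows: list[dict], dimension: int, column: str = "vector") -> list[dict]:
    """Raise ValueError if any row's vector is missing or has the wrong length."""
    for index, row in enumerate(rows):
        vector = row.get(column)
        if vector is None:
            raise ValueError(f"row {index} has no {column!r} field")
        if len(vector) != dimension:
            raise ValueError(
                f"row {index}: expected {dimension} values, got {len(vector)}"
            )
    return rows


# Toy rows with three-dimensional vectors, for illustration only.
rows = [
    {"id": "doc-1", "text": "first sample document", "vector": [0.1, 0.2, 0.3]},
    {"id": "doc-2", "text": "second sample document", "vector": [0.4, 0.5, 0.6]},
]
payload = json.dumps({"table": "support_kb", "rows": validate_rows(rows, dimension=3)})
print(payload)
```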

**How do I use `list_tables` to audit which vector tables are currently active in my LanceDB instance?**
Ask your agent to run `list_tables`. It returns the names of all existing tables, so you can quickly verify what the database currently contains before making any changes.

**What specific metadata can `get_table` provide for a vector table I plan to use?**
It returns detailed schema information, including column types, vector dimensions, and the index type (such as IVF-PQ). Check this before you insert rows or run searches against the table.

**If I run `delete_table`, can I recover the data, or is the loss irreversible?**
The deletion is irreversible. It immediately removes the entire table structure along with all associated vectors and rows. Use it only when you are certain the data must be purged.

**When I use `insert_rows`, does it guarantee that the underlying ANN index updates correctly?**
Yes. `insert_rows` is designed for incremental updates: it writes the structured payloads and vectors and refreshes the underlying ANN index as part of the same operation.

**How can I provision a new table using `create_table` if my data has non-standard dimensions?**
You must declare the vector column's dimension explicitly in the Apache Arrow schema you pass to `create_table`. This strict definition lets LanceDB store and index the vectors consistently, whatever dimension your embedding model produces.