Vald MCP for AI. Run Semantic Search on Vector Embeddings

Works with every AI agent you already use

Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client

Connect to your AI in seconds.

Vald connects your AI agent to a high-speed, distributed vector knowledge base. It lets you perform approximate nearest neighbor (ANN) searches across millions of embedded data points directly from your conversational workflow.

Use it to query vectors with `search_vectors`, manage indices with `insert_vector` and `update_vector`, or check the cluster health using `get_engine_info`.

It's built for ML engineers who need reliable, deep context retrieval.

What your AI can do

Get engine info

Retrieves operational information and checks the current health status of the Vald engine cluster.

Get vector details

Pulls the raw vector data for a specific record ID so you can inspect its dimensions or values.

Insert vector

Inserts a brand new vector into the Vald index, requiring both a unique ID and the full embedding array.

+ 3 more capabilities included

Semantic Search

You query the Vald index using your agent and get back the most semantically similar vectors from millions of records.

Add Vectors to Index

Your agent calls insert_vector to add a new vector, complete with a unique ID, directly into the Vald cluster for future retrieval.

Modify Existing Records

You use update_vector to change an existing record's embedding array without disrupting active connections or queries.

Remove Corrupted Data

Your agent executes delete_vector when it needs to permanently purge a vector from the index, making sure it can't be found again.

Check Cluster Status

You call get_engine_info to retrieve operational data and confirm that the entire Vald cluster is healthy and accepting requests.

Vald: 6 Tools for Vector Index Management

These six tools give your agent full control over the lifecycle of your embedded data—from searching to deleting.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Vald on Vinkius

Get Engine Info

Retrieves operational information and checks the current health status of the Vald engine cluster.

Get Vector Details

Pulls the raw vector data for a specific record ID so you can inspect its dimensions or values.

Insert Vector

Inserts a brand new vector into the Vald index, requiring both a unique ID and the full embedding array.

Delete Vector

Permanently removes a specified vector from the Vald index. This action cannot be...

Update Vector

Replaces an existing vector record with new data by providing the original ID and...

Search Vectors

Performs a nearest neighbor search using a query vector, returning the most similar vectors in the index.

Security and governance baked right in.

Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, delivering out-of-the-box support for Streamable HTTP, SSE, and OAuth2, with zero messy routing required.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The Vald integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "vald": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Vald tools with full Vinkius guardrails applied.

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"vald": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

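Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint. Its get_tools method is a coroutine, so it has to be awaited:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

async def load_tools():
    # Connect to Vinkius over streamable HTTP
    client = MultiServerMCPClient({
        "vald": {"url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp", "transport": "streamable_http"}
    })
    return await client.get_tools()

tools = asyncio.run(load_tools())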

CrewAI

Define the Tool

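Load the Vinkius MCP tools into your CrewAI agents. This version uses MCPServerAdapter from crewai-tools (installed with pip install 'crewai-tools[mcp]'):

from crewai import Agent
from crewai_tools import MCPServerAdapter

# Connect securely to Vinkius over streamable HTTP
server_params = {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable-http",
}

with MCPServerAdapter(server_params) as vinkius_tools:
    # Assign to Agent (CrewAI also requires a goal and a backstory)
    researcher = Agent(
        role='Data Researcher',
        goal='Answer questions using the Vald vector index',
        backstory='An analyst with access to the Vald MCP tools.',
        tools=vinkius_tools,
    )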

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Vald, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Vald. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 6 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

Searching a knowledge base shouldn't feel like digging through file cabinets.

Today, if your agent needs context from a massive document library, it often has to run multiple database queries or rely on keyword matching. This is slow. You copy-paste metadata; you wait for the system to filter by date *and* department *and* topic. It's clicks, tabs, and half an hour of waiting.

With Vald MCP Server, that process vanishes. Your agent sends a query vector through `search_vectors`. The engine searches millions of vectors and returns only what's semantically relevant, with no manual filtering required.

Using Vald to manage vector indices with `insert_vector`.

The old way involved writing complex, dedicated scripts just to append a new piece of knowledge. You'd have to handle ID conflicts and format the data for the backend terminal—a tedious setup every time you added a source document.

Now, calling `insert_vector` is enough. Your agent handles the whole lifecycle: it takes the raw content, generates the embedding, and pushes it into Vald with one command. You're back to just asking questions.

Support 24/7 support@vinkius.com ↗

Security Vinkius Trust Center ↗

SLA Service Level Agreement ↗

Report Listing Send Report ↗

What your AI can actually do with this

Vald connects your agent to a high-speed, distributed vector knowledge base. It lets you perform approximate nearest neighbor (ANN) searches across millions of embedded data points right from your conversational workflow. It's built for ML engineers who need deep context retrieval that doesn't stall the whole system.

The Vald MCP Server exposes tools to manage massive collections of embeddings, treating them like a lightning-fast knowledge graph instead of traditional database tables. You interact with it through the tool functions exposed in your AI client.

When you need deep context for an answer, you use search_vectors. This function takes a query vector from your agent and instantly pulls back the most semantically similar vectors stored in the index. It doesn't just find keywords; it finds concepts that mean the same thing. If you want to inspect what raw data belongs to a specific record after a search, you can call get_vector_details, passing in a unique ID so your agent pulls the full dimensional array for inspection.
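To make "most semantically similar" concrete, here is a minimal sketch of an exact nearest neighbor search by cosine similarity in plain Python. It is not how Vald works internally: an ANN engine avoids scanning every vector, whereas this sketch compares the query against all of them. The `cosine_similarity` and `top_k` helpers, the record IDs, and the 3-dimensional vectors are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, index, k):
    """Return the k (id, score) pairs most similar to query, best first."""
    scored = [(vec_id, cosine_similarity(query, vec)) for vec_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy index: record ID -> embedding
index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

results = top_k([0.95, 0.0, 0.05], index, k=2)
print([vec_id for vec_id, _ in results])  # ['doc-a', 'doc-b']
```

Real embeddings typically have hundreds or thousands of dimensions, which is why an approximate index becomes necessary once the collection grows.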

Managing the data itself is straightforward. To add brand new knowledge, your agent uses insert_vector. This requires two things: a guaranteed unique ID and the complete embedding array. The cluster accepts this data immediately for future retrieval. If you change an existing record's meaning or context, you don't have to recreate it; you just call update_vector, providing both the original ID and the replacement embedding array. This changes the vector without disrupting any active connections or queries.

Cleanup is equally easy. When data gets corrupted or becomes irrelevant, your agent executes delete_vector. You pass a specific ID, and the server permanently purges that vector from the Vald index—it’s gone for good. For system reliability, you check the entire infrastructure status by calling get_engine_info. This function retrieves operational data and confirms that the whole Vald cluster is healthy and ready to accept requests.

You've got all the tools here to keep your knowledge base accurate, fast, and robust.

Built · Hosted · Managed by Vinkius Vald MCP Server - Manage Vector Embeddings

Server ID 019d761a-af6b-704b-90dc-6ac85da2dba3

Vinkius Inspector

Compliance Grade A+

Score 100/100

Report View Report ↗

Here's how it actually works

The bottom line is: Your AI agent becomes the direct API layer to a massive, high-speed knowledge graph that was previously hard to access.

Subscribe to the Vald server on Vinkius Marketplace. Then, plug in your specific Vald Gateway Host address.

Your AI client sends a command—say, running search_vectors with a query array—to the MCP endpoint.

Vald processes the request against its distributed engine and returns the nearest neighbor vector data to your agent.
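As a rough sketch of the middle step, the message an MCP client sends for a tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The helper name `build_tool_call` and the argument names `vector` and `num` are assumptions made for illustration; the real argument names come from the input schema the server publishes for `search_vectors`.

```python
import json

def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Argument names below are assumptions; check the tool's published input schema.
request = build_tool_call(
    request_id=1,
    tool_name="search_vectors",
    arguments={"vector": [0.12, -0.48, 0.33], "num": 5},
)
print(json.dumps(request, indent=2))
```

In practice your AI client builds and sends this message for you; the sketch only shows what travels over the wire.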

Who is this actually for?

This is for the ML Engineer who needs to validate embedding changes on the fly without writing setup scripts. It's also for the Data Scientist stuck in a Jupyter notebook, needing to run 'top-k' semantic queries directly from an IDE. Or maybe the Ops Engineer who just wants to check if the whole cluster is up by asking it naturally.

Machine Learning Engineer

You use this server to test and visualize embedding changes against a live Vald instance without having to write custom scripts.

Data Scientist

You execute on-the-fly search_vectors calls directly from an IDE environment to validate your search recall results immediately.

DevOps Engineer

You check the active engine health status and cluster info using natural language when you suspect anomalies, running get_engine_info instantly.

What Changes When You Connect

You get instant, deep context retrieval. Instead of relying on simple keywords, running search_vectors uses semantic matching to pull relevant data points from millions of vectors.

Data cleanup is fast and controlled. Need to purge corrupted or outdated embeddings? Use delete_vector to remove specific records permanently without touching the rest of your index.

You maintain a clean system state by running get_engine_info. This lets you check cluster health and node details via natural language, which saves you from manually checking dashboards when things go sideways.

Updating context is easy. Use update_vector to swap out an old embedding representation for a new one, keeping your RAG pipeline current without downtime or complex database writes.

You gain full visibility into raw data structures via get_vector_details. This lets you verify the exact dimensions and float values of any vector by simply querying its ID.

See it in action

01

Validating RAG context in real-time

A Data Scientist needs to confirm if a new knowledge document embeds correctly. Instead of running slow scripts, they ask their agent: 'Search for vectors related to Q3 sales using this query vector.' The agent runs search_vectors and instantly validates the search recall results.

02

Migrating old data records

A Backend Developer finds a legacy record with an outdated embedding. They use get_vector_details with the record's ID to inspect the stored vector, then run update_vector with the same ID and the newly generated array. This updates the context source without needing terminal access.

03

Debugging cluster instability

An Ops Engineer notices latency spikes. They ask their agent: 'What's the status of the Vald cluster?' The agent runs get_engine_info, immediately diagnosing if it’s a node issue or just a temporary glitch.

04

Curating knowledge bases

An ML Engineer wants to test how an embedding change affects search results. They use insert_vector to add the new vector, then immediately follow up with a search_vectors call to compare the output against the old data.
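One simple way to compare the two result sets is to measure how many of the old top-k IDs survive in the new top-k. The sketch below assumes you have already collected the ranked IDs from two search_vectors runs; the `overlap_at_k` helper and the document IDs are hypothetical.

```python
def overlap_at_k(old_ids, new_ids, k):
    """Fraction of the old top-k IDs that still appear in the new top-k."""
    old_top, new_top = set(old_ids[:k]), set(new_ids[:k])
    if not old_top:
        return 0.0
    return len(old_top & new_top) / len(old_top)

# Ranked result IDs before and after the embedding change (made-up values)
before = ["doc-1", "doc-2", "doc-3", "doc-4"]
after = ["doc-2", "doc-9", "doc-1", "doc-7"]

score = overlap_at_k(before, after, k=3)
print(round(score, 3))  # 0.667
```

A score near 1.0 means the change barely moved the top results; a low score is a cue to inspect the new neighbors by hand.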

The honest tradeoffs

Treating vectors like SQL rows

Anti-pattern

Trying to run complex JOINs or WHERE clauses on vector metadata. The tools here offer no SQL-style filtering: you either search by similarity or fetch a single record by its ID.

The Fix

You don't query by traditional fields; you search for similarity. Use search_vectors with a target query array to find nearest neighbors, or use get_vector_details if you only need the raw vector data for ID lookup.

Ignoring cluster health

Anti-pattern

Assuming that because your application code runs fine, the underlying Vald engine is stable. You might get weird search results without knowing why.

The Fix

Before building anything, always check the system status. Run get_engine_info to confirm all nodes are active and the cluster is healthy.

Updating data manually

Anti-pattern

Copying vector IDs or embedding arrays into a spreadsheet and trying to re-upload them one by one, which is slow and error-prone.

The Fix

Use update_vector via your agent. You provide the existing ID and the new array in one call; Vald handles the atomic replacement.

When It Fits, When It Doesn't

Use this server if your primary problem involves context retrieval, semantic search, or managing high-dimensional embeddings (vectors). If you are building a RAG pipeline that needs to pull nuanced data from millions of records—not just searching by exact text match—you need Vald. Don't use it if all you need is simple key/value storage or structured SQL querying; use a standard database service for that. But if your source material is massive, unstructured, and requires deep contextual understanding (like large codebases or long documents), then the vector management tools like search_vectors, insert_vector, and update_vector are exactly what you need.

Questions you might have

How do I check if my Vald cluster is running correctly using `get_engine_info`?

Just tell your agent, 'Check the status of the Vald cluster.' The tool runs get_engine_info and returns operational details, confirming things like node health and overall availability.

What is the difference between `search_vectors` and standard database search?

search_vectors doesn't look for keywords; it finds conceptual matches. It uses vector math to determine how semantically close one piece of data is to a query, giving you true contextual relevance.

Can I modify vectors that are already in the index using `update_vector`?

Yes. You provide the existing ID and the new vector array, and Vald replaces the old representation entirely. It's designed for non-disruptive updates.

What if I need to remove a single piece of data? Should I use `delete_vector`?

Yes. If you need to permanently purge a record, always run delete_vector. It's irreversible and ensures the ID is removed from all future searches.

What information must I provide when running `insert_vector`?

You need two things: a unique ID and the vector data formatted as a JSON array. The ID acts as the primary key, so every record can be retrieved directly by its ID.
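A small pre-flight check can catch malformed input before the agent calls insert_vector. The sketch below is only an illustration: the `build_insert_arguments` helper, the `expected_dim` parameter, and the field names `id` and `vector` are assumptions, not the tool's documented schema.

```python
import json
import math

def build_insert_arguments(vector_id, vector, expected_dim):
    """Validate an ID/embedding pair and return it as a JSON-ready dict."""
    if not isinstance(vector_id, str) or not vector_id.strip():
        raise ValueError("vector_id must be a non-empty string")
    if len(vector) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(vector)}")
    values = [float(x) for x in vector]
    if not all(math.isfinite(x) for x in values):
        raise ValueError("vector contains NaN or infinite values")
    # Field names "id" and "vector" are assumptions, not the documented schema.
    return {"id": vector_id, "vector": values}

args = build_insert_arguments("doc-xyz", [0.1, 0.2, 0.3], expected_dim=3)
print(json.dumps(args))  # {"id": "doc-xyz", "vector": [0.1, 0.2, 0.3]}
```

Checking the dimension up front matters because every vector in an index is expected to have the same length.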

If I run `search_vectors` repeatedly on millions of records, will it slow down?

It should stay fast. Vald uses an Approximate Nearest Neighbor (ANN) engine, which avoids comparing the query against every stored vector and trades a small amount of exactness for speed. It's designed for high-speed retrieval across huge vector spaces.

What happens if I try to insert a vector using `insert_vector` that shares an ID with another record?

The system expects unique IDs. If you attempt to use an existing ID, the operation will fail or overwrite data, depending on your client's logic. Use update_vector if you just need to change the embedding.
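The insert-versus-update decision can be sketched with a toy in-memory index. This is a stand-in written for illustration, not the Vald API: the `InMemoryIndex` class and its methods are hypothetical, and a real agent would make the same choice between insert_vector and update_vector.

```python
class InMemoryIndex:
    """Toy stand-in for a vector index keyed by unique ID (not the Vald API)."""

    def __init__(self):
        self._vectors = {}

    def insert(self, vector_id, vector):
        # Refuse to clobber an existing record; callers must update explicitly.
        if vector_id in self._vectors:
            raise KeyError(f"ID already exists: {vector_id}")
        self._vectors[vector_id] = list(vector)

    def update(self, vector_id, vector):
        # Updating requires the record to exist already.
        if vector_id not in self._vectors:
            raise KeyError(f"unknown ID: {vector_id}")
        self._vectors[vector_id] = list(vector)

    def upsert(self, vector_id, vector):
        """Insert when the ID is new, otherwise update. Returns the action taken."""
        if vector_id in self._vectors:
            self.update(vector_id, vector)
            return "updated"
        self.insert(vector_id, vector)
        return "inserted"

    def get(self, vector_id):
        return self._vectors[vector_id]

index = InMemoryIndex()
print(index.upsert("doc-1", [0.1, 0.2]))  # inserted
print(index.upsert("doc-1", [0.3, 0.4]))  # updated
```

Making the overwrite explicit keeps a duplicate ID from silently replacing a record you meant to keep.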

What format is the raw vector data I get from `get_vector_details`?

The output is always a pure, high-dimensional array of floating-point numbers. This isn't human-readable text; it’s the mathematical embedding that defines the record’s meaning.

Can my AI agent do a semantic search across my vector database?

Yes. Provided you supply the embedded query vector, your agent can issue a search_vectors call to the Vald engine. It searches millions of indexed vectors using its ANN algorithms and returns the top-K closest neighbors to your query.

How do I ensure my Vald cluster is healthy right from my CLI?

Skip complex diagnostic loops. Instruct your agent to get the Vald engine info. It will interface directly via gRPC/REST and pull down cluster metrics including operational status, agent versions, and basic diagnostic health. This is useful for MLOps teams running production RAG pipelines that need continuous health checks.

Can I permanently purge a corrupted vector embedding?

Yes. When a document becomes stale or corrupted in your knowledge base, you should remove its embedding. Ask the AI agent: permanently delete vector ID 'doc-xyz'. Using the delete_vector tool, it targets your cluster and removes the outdated vector without affecting other data in the cluster.
