R2R MCP. Query your private knowledge base directly from your AI client.
Works with every AI agent you already use
…and any MCP-compatible client
Just plug in your AI agents and start using Vinkius.
R2R connects your AI agent directly to your private knowledge base. Execute vector searches, run precise Retrieval-Augmented Generation (RAG) queries, and manage documents without leaving your chat interface.
It lets your agent ask questions against proprietary data—not just general web knowledge.
What your AI agents can do
Performs semantic similarity searches across all ingested documents using the search tool.
Runs advanced RAG queries via rag_query, summarizing answers based on retrieved vector data chunks.
Retrieves specific metadata and content details for a known document using the get_document tool.
Lists all distinct document collections available in your R2R system via the list_collections tool.
Retrieves a list of every ingested document ID and its basic metadata using the list_documents tool.
Verifies if the R2R server is operational and ready to accept vector operations with get_health.
R2R MCP Server: 6 Tools for Document & Vector Search
These six tools allow your AI client to interact directly with your R2R knowledge base. Use them to search, list metadata, and generate summaries from proprietary documents.
get_document
Retrieves full details for a single, specified document using its ID.
get_health
Pings the R2R server to confirm its operational status and readiness for use.
list_collections
Returns a list of all distinct, organized document collections available in the system.
list_documents
Lists metadata and IDs for every ingested file within a specified collection.
rag_query
Executes a specialized RAG query to summarize knowledge based on vector data retrieval.
search
Performs a semantic search against the document database, finding contextually relevant chunks of text.
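For readers curious what a tool invocation looks like on the wire, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client sends for a `tools/call` request. The tool names come from the list above; the argument name `query` and the helper `build_tool_call` are assumptions made for illustration, not the server's documented schema.

```python
import json
from itertools import count
from typing import Optional

# JSON-RPC request ids only need to be unique within a session.
_request_ids = count(1)


def build_tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# get_health takes no arguments in this sketch.
health_request = build_tool_call("get_health")

# "query" is an assumed argument name for the search tool.
search_request = build_tool_call("search", {"query": "remote work policy"})

print(json.dumps(search_request, indent=2))
```

In practice your AI client builds and sends these messages for you; the sketch only shows the shape of the request.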
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built-in DLP, auth, and compliance on every call
- Real-time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with R2R, then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 4,700+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
What you can do with this MCP connector
This server connects your AI agent directly to your private knowledge base. You can run vector searches and execute precise Retrieval-Augmented Generation (RAG) queries without ever leaving your chat interface. It lets your agent ask questions against proprietary data—not just general web knowledge.
Checking the System Status
You start by verifying everything's running right with get_health. You call this tool to ping the R2R server, confirming its operational status and making sure it’s ready for vector operations. It gives you a clean confirmation that your connection is good to go.
Mapping Out Your Data Inventory
Before querying anything, you need to know what data you're working with. Use list_collections for a rundown of all the distinct, organized document groups available in the system. This shows you the top-level categories of your knowledge. Next, once you've picked a specific collection, run list_documents. This tool lists metadata and IDs for every ingested file within that collection, giving you an inventory and basic details for everything stored.
If you need to pull the full context—the actual content and deep metadata for one specific piece of writing—you use get_document, passing in a known document ID. This retrieves all the detailed information attached to that single file record.
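The inventory walk described here (collections, then documents, then one document) can be sketched as a small function. This is an illustration under stated assumptions: `call_tool` stands in for whatever your MCP client uses to invoke a tool, and the argument names (`collection_id`, `document_id`) and response fields (`id`, `name`, `title`) are hypothetical, not the server's documented schema.

```python
from typing import Any, Callable, Optional

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed result
ToolCaller = Callable[[str, dict], Any]


def find_document(call_tool: ToolCaller, collection_name: str, title: str) -> Optional[dict]:
    """Walk collections -> documents -> one document; return None if a step misses."""
    collections = call_tool("list_collections", {})
    collection = next((c for c in collections if c["name"] == collection_name), None)
    if collection is None:
        return None

    documents = call_tool("list_documents", {"collection_id": collection["id"]})
    match = next((d for d in documents if d["title"] == title), None)
    if match is None:
        return None

    return call_tool("get_document", {"document_id": match["id"]})


def fake_call_tool(name: str, arguments: dict) -> Any:
    """In-memory stand-in for a real MCP session, with made-up data."""
    if name == "list_collections":
        return [{"id": "col-1", "name": "HR Policies"}]
    if name == "list_documents":
        if arguments["collection_id"] == "col-1":
            return [{"id": "doc-9", "title": "Remote Work Policy"}]
        return []
    if name == "get_document":
        return {"id": arguments["document_id"], "title": "Remote Work Policy"}
    raise ValueError(f"unknown tool: {name}")


document = find_document(fake_call_tool, "HR Policies", "Remote Work Policy")
print(document)
```

Returning None at each miss makes it easy to tell whether the collection or the document was absent, which is the usual first question when an ingestion is in doubt.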
Performing Searches and Generating Answers
You've checked the inventory; now you need answers. To find source material, run a semantic search using the search tool. It performs a similarity search across every ingested document, finding contextually relevant chunks of text based on meaning, not just matching keywords. This is how your agent figures out what's related to what.
When you need more than a few snippets, run a RAG query using rag_query. This tool executes the full Retrieval-Augmented Generation process, summarizing an answer from multiple retrieved chunks. Your agent gets a synthesized answer instead of a list of raw passages, which is the better fit when the question calls for a summary rather than source text.
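To make the difference between the two steps concrete, here is a toy illustration: retrieval ranks chunks by cosine similarity, and the RAG step wraps the top chunks into a prompt for an LLM. This is not R2R's implementation. The three-dimensional vectors are hand-made stand-ins for real embeddings, and the function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# (text, embedding) pairs; the embeddings are made up for illustration.
CHUNKS = [
    ("Employees may work remotely up to three days per week.", [0.9, 0.1, 0.0]),
    ("Expense reports are due by the fifth of each month.", [0.0, 0.2, 0.9]),
    ("Remote workers must use the company VPN.", [0.8, 0.3, 0.1]),
]


def search(query_vector, top_k=2):
    """Retrieval only: return the top_k chunk texts ranked by similarity."""
    ranked = sorted(CHUNKS, key=lambda chunk: cosine(query_vector, chunk[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_rag_prompt(question, query_vector, top_k=2):
    """RAG: retrieve first, then hand the chunks to an LLM as grounding context."""
    context = "\n".join(f"- {chunk}" for chunk in search(query_vector, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


query_vector = [1.0, 0.2, 0.0]  # pretend embedding of the question
top_chunks = search(query_vector)
prompt = build_rag_prompt("What is the remote work policy?", query_vector)
print(prompt)
```

With search you stop at `top_chunks`; with rag_query the server continues to the generation step and returns the model's answer.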
How You Use It in Practice
Your AI client calls one of these tools—like list_collections to see what's available first, or search when you know the topic. The R2R server processes that request against your private vector store and sends back the clean data, context, or summarized answer right into your chat workflow for final use by your agent.
It handles the complexity of proprietary data retrieval so you don't have to.
You control the entire process: checking status with get_health, scoping down your files using list_collections and then list_documents, pulling specific texts via get_document, finding relevant passages with search, or generating a comprehensive answer using rag_query. It keeps all this powerful, private data management right inside the chat interface.
How R2R MCP Works
1. You enable the R2R integration, providing your specific Base URL and Authorization Key.
2. Your AI client recognizes a user request that requires internal knowledge (e.g., 'What was our policy on X?').
3. The agent calls the appropriate tool (search or rag_query), which sends the request to the R2R server, returning a summarized, context-grounded answer.
The bottom line is: it makes your private document collection an active source of truth for your AI client, requiring zero external scripting.
Who Is R2R MCP For?
Engineers and technical staff who spend too much time context-switching between documentation systems, databases, and API tools. If you're tired of running a separate script just to ask an AI a question about your internal policies, this is for you.
Uses the server to query vector instances locally without needing Postman or writing external scripts.
Quickly verifies document ingestions and browses metadata directly inside the chat terminal, verifying what data is available for RAG.
Audits engine responses and fine-tunes semantic retrieval limits by calling specific tools like search and get_document.
What Changes When You Connect
- Bypass manual API calls: You don't need to switch to Postman or write boilerplate Python code just to check a document ID. Use the get_document tool and get details instantly.
- Go beyond keywords: Instead of relying on simple keyword matching, use the search tool to perform semantic similarity queries. The agent finds answers based on meaning.
- Verify your data sources immediately: Before running complex reports, call list_collections or list_documents. This lets you verify that the correct documents were ingested and are available.
- Handle complex questions easily: When you need a summary of advanced concepts (like 'chunking strategies'), use rag_query. It handles the retrieval-and-summary pipeline for you.
- Immediate operational checks: Start every session by running get_health. This confirms the R2R server is up and ready, saving time diagnosing connection issues.
Real-World Use Cases
Onboarding a New Employee
A new data custodian needs to know all internal policies. Instead of asking three different people for documents, the agent runs list_collections first. Then, it uses rag_query to summarize the 'Remote Work Policy,' instantly compiling a single answer from multiple sources.
Debugging an AI Knowledge Gap
An ML engineer suspects the AI is missing policy updates. They run list_documents for the 'HR Policies' collection to check the document IDs and metadata. If they don't see a recent file, they know where the data ingestion failed.
Getting Quick Policy Answers
A user asks, 'What are our holiday float days for 2026?' The agent runs search with that query. R2R finds the top three relevant text snippets from the correct policy document and presents them directly.
Pre-Flight Check on Data Integrity
A backend developer wants to ensure the system is ready for a major update. They first call get_health to verify status, then use list_collections to map out all current data silos before running any complex queries.
The Tradeoffs
Treating search and RAG as the same
Asking the agent to 'search for Q3 earnings' when you actually need a summarized answer about 'Q3 earnings impact on department X.' A plain search returns raw text snippets, which you then have to read and synthesize yourself.
→
If you need synthesis, use rag_query. This tool takes the retrieved context and forces the server to summarize it. If you just need source material, use search.
Ignoring metadata checks
Assuming that because a document exists in your folder, the AI knows about it. The agent tries to query data from an old or unindexed file.
→
Always run list_documents first for a given collection. This confirms the specific document ID and current metadata before you attempt retrieval.
Skipping health checks
Relying on an agent call that fails mid-process, forcing the user to guess if the issue is the query or the connection itself.
→
Run get_health at the start of every session. It confirms the R2R server is actively accepting vector operations.
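One way to make the health check a habit is to wrap the session so it runs once before the first real tool call. The sketch below assumes a hypothetical `call_tool(name, arguments)` function and a health response shaped like `{"status": "ok"}`; both are illustrative assumptions, not the server's documented contract.

```python
class GuardedSession:
    """Runs get_health once, before the first real tool call of the session."""

    def __init__(self, call_tool):
        self._call_tool = call_tool  # hypothetical: call_tool(name, arguments) -> dict
        self._healthy = False

    def call(self, name, arguments=None):
        if not self._healthy:
            health = self._call_tool("get_health", {})
            # Assumed response shape: {"status": "ok"} when the server is ready.
            if health.get("status") != "ok":
                raise ConnectionError(f"R2R server not ready: {health}")
            self._healthy = True
        return self._call_tool(name, arguments or {})


calls = []


def fake_call_tool(name, arguments):
    """In-memory stand-in that records which tools were called."""
    calls.append(name)
    return {"status": "ok"} if name == "get_health" else {"results": []}


session = GuardedSession(fake_call_tool)
session.call("search", {"query": "holiday float days"})
session.call("rag_query", {"query": "summarize the remote work policy"})
print(calls)
```

If a later call fails, you already know the connection was good at session start, which narrows the problem to the query itself.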
When It Fits, When It Doesn't
Use this MCP Server if your primary requirement is querying a stable, private knowledge base (internal policies, proprietary manuals, etc.). The flow is: check health -> list collections/documents -> search or query. Don't use it if you need general web information; that requires an external search API. For simple lookups by ID, skip search and call get_document directly. If your goal is just to confirm the server works, run get_health. Don't assume a complex question can be answered by a keyword-style search; default to rag_query when synthesis is needed.
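The decision flow above can be written down as a small routing function. The tool names are the six from this page; the `Need` categories and the `pick_tool` helper are hypothetical labels introduced only for this sketch.

```python
from enum import Enum
from typing import Optional


class Need(Enum):
    CONNECTIVITY = "connectivity"        # is the server up?
    INVENTORY = "inventory"              # what data exists?
    KNOWN_DOCUMENT = "known_document"    # lookup by document ID
    SOURCE_PASSAGES = "source_passages"  # raw relevant snippets
    SYNTHESIS = "synthesis"              # one summarized answer


def pick_tool(need: Need, collection_id: Optional[str] = None) -> str:
    """Map what the user needs to the tool that fits it."""
    if need is Need.CONNECTIVITY:
        return "get_health"
    if need is Need.INVENTORY:
        # Drill into one collection if it is known, otherwise list them all.
        return "list_documents" if collection_id else "list_collections"
    if need is Need.KNOWN_DOCUMENT:
        return "get_document"
    if need is Need.SOURCE_PASSAGES:
        return "search"
    return "rag_query"


print(pick_tool(Need.SYNTHESIS))
```

An agent makes this choice from the conversation; spelling it out is mainly useful when you review which tool it picked and why.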
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by R2R. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
- Cloud Hosted: Managed infra
- V8 Isolated: Sandboxed per request
- Zero-Trust Proxy: No stored credentials
- DLP Enforced: Policy on every call
- GDPR Compliant: EU data residency
- Token Compression: ~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 6 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Available Capabilities
Accessing internal knowledge shouldn't require three different dashboards and five manual API calls.
Today, if your AI needs policy information, you probably have to log into the document management system, find the right collection, manually verify the file ID, run a search query on that specific endpoint, and then copy-paste the results into the chat. It’s slow, it breaks easily, and it's exhausting.
With this MCP Server, you just ask your agent: 'What is the current policy?' The agent runs `search` or `rag_query` internally. You get a single, sourced answer—no dashboard hopping required.
RAG Query: Get summarized answers without building complex payloads.
Before this server, asking for an advanced summary meant manually structuring the prompt, defining context windows, and managing retrieval parameters—all before you even hit 'send.' It was a process reserved for dedicated dev scripts.
Now, calling `rag_query` handles all that. You just ask the question. The R2R server manages the complex steps of vector retrieval, chunking, and summarization so your agent delivers clean, actionable text.
Common Questions About R2R MCP
What URL should I use for the R2R API URL?
If you are running R2R locally via Docker, it's typically http://localhost:7272. If you are using SciPhi Cloud or have it deployed on your own infrastructure, provide the exact public or private endpoint.
Do I need an R2R API Key?
It depends on your deployment. Open deployments for local testing may not require a key. Production deployments or SciPhi Cloud environments require you to provide the generated key.
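The two settings from the answers above (a base URL and an optional key) can be modeled as a small config object. This is a sketch only: the `R2RSettings` class is hypothetical, and sending the key as an `Authorization: Bearer` header is an assumption about the deployment, so check how your R2R instance expects credentials.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class R2RSettings:
    """Connection settings: base URL plus an optional API key."""

    base_url: str = "http://localhost:7272"  # typical local Docker address
    api_key: Optional[str] = None            # open local deployments may not need one

    def __post_init__(self):
        parsed = urlparse(self.base_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not a valid HTTP(S) URL: {self.base_url!r}")

    def headers(self) -> dict:
        headers = {"Content-Type": "application/json"}
        if self.api_key:
            # Assumption: the key is sent as a Bearer token.
            headers["Authorization"] = f"Bearer {self.api_key}"
        return headers


local = R2RSettings()
hosted = R2RSettings(base_url="https://r2r.example.com", api_key="example-key")
print(local.headers())
print(hosted.headers())
```

Validating the URL up front catches the common mistake of leaving off the scheme (for example `localhost:7272`) before any request is sent.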
What is the difference between RAG and Search?
The search tool runs a standard vector similarity match and returns relevant raw snippets from your database. The rag_query tool asks the R2R server to perform that search and then use an LLM to generate an answer grounded in those snippets.
Are document ingestions possible via chat?
No. This integration is designed for read-only, observational tools (listing documents, inspecting state, querying the index). Heavy ingestion of PDFs or websites should be handled through scripts or the dashboard.
How do I verify that the R2R server is operational using the `get_health` tool?
The get_health tool confirms active connectivity. A successful response, like status: ok, means your AI client can send vector operations without connection issues.
What information does running `list_documents` provide about my knowledge base?
list_documents gives you an inventory of every file ingested into the R2R system. It lists document IDs and names, allowing you to know what data is available before querying it.
What should I do if my `rag_query` request fails or times out?
If a query fails, check your server logs for specific errors. Large queries might hit rate limits; try breaking the task down into smaller chunks to prevent timeouts.
Using `get_document`, what metadata can I retrieve about a single file?
The get_document tool pulls specific details and attributes for one known document ID. You get structured data like creation dates, collection IDs, or author info.
Use it with your favorite AI tools
Connect this server to Cursor, Claude, VS Code, and more.