# Haystack (deepset Cloud) MCP

> Haystack (deepset Cloud) MCP lets your AI agent manage complex Retrieval-Augmented Generation (RAG) pipelines and search massive document sets. You can list isolated workspaces, run full-scale NLP topologies, trigger immediate vector searches, and inspect metadata attached to source documents, all through natural conversation.

## Overview
- **Category:** friends-mcp
- **Price:** Free
- **Tags:** rag-pipelines, llm-framework, enterprise-search, nlp-topologies, embedding-models, document-retrieval

## Description

This connector gives your agent direct access to the RAG pipelines running in your deepset Cloud account. Instead of hand-building API calls every time you need context, you ask your agent. You can list the isolated environments (workspaces) used for different projects, then inspect each pipeline's full NLP topology to see exactly where embedding nodes and retriever logic sit. To test a pipeline, ask your agent to run a search against it, which dispatches an immediate LLM or Retriever invocation against your data.

Connecting this MCP through Vinkius covers the whole flow, from listing files and checking metadata to triggering dense vector searches across enterprise knowledge bases. The goal is for your agent's answers to come from verifiable sources in your indexed documents rather than from guesses.

## Tools

### get_file
Retrieves specific metadata attached to an uploaded source file.

### get_pipeline
Fetches detailed information about a single, existing AI pipeline topology.

### list_files
Provides a list of all files that have been uploaded to the knowledge base.

### list_pipelines
Generates a comprehensive list of all active AI pipelines available in your account.

### list_workspaces
Lists the separate, isolated environments used for different search contexts.

### run_pipeline
Executes a full AI pipeline search using specific parameters to test retrieval logic.

### search_documents
Triggers a dense or sparse vector search across all indexed enterprise documents.
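
Under the hood, an MCP client invokes any of these tools with a JSON-RPC 2.0 `tools/call` request. The following is a minimal sketch of what such a request could look like for `search_documents`. The argument names (`workspace`, `query`) are assumptions for illustration; the actual input schema is whatever the server advertises for each tool.

```python
import json
from typing import Any


def build_tool_call(name: str, arguments: dict[str, Any], request_id: int = 1) -> dict[str, Any]:
    """Build an MCP `tools/call` request (JSON-RPC 2.0) for a single tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# NOTE: "workspace" and "query" are assumed argument names, not a documented schema.
request = build_tool_call(
    "search_documents",
    {"workspace": "production", "query": "AI security compliance"},
)
print(json.dumps(request, indent=2))
```

In normal use you never write this by hand; your AI client produces the request when you ask for a search in conversation.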

## Prompt Examples

**Prompt:** 
```
List all pipelines in my 'production' workspace
```

**Response:** 
```
I found 3 pipelines in your production workspace: 'default-rag', 'semantic-search', and 'hybrid-retriever'. Which one would you like to inspect?
```

**Prompt:** 
```
Run a search for 'AI security compliance' using the default-rag pipeline
```

**Response:** 
```
Running search… Based on your indexed documents, I found 5 relevant snippets. The main compliance requirements are… [results summary]. Would you like to see the source file IDs?
```

**Prompt:** 
```
List the files in my 'knowledge-base' workspace
```

**Response:** 
```
You have 12 files in the 'knowledge-base' workspace. Recent uploads include 'q4_report.pdf', 'security_guidelines.md', and 'api_docs.txt'. I can fetch metadata for any of these files.
```

## Capabilities

### Manage isolated environments
You can list available workspaces, keeping different search contexts and projects separate.

### Inspect data structures
The tool lets you view the details of existing AI pipelines or get metadata about source files.

### Run search tests
You can dispatch an immediate pipeline run to test retrieval logic and see what results come back from your indexed knowledge.

### Search large datasets
The agent triggers dense or sparse vector searches directly over all the documents you’ve uploaded into the index.

## Use Cases

### Debugging a poor answer
The team noticed an agent was giving outdated compliance advice. They used `list_workspaces` to find the correct 'Compliance' environment, then triggered a focused search using `run_pipeline`. The results pointed directly to an incorrectly indexed file, allowing them to fix the source data.

### Onboarding new knowledge
A product team added 50 new technical manuals. Instead of manually checking every document, they used `list_files` and then ran `search_documents` with a query to ensure the embeddings successfully captured key terms from all the new material.

### Comparing two models
A data scientist wanted to compare how two different retrieval mechanisms performed. They used `list_pipelines` to find both 'Model A' and 'Model B', then ran separate searches using `run_pipeline` on the same query, allowing a direct comparison of snippet quality.
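
One way to make that comparison less subjective is to measure how much the two result sets overlap. The sketch below assumes you have already extracted the source file IDs from each `run_pipeline` response into plain lists; the IDs and the response shape are hypothetical.

```python
def jaccard_overlap(ids_a: list[str], ids_b: list[str]) -> float:
    """Return |A ∩ B| / |A ∪ B| for two lists of retrieved file IDs."""
    set_a, set_b = set(ids_a), set(ids_b)
    union = set_a | set_b
    if not union:
        return 0.0  # neither pipeline returned anything to compare
    return len(set_a & set_b) / len(union)


# Hypothetical file IDs returned by two pipelines for the same query.
model_a_hits = ["file-1", "file-2", "file-3"]
model_b_hits = ["file-2", "file-3", "file-4"]

overlap = jaccard_overlap(model_a_hits, model_b_hits)
only_in_a = sorted(set(model_a_hits) - set(model_b_hits))
only_in_b = sorted(set(model_b_hits) - set(model_a_hits))
print(f"overlap={overlap:.2f}, only A={only_in_a}, only B={only_in_b}")
```

A low overlap does not say which pipeline is better, only that they retrieve different material; the snippets unique to each side are the ones worth reading by hand.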

### Verifying data integrity
The PM needed confirmation that certain sensitive metadata was attached to key documents. They used `list_files` to find the target document and then called `get_file` to inspect the exact metadata fields, confirming that the required values were present.
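
A check like this can be reduced to comparing the returned metadata against a list of required fields. The following sketch assumes the metadata from `get_file` is available as a flat dictionary; the field names are invented for illustration.

```python
def missing_metadata(metadata: dict[str, object], required: list[str]) -> list[str]:
    """Return the required keys that are absent, None, or empty strings."""
    return [key for key in required if metadata.get(key) in (None, "")]


# Hypothetical metadata for one file; real field names depend on your uploads.
file_metadata = {"classification": "confidential", "owner": ""}
required_fields = ["classification", "owner", "retention_period"]

gaps = missing_metadata(file_metadata, required_fields)
print(gaps)
```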

## Benefits

- Test retrieval logic immediately. Use the `run_pipeline` tool to dispatch immediate LLM or Retriever invocations, confirming that your agent pulls accurate context before production deployment.
- Manage scope with isolated environments. The ability to use `list_workspaces` means you can test different projects without worrying about cross-contamination of search data.
- Audit the full pipeline structure. Use `list_pipelines` and `get_pipeline` to inspect NLP topologies, letting you see exactly how embedding nodes and retriever logic are wired up.
- Verify document sources. Before building an agent that cites facts, use `search_documents` to trigger vector searches over your index and confirm the relevance of retrieved data snippets.
- Inspect file context easily. You can run `list_files` then use `get_file` to inspect metadata on individual source documents, proving where the information came from.

## How It Works

You use natural conversation to run complex ML tasks that previously required multiple API calls and environment setup.

1. Subscribe to this MCP and provide your deepset Cloud API URL and key.
2. Your AI client uses these credentials to establish a connection to your cloud account.
3. You interact with the agent naturally, asking it to list workspaces or run specific searches against your connected knowledge base.
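
Most connection failures in step 2 trace back to a malformed URL or an empty key, so it can help to sanity-check the credentials before handing them over. This is a minimal sketch under stated assumptions: the variable names `DEEPSET_API_URL` and `DEEPSET_API_KEY` are hypothetical, and the example URL is a placeholder, not a real endpoint.

```python
from dataclasses import dataclass
from typing import Mapping
from urllib.parse import urlparse


@dataclass(frozen=True)
class DeepsetCredentials:
    api_url: str
    api_key: str


def load_credentials(env: Mapping[str, str]) -> DeepsetCredentials:
    """Validate the API URL and key found in a mapping such as os.environ."""
    # Hypothetical variable names; use whatever your setup actually defines.
    api_url = env.get("DEEPSET_API_URL", "").strip()
    api_key = env.get("DEEPSET_API_KEY", "").strip()
    parsed = urlparse(api_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("API URL must be an absolute https:// URL")
    if not api_key:
        raise ValueError("API key is missing")
    return DeepsetCredentials(api_url=api_url.rstrip("/"), api_key=api_key)


creds = load_credentials(
    {"DEEPSET_API_URL": "https://api.example.invalid/", "DEEPSET_API_KEY": "placeholder-key"}
)
print(creds.api_url)
```

In a real script you would pass `os.environ` instead of a literal dictionary.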

## Frequently Asked Questions

**How do I start testing my RAG pipelines with search_documents?**
You trigger a vector search by asking your agent to execute `search_documents`. This runs dense or sparse searches over all indexed documents, giving you the raw results needed for testing.

**What is the difference between list_pipelines and run_pipeline?**
`list_pipelines` only returns the pipelines available in your account; it doesn't execute anything. `run_pipeline`, however, executes a full search using one of those listed topologies to test its real-world output.

**Can I check the metadata for a single document using get_file?**
Yes. You can use `get_file` and provide the file's path or ID to retrieve the specific metadata attached to that source document.

**Does this MCP help me manage different project contexts?**
Absolutely. You use `list_workspaces` to see all isolated environments, which ensures your agent only searches the documents relevant to the current project or context you're working on.

**How do I use list_files to verify which documents are available before running a search?**
The MCP lets you run `list_files` first. This shows you all the uploaded files in a workspace, letting your agent know exactly what data it can reference before attempting a complex query or building a pipeline.

**If my RAG results are inaccurate, how can I use get_pipeline to debug the underlying logic?**
You run `get_pipeline` to inspect the NLP topology. This lets you verify if the embedding nodes and retriever logic are configured correctly for your specific knowledge base, which is key for accurate results.

**When using list_workspaces, how does the MCP ensure my agent queries an isolated environment?**
The system separates contexts by workspace ID. Your AI client uses this ID to guarantee that when you execute a search or run a pipeline, it only accesses data from the specified, isolated knowledge base.

**What should I do if my attempt to run_pipeline fails?**
The MCP will return specific API error details. Check those details to see whether the failure is due to outdated credentials or whether the pipeline or its source documents need manual adjustment.
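
As a rough illustration of that triage, MCP tool results carry an `isError` flag alongside a list of `content` items. The sketch below reads the text out of such a result and suggests a next step. The wording of the error text is an assumption here; the real messages depend on what the deepset Cloud API returns.

```python
def summarize_tool_result(result: dict) -> tuple[bool, str]:
    """Return (succeeded, combined text) for an MCP tool result dictionary."""
    text = " ".join(
        item.get("text", "")
        for item in result.get("content", [])
        if item.get("type") == "text"
    )
    return (not result.get("isError", False), text)


def suggest_next_step(error_text: str) -> str:
    """Map an error message to a first thing to check (heuristic only)."""
    lowered = error_text.lower()
    if "401" in lowered or "403" in lowered or "unauthorized" in lowered:
        return "Check that the API key is still valid."
    if "404" in lowered or "not found" in lowered:
        return "Check the workspace and pipeline names."
    return "Inspect the pipeline and its indexed files."


# Hypothetical failed result; the message format is assumed for illustration.
failed = {"isError": True, "content": [{"type": "text", "text": "401 Unauthorized"}]}
succeeded, message = summarize_tool_result(failed)
if not succeeded:
    print(suggest_next_step(message))
```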