# watsonx Discovery MCP

> watsonx Discovery connects your AI agent directly to massive, unstructured data collections. This MCP gives you a cognitive search engine that goes beyond keyword matching: it understands natural language and surfaces hidden patterns in documents across your enterprise. Stop wading through complex cloud consoles and just ask questions about your knowledge base.

## Overview
- **Category:** industry-titans
- **Price:** Free
- **Tags:** cognitive-search, nlp, unstructured-data, semantic-search, text-analytics, enterprise-search

## Description

You can connect your AI agent to IBM watsonx Discovery and treat your entire document repository like a single, searchable conversation. Instead of manually running queries or digging through technical console dashboards, you simply chat with the system using natural language. The MCP uses advanced text analytics to read everything, from legal contracts to internal reports, and surface only what you need. You can ask complex questions about multiple documents at once and get actionable answers immediately. This capability is hosted on Vinkius, making it easy for your AI client to access deep enterprise knowledge without needing specialized coding skills. It’s like having a data scientist who lives inside your chat window.

## Tools

### get_document_details
Pulls metadata and status for a single indexed document, showing its technical details.

### get_component_settings
Retrieves the configuration settings and health metrics for all project components.

### list_discovery_collections
Lists every data collection available in your current watsonx Discovery project.

### list_collection_documents
Provides a list of all specific documents contained within a selected data collection.

### list_available_enrichments
Lists every NLP model, such as Sentiment or Entity recognition, configured for your project.

### query_discovery_content
Executes a natural language question or DQL query against a specific data collection.
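
Under the Model Context Protocol, a client invokes any of these tools with a JSON-RPC `tools/call` request. The sketch below builds such a request in Python for `query_discovery_content`. The argument names `collection_id` and `query` are assumptions for illustration; the tool's actual input schema is whatever the server reports through `tools/list`.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC needs a unique id per in-flight request.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request (JSON-RPC 2.0) for one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical argument names: check the schema from `tools/list` first.
request = build_tool_call(
    "query_discovery_content",
    {"collection_id": "col-1", "query": "contract termination clauses"},
)
print(json.dumps(request, indent=2))
```

In normal use your AI client sends this request for you; the sketch only shows what travels over the wire.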

## Prompt Examples

**Prompt:** 
```
List all my Discovery collections.
```

**Response:** 
```
I found 3 collections in your project: 'Legal Documents' (ID: col-1), 'Technical Support KB' (ID: col-2), and 'Marketing Research' (ID: col-3). Which one would you like to query?
```

**Prompt:** 
```
Search the 'Legal Documents' collection for 'contract termination clauses'.
```

**Response:** 
```
I found several matches. The most relevant document mentions that termination requires a 30-day written notice and highlights specific liability limitations. Would you like me to pull the full document text?
```

**Prompt:** 
```
What enrichments are currently active in my project?
```

**Response:** 
```
Your project has 4 active NLP enrichments: 1. Sentiment Analysis, 2. Entity Extraction (People, Places, Organizations), 3. Category Classification, and 4. Keyword Extraction. These are applied to all documents during ingestion.
```

## Capabilities

### Search across collections
You run natural language or DQL queries against any collection in your project, and your agent can repeat a query across several collections to find relevant information.

### Map project contents
The MCP lists all available data collections and the specific documents within them, helping you understand your scope.

### Analyze document structure
You retrieve technical metadata for single indexed files, checking ingestion status or identifying key details.

### Check data quality and health
The system verifies project component configurations and monitors the overall health of your discovery environment.

### Review applied intelligence models
You list all NLP enrichments, like Sentiment or Entity extraction, to see what type of analysis is running on your data.

## Use Cases

### Finding obscure contract details
A legal analyst needs to know every document mentioning 'indemnification clause' across three different collections. Their agent runs `query_discovery_content` against each collection in turn and consolidates the results from all relevant data sets into a single summary they can read immediately.
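
Because `query_discovery_content` targets one collection per call, a search like this is effectively a fan-out followed by a merge. The following is a minimal sketch of that pattern, not the MCP's actual implementation. The `call_tool` callable, the argument names, and the `title` and `score` fields in each hit are all assumptions; `fake_call_tool` stands in for a real MCP client so the example runs on its own.

```python
from typing import Callable

# A tool caller takes (tool_name, arguments) and returns a list of hits.
ToolCaller = Callable[[str, dict], list]


def query_collections(call_tool: ToolCaller, collection_ids: list,
                      question: str, limit: int = 5) -> list:
    """Query each collection in turn and merge the hits into one ranked list."""
    merged = []
    for collection_id in collection_ids:
        hits = call_tool(
            "query_discovery_content",
            {"collection_id": collection_id, "query": question},  # assumed names
        )
        # Tag every hit with its source so the summary can cite it.
        merged.extend({**hit, "collection_id": collection_id} for hit in hits)
    # Highest score first; hits without a score are treated as 0.0.
    merged.sort(key=lambda hit: hit.get("score", 0.0), reverse=True)
    return merged[:limit]


def fake_call_tool(name: str, arguments: dict) -> list:
    """Stand-in for a real MCP client, returning canned hits per collection."""
    canned = {
        "col-1": [{"title": "Master Services Agreement", "score": 0.91}],
        "col-2": [],
        "col-3": [{"title": "Vendor Survey", "score": 0.42}],
    }
    return canned[arguments["collection_id"]]


results = query_collections(
    fake_call_tool, ["col-1", "col-2", "col-3"], "indemnification clause"
)
for hit in results:
    print(hit["collection_id"], hit["title"], hit["score"])
```

Relevance scores from different collections may not be directly comparable, so treat the merged ranking as a starting point for review, not a definitive order.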

### Onboarding new team members
A new developer needs to know what data sources are available for their project. They simply call `list_discovery_collections`, instantly receiving an inventory of all possible knowledge bases and where to start querying.

### Troubleshooting failed pipelines
The data science team notices some documents aren't indexing correctly. They use `get_component_settings` to check the system health, immediately pinpointing which component is failing before having to manually investigate logs.

### Checking document readiness
A product manager needs to confirm if a specific policy document was successfully indexed. They use `get_document_details` and instantly get the full metadata and ingestion status, confirming it's ready for search.
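
If you script this check, the interesting step is turning the returned status into a yes-or-no answer. The sketch below assumes the `get_document_details` result is a dictionary with a `status` field whose values include `available`, `pending`, `processing`, and `failed`. Both the field name and the values are assumptions about the payload, so adjust them to what your project actually returns.

```python
def readiness(details: dict) -> str:
    """Map a document-details payload to a coarse readiness label."""
    status = str(details.get("status", "")).strip().lower()
    if status == "available":
        return "ready"
    if status in {"pending", "processing"}:
        return "in progress"
    if status == "failed":
        return "failed"
    # Missing or unrecognized status: do not guess.
    return "unknown"


# Hypothetical payloads, shaped the way this sketch assumes.
print(readiness({"document_id": "doc-42", "status": "available"}))
print(readiness({"document_id": "doc-43", "status": "processing"}))
print(readiness({"document_id": "doc-44"}))
```

Returning `unknown` for anything unrecognized keeps the check from reporting a document as searchable when the payload does not say so.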

## Benefits

- Find answers instantly. Instead of manually building complex queries, you simply ask your agent a question like, 'What are the termination requirements for contract X?' and get the answer directly from the data using `query_discovery_content`.
- Audit your data source easily. Use `list_available_enrichments` to see exactly what kinds of analysis (like keyword extraction) are configured for your project, so you no longer have to guess which enrichments your documents receive.
- Track project health without the dashboard. Call `get_component_settings` to see the configuration and health metrics of your project components, so you can spot an ingestion pipeline that failed or needs attention and save hours of dashboard clicking.
- Understand your entire scope. Start with `list_discovery_collections` to map out every data set available. This gives you a clear view of everything the system can search before you write a single query.
- Verify document integrity. If you need to know the status or metadata for one file, use `get_document_details`. It's a quick way to check if a specific record is ready and indexed correctly.

## How It Works

The bottom line is that you never have to leave your chat window to analyze deep, enterprise-level document data.

1. Subscribe to this MCP and provide your watsonx URL, API Key, and Project ID.
2. Your AI agent uses the credentials to connect and verify access to your cognitive data collections.
3. You ask a complex question in plain language; the system runs the query and returns targeted answers drawn from your documents.
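
Step 1 asks for three values, and a blank or malformed one only shows up as a failed connection in step 2. A small pre-check like the sketch below can catch that earlier. The class and field names are illustrative, and the `https` requirement is an assumption, not a documented rule of this MCP.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class DiscoveryCredentials:
    """The three values requested at subscription time (names are illustrative)."""
    url: str
    api_key: str
    project_id: str

    def problems(self) -> list:
        """Return human-readable issues; an empty list means the values look sane."""
        issues = []
        parsed = urlparse(self.url)
        if parsed.scheme != "https" or not parsed.netloc:
            issues.append("url must be an absolute https URL")
        if not self.api_key.strip():
            issues.append("api_key is empty")
        if not self.project_id.strip():
            issues.append("project_id is empty")
        return issues


# Placeholder values only; never hard-code a real API key.
creds = DiscoveryCredentials(
    url="https://discovery.example.com/instances/demo",
    api_key="not-a-real-key",
    project_id="demo-project",
)
print(creds.problems())

bad = DiscoveryCredentials(url="discovery.example.com", api_key=" ", project_id="demo-project")
print(bad.problems())
```

This only checks the shape of the values. Whether they actually grant access is still decided when the agent connects in step 2.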

## Frequently Asked Questions

**How do I start searching with the watsonx Discovery MCP?**
You first need to provide your specific watsonx credentials and project ID. Once connected, you can use `query_discovery_content` by simply asking a natural language question.

**How do I find out which data sources are available through the watsonx Discovery MCP?**
Use the `list_discovery_collections` tool. It gives you an immediate inventory of every collection in your project so you can plan your query.

**Can this MCP help me check if a document is ready to be searched?**
Absolutely. Use `get_document_details` on the specific file ID. This tool retrieves the metadata and ingestion status, confirming it's indexed and available for querying.

**Does watsonx Discovery help with NLP analysis? Which tools are involved?**
Yes. The MCP lists active enrichments using `list_available_enrichments`. This tells you which enrichments, such as Sentiment or Entity recognition, are configured for your project, which helps you decide what you can query for.

**I have multiple documents. Can I search them all at once with watsonx Discovery?**
Yes. By using `query_discovery_content`, you can write a prompt that directs the agent to look across several collections simultaneously, consolidating the findings.