# Exa AI MCP

> Exa AI connects your agent to advanced web discovery, letting you search the internet by meaning rather than keywords. Use this tool to find websites similar to a competitor's, pull clean text content from results, and get detailed API usage stats, all without opening a browser.

## Overview
- **Category:** artificial-intelligence
- **Price:** Free
- **Tags:** semantic-search, web-discovery, information-retrieval, content-analysis, url-matching

## Description

This connector lets your agent act like a real-time web analyst. Instead of building multi-step keyword queries that miss relevant pages, you ask your AI client a question, and it searches the web based on what the words *mean*. You can get search results for specific topics, find sites structurally similar to any URL you provide, and pull clean text from those pages so your agent works with readable content rather than bare links.

All of these actions run on Vinkius, where credentials pass through a zero-trust proxy: your API keys are used only in transit and are never stored on disk. You can use this MCP alongside other services, such as a CRM or messaging platform, to build automations that span multiple parts of your business.

## Tools

### find_similar
Finds websites that share characteristics with a specific URL you provide.

### get_api_usage
Retrieves your current API usage and detailed crawl statistics for budget control.

### get_contents
Pulls the cleaned HTML or text content associated with a specific search result ID.

### get_search_links
Gets only the top links for a given search query, skipping content retrieval.

### search_web
Performs a general semantic web search using Exa's understanding of language.

### search_with_contents
Searches the web and simultaneously retrieves the full, clean text content for the best results.
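
MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests, and most AI clients build these for you. As a rough illustration of what such a request could look like for the tools above, here is a minimal sketch. The argument names (`url`, `query`) are assumptions for illustration only; check the tool schemas your client reports for the real parameter names.

```python
import json
from itertools import count

# Simple incrementing request IDs, as JSON-RPC requires a unique id per request.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP JSON-RPC 2.0 `tools/call` request for the given tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# NOTE: "url" and "query" are assumed argument names, not the documented schema.
similar_request = build_tool_call("find_similar", {"url": "https://techcrunch.com"})
search_request = build_tool_call("search_web", {"query": "future of AI robotics"})

print(json.dumps(similar_request, indent=2))
```

In normal use you never write these payloads by hand; the sketch only shows how a tool name and its arguments travel together in one request.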

## Prompt Examples

**Prompt:** 
```
Find 5 high-quality research papers about 'Climate Change mitigation' using Exa.
```

**Response:** 
```
I've retrieved 5 highly relevant research papers. Notable results include studies on carbon capture and renewable energy adoption. Would you like the links and summaries for each?
```

**Prompt:** 
```
Find sites similar to https://techcrunch.com.
```

**Response:** 
```
I've found 10 sites similar to TechCrunch. Top matches include VentureBeat, The Verge, and Wired. I can provide the relevancy scores for each if you'd like.
```

**Prompt:** 
```
Search for 'future of AI robotics' and give me the full content of the best result.
```

**Response:** 
```
Search complete! The top result is a detailed analysis from MIT. I've retrieved the cleaned content, which discusses embodied AI and humanoid developments. Would you like a summary?
```

## Capabilities

### Find Site Neighbors
Identify websites similar to any given URL, helping you map out competitive ecosystems or related industry sources.

### Extract Page Content
Pull clean HTML and text from specific search results so your agent doesn't just get links, but the actual readable data.

### Semantic Web Querying
Search the web using natural language understanding, retrieving information based on meaning rather than exact keywords.

### Batch Search and Content Retrieval
Perform a search query and pull the full content of the top results in one single request for efficiency.

### View Usage Metrics
Get an immediate report on your API usage and crawl statistics to manage your research budget.

## Use Cases

### Mapping a New Market Niche
A market analyst wants to see who's competing with a new SaaS product. They use `find_similar` on the primary competitor's URL, then run `search_web` for common industry terms. The agent compiles a list of related sources and potential rivals.
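
The workflow above can be sketched as a small function that merges both result sets and drops duplicate URLs. This is an illustration under assumptions: `call_tool` is a hypothetical stand-in for however your client invokes MCP tools, and the result shape (a list of dicts with `url` and `title`) and the argument names are not taken from the connector's documentation.

```python
from typing import Callable

# Hypothetical signature: (tool name, arguments) -> list of result dicts.
ToolCaller = Callable[[str, dict], list]


def map_market_niche(call_tool: ToolCaller, competitor_url: str, industry_terms: list) -> list:
    """Combine find_similar and search_web results, keeping the first hit per URL."""
    seen = {}
    for result in call_tool("find_similar", {"url": competitor_url}):
        seen.setdefault(result["url"], result)
    for term in industry_terms:
        for result in call_tool("search_web", {"query": term}):
            seen.setdefault(result["url"], result)
    return list(seen.values())


def fake_call_tool(name: str, arguments: dict) -> list:
    """Canned responses so the sketch runs without network access."""
    canned = {
        "find_similar": [{"url": "https://example.com/a", "title": "Rival A"}],
        "search_web": [
            {"url": "https://example.com/a", "title": "Rival A"},
            {"url": "https://example.com/b", "title": "Rival B"},
        ],
    }
    return canned[name]


sources = map_market_niche(
    fake_call_tool, "https://example.com/competitor", ["project management SaaS"]
)
print([s["url"] for s in sources])
# ['https://example.com/a', 'https://example.com/b']
```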

### Analyzing Technical Whitepapers
A developer needs to compare technical claims across three different papers. They run `search_with_contents` on the topic to get clean text for the top results in one request, use `get_contents` to pull any additional result by its ID, and then compare the text blocks side by side.

### Quick Competitive Benchmarking
A content strategist needs quick links to all sources mentioning 'AI ethics' in a specific region. They use `get_search_links` with the query, then pass those top 10 links into an agent for summarizing.

### Auditing Content Sources
An operations lead needs to verify that all recorded data points come from authoritative sources. They call `get_api_usage` first to review current usage and crawl statistics, then run targeted searches to check each source.

## Benefits

- Stop clicking through pages. By using `search_with_contents`, your agent gets the clean text and full context in one go, cutting out manual reading time.
- Need to know a competitor's playbook? Use `find_similar` to map out related sites instantly. It’s like getting a whole industry cluster view from one API call.
- Never run out of budget without knowing it. The `get_api_usage` tool gives you full visibility into your crawl statistics, so you control the research spend.
- Efficiency matters. `search_web` handles a semantic query in one call, and `search_with_contents` adds the page text to the same request, so you don't need to chain separate endpoints.
- Targeted data retrieval is key. If you just want links and nothing else, use `get_search_links`. It keeps your request light and fast.

## How It Works

You stop browsing manually and ask your AI client to do the web research for you.

1. First, specify the target. Tell your agent whether you need a general web search or if you want content retrieved alongside the results.
2. Next, execute the request using one of the specialized tools. The connector handles the semantic query and data extraction process in the background.
3. You get back structured data (top links, clean text blocks, or usage stats) ready for your agent to analyze immediately.
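
Step 1 amounts to picking a tool based on what you need back. Your AI client normally makes this choice for you; the sketch below only makes the decision explicit, using the tool names from this page. The function name and its flags are hypothetical.

```python
def choose_tool(need_contents: bool, links_only: bool = False) -> str:
    """Pick a search tool name based on what the caller wants returned."""
    if need_contents and links_only:
        raise ValueError("need_contents and links_only are mutually exclusive")
    if need_contents:
        # Search and fetch clean page text in a single request.
        return "search_with_contents"
    if links_only:
        # Lightest option: top links only, no content retrieval.
        return "get_search_links"
    # General semantic search.
    return "search_web"


print(choose_tool(need_contents=True))
print(choose_tool(need_contents=False, links_only=True))
```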

## Frequently Asked Questions

**How do I use search_web vs search_with_contents?**
Use `search_web` when search results alone are enough and you don't need the page text (if you want nothing but links, `get_search_links` is lighter still). Use `search_with_contents` when you need clean text from the top results in the same request.

**What is find_similar?**
`find_similar` lets your agent identify websites that are structurally or thematically related to a URL you provide. This is great for competitor mapping and finding adjacent market players.

**Can I check my spending with get_api_usage?**
Yes, calling `get_api_usage` gives you an instant report on your API usage and crawl statistics. This is the tool to use when managing research budgets for a project.

**What does get_contents do?**
`get_contents` pulls clean HTML/text from specific search results using their IDs. It's useful after you have already identified promising links and need to pull the data in bulk.

**What credentials do I need before running a search with `search_web`?**
You must provide your Exa API key. Vinkius handles secure credential passing through its zero-trust proxy. Your keys are used only in transit and never stored on disk.

**Does the tool `get_search_links` return full content or just links?**
It provides only the top links for your query. This is useful when you need quick URLs to monitor many sources without spending tokens retrieving entire page contents first.

**What happens if I exceed rate limits when using `search_with_contents`?**
The MCP will automatically handle the throttling and return a structured error code. Your agent receives clear feedback, allowing it to pause or retry the task without failing completely.
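
If your own code sits between the agent and the connector, a pause-and-retry loop might look like the sketch below. The error shape (`{"error": {"code": "rate_limited"}}`) and the code value are assumptions made for illustration; inspect the actual error your client receives before relying on them.

```python
import time

# Assumed error code; the connector's real value may differ.
RATE_LIMIT_CODE = "rate_limited"


def call_with_retry(call, max_attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep) -> dict:
    """Retry `call` with exponential backoff while it reports a rate-limit error."""
    response = {}
    for attempt in range(max_attempts):
        response = call()
        error = response.get("error")
        if not error or error.get("code") != RATE_LIMIT_CODE:
            return response
        if attempt < max_attempts - 1:
            # Wait 1x, 2x, 4x... the base delay before the next attempt.
            sleep(base_delay * 2 ** attempt)
    # Still rate-limited after the final attempt: hand the error back to the caller.
    return response


# Demo with a fake tool call that is throttled twice, then succeeds.
_responses = iter([
    {"error": {"code": RATE_LIMIT_CODE}},
    {"error": {"code": RATE_LIMIT_CODE}},
    {"results": ["ok"]},
])
delays = []
result = call_with_retry(lambda: next(_responses), sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the demo instant and makes the backoff schedule easy to inspect.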

**What if I use `find_similar` with an invalid URL?**
The tool validates the input immediately. If the provided URL is inaccessible or malformed, it returns a specific error message rather than attempting to process unusable data.
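
You can also catch obviously malformed URLs before calling `find_similar`, which saves a round trip. The helper below is a hypothetical client-side check using Python's standard library; it does not reproduce the connector's own validation and cannot tell whether a site is reachable.

```python
from urllib.parse import urlparse


def is_well_formed_url(url: str) -> bool:
    """Return True if `url` has an http(s) scheme and a host part."""
    try:
        parsed = urlparse(url.strip())
    except ValueError:
        # urlparse raises ValueError for some malformed inputs, e.g. bad IPv6 hosts.
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


for candidate in ["https://techcrunch.com", "techcrunch.com", "https://"]:
    print(candidate, is_well_formed_url(candidate))
```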