# Open WebUI MCP

> Open WebUI gives your AI agent full control over local and cloud large language models. List available models, manage knowledge bases by uploading files or processing websites, and run controlled chat sessions—all through natural conversation.

## Overview
- **Category:** loved-by-devs
- **Price:** Free
- **Tags:** llm-management, rag, model-inference, self-hosted, chat-interface, automation

## Description

This MCP connects your Open WebUI instance to any AI client, letting you handle complex LLM tasks without needing the command line. Instead of jumping between model APIs and document storage systems, you tell your agent what you need done. You can check which models are available (Ollama, OpenAI, etc.), upload documents or paste web URLs to build a knowledge collection, and then use those resources for chat completions. It’s about treating your LLM infrastructure like another service endpoint. If you're building an internal toolchain, Vinkius makes it simple to connect this control layer to any agent in the catalog, giving you comprehensive oversight of both local and cloud model performance.

## Tools

### add_file_to_collection
Adds a specified file to an existing knowledge collection so it can be used for retrieval.

### chat_completed
Runs specific filters or processing steps after a chat conversation has finished.

### chat_completions
Generates responses based on prompts using an OpenAI-compatible standard chat completion endpoint.

### create_new_chat
Initiates a new, structured chat session that is controlled by the backend flow for tracking and history.

### get_file_status
Checks the current status of file processing to ensure documents are indexed and ready for querying.

### list_models
Retrieves a list of all available language models configured in your Open WebUI instance.

### ollama_embed
Generates vector embeddings for text using the local Ollama API embedding function.

### ollama_generate
Requests a completion response directly from a specified model running on the local Ollama API.

### ollama_tags
Lists all models currently available and tagged within your local Ollama environment.

### process_web_url
Scrapes content from a given web URL, extracts the text, and indexes it into a knowledge collection.

### send_message
Generates messages using an Anthropic-compatible message generation standard.

### upload_file
Uploads a file, extracts its content, and stores it in the vector database for future retrieval.

## Prompt Examples

**Prompt:** 
```
List all models available in my Open WebUI instance.
```

**Response:** 
```
I've retrieved the models. You have access to 'llama3:latest', 'gpt-4o', and several custom Open WebUI functions. Would you like to use one of them for a chat completion?
```

**Prompt:** 
```
Process the URL 'https://docs.openwebui.com/' into my 'Documentation' collection.
```

**Response:** 
```
I have started processing the URL. The content is being scraped and indexed into the 'Documentation' collection. You can now ask questions based on this data.
```

**Prompt:** 
```
Generate a response using the 'llama3' model for the prompt 'Explain quantum computing'.
```

**Response:** 
```
Using the `ollama_generate` tool with 'llama3': Quantum computing is a type of computing that uses quantum-mechanical phenomena... Would you like more details?
```

## Capabilities

### Inventory Available Models
Retrieve a list of all connected language models, including those running locally via Ollama.

### Build Knowledge Collections
Ingest new information by uploading files or processing web URLs and organizing them into searchable collections for RAG context.

### Manage Chat Sessions
Start, manage, and complete controlled chat conversations using standard OpenAI/Anthropic compatible endpoints.

### Execute Local Inference Tasks
Directly interact with the Ollama API to generate completions or create embeddings for local model testing.

### Monitor Data Status
Check the status of uploaded documents and web content to confirm when they are ready for use in your knowledge base.

## Use Cases

### Updating the Company Handbook
A Knowledge Manager needs to update internal FAQs. Instead of manually downloading PDFs, they simply ask their agent to process a batch of new web URLs and use `add_file_to_collection` to add them all to the 'HR Policies' knowledge base.

### Benchmarking Local LLMs
An AI Engineer needs to compare Llama 3 vs. Mistral on a specific query. They use `ollama_generate` for both models side-by-side, allowing them to programmatically benchmark performance without manual API calls.

### Handling Live Customer Feedback
A support team wants to analyze recent blog posts. They ask the agent to process a live URL and then use `chat_completions` against that newly indexed data, getting instant insights into customer pain points.

### Debugging Chat Flows
A developer needs to ensure their complex multi-step chat flow works. They use `create_new_chat` and monitor the session via `send_message` to verify that context is passed correctly between steps.

## Benefits

- Gain full model visibility by using `list_models` to see every available LLM endpoint, whether it’s running on Ollama or a cloud provider.
- Build powerful knowledge bases by letting the agent process web content via `process_web_url`, automatically indexing external data into your collections.
- Run controlled chat sessions and full conversation flows using `create_new_chat` to ensure proper history tracking and context management.
- Test local inference directly with Ollama tools. You can use `ollama_generate` or `ollama_embed` to validate local model performance instantly.
- Keep your RAG pipelines moving by checking file readiness using `get_file_status`, ensuring no time is wasted waiting on document ingestion.

## How It Works

The bottom line is you gain a single, conversational interface to manage complex model and data pipelines.

1. Subscribe to this MCP on Vinkius and provide your Open WebUI Base URL and API Key.
2. Your AI client connects, giving it the necessary permissions to read model lists and manage files in your Open WebUI backend.
3. You request a specific action—like adding a document or starting a chat—and the agent executes it through the MCP's exposed tools.

## Frequently Asked Questions

**How do I check if my uploaded files are ready for use with Open WebUI MCP?**
You use the `get_file_status` tool. This function checks the processing status of your documents, letting you know exactly when the data is fully indexed and available for retrieval.

**Can I list models running on Ollama using Open WebUI MCP?**
Yes, use `ollama_tags` to retrieve a list of all currently tagged and available models in your local Ollama environment. This confirms which specific models you can generate completions with.

**Is this MCP only for OpenAI-style chats?**
No, it supports multiple standards. You can use `chat_completions` for OpenAI compatibility or `send_message` if your workflow requires Anthropic's specific message generation format.

**What is the best way to get new knowledge into my collection using Open WebUI MCP?**
For web content, use `process_web_url`. If you have physical files like PDFs or TXT documents, it's better to use `upload_file` first.

**How do I start a structured conversation flow with Open WebUI MCP?**
Use the `create_new_chat` tool. This initiates a backend-controlled chat session, which is ideal for multi-step processes where history and context need strict management.