Open WebUI MCP. Manage Models, Knowledge, and Chats from Anywhere
Open WebUI gives your AI agent full control over local and cloud large language models. List available models, manage knowledge bases by uploading files or processing websites, and run controlled chat sessions—all through natural conversation.
Give Claude and any AI agent real-world access
Retrieve a list of all connected language models, including those running locally via Ollama.
Ingest new information by uploading files or processing web URLs and organizing them into searchable collections for RAG context.
Start, manage, and complete controlled chat conversations using standard OpenAI/Anthropic compatible endpoints.
Directly interact with the Ollama API to generate completions or create embeddings for local model testing.
Check the status of uploaded documents and web content to confirm when they are ready for use in your knowledge base.
Ask an AI about this
Waiting for input…
What AI agents can do with Open WebUI: 12 Tools for LLM Management
These tools give you granular control over model listing, file uploading, web scraping, and structured chat management within Open WebUI.
Make your AI actually useful.
Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.
Start using Open WebUI MCPAdd File To Collection
Adds a specified file to an existing knowledge collection so it can be used for retrieval.
Chat Completed
Runs specific filters or processing steps after a chat conversation has finished.
Chat Completions
Generates responses based on prompts using an OpenAI-compatible standard chat...
Create New Chat
Initiates a new, structured chat session that is controlled by the backend flow for...
Get File Status
Checks the current status of file processing to ensure documents are indexed and...
List Models
Retrieves a list of all available language models configured in your Open WebUI instance.
Ollama Embed
Generates vector embeddings for text using the local Ollama API embedding function.
Ollama Generate
Requests a completion response directly from a specified model running on the local...
Ollama Tags
Lists all models currently available and tagged within your local Ollama environment.
Process Web Url
Scrapes content from a given web URL, extracts the text, and indexes it into a...
Send Message
Generates messages using an Anthropic-compatible message generation standard.
Upload File
Uploads a file, extracts its content, and stores it in the vector database for future retrieval.
Security and governance baked right in.
Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, delivering out-of-the-box support for HTTPS Streamable, SSE, and OAuth2—zero messy routing required.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on each call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with Open WebUI, then connect any of our 5,200+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 5,200+ others, all in one place
- Add new capabilities to your AI anytime you want
- Connections are secured and governed automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog weekly
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Open WebUI. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS CLOUD
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on each call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
Dealing with disconnected LLM services is exhausting.
Today, managing a knowledge base means jumping through hoops. You upload a PDF to one service, process a website URL in another, and then you have to manually verify the status of each piece before you can even begin asking questions using your chat bot. It's copy-pasting URLs into dedicated ingestion tabs, waiting for separate background jobs to finish, and constantly switching between model API dashboards.
With this MCP, that manual process collapses into a single conversation thread. Your agent handles the messy backend work—from uploading files with `upload_file` to processing web content via `process_web_url`. You just tell it: 'Use all of this data,' and you get actionable answers instantly.
Open WebUI MCP: Centralized Model & Data Control
You eliminate the need to manually track which models are available or if your document ingestion jobs actually finished. Instead of checking multiple dashboards for model status, you use `list_models` and then validate content readiness with `get_file_status`—all in one flow.
The difference is control. You stop reacting to siloed system alerts and start giving direct, guided commands to manage your entire LLM stack.
What Open WebUI MCP does for your AI
This MCP connects your Open WebUI instance to any AI client, letting you handle complex LLM tasks without needing the command line. Instead of jumping between model APIs and document storage systems, you tell your agent what you need done. You can check which models are available (Ollama, OpenAI, etc.), upload documents or paste web URLs to build a knowledge collection, and then use those resources for chat completions.
It’s about treating your LLM infrastructure like another service endpoint. If you're building an internal toolchain, Vinkius makes it simple to connect this control layer to any agent in the catalog, giving you comprehensive oversight of both local and cloud model performance.
019e38cc-df7c-7157-9766-18c9eb569b67 How to set up Open WebUI MCP
The bottom line is you gain a single, conversational interface to manage complex model and data pipelines.
Subscribe to this MCP on Vinkius and provide your Open WebUI Base URL and API Key.
Your AI client connects, giving it the necessary permissions to read model lists and manage files in your Open WebUI backend.
You request a specific action—like adding a document or starting a chat—and the agent executes it through the MCP's exposed tools.
Who uses Open WebUI MCP
AI Engineers who need to test dozens of model combinations. Knowledge Managers stuck manually feeding documentation into chat bots. DevOps teams monitoring local LLM deployments.
Runs tests comparing different models, managing the lifecycle of RAG collections, and debugging complex prompt chains without leaving their IDE.
Needs to quickly ingest corporate documentation or large sets of web articles into a centralized knowledge base for team-wide Q&A.
Monitors the availability and performance of self-hosted Ollama instances, ensuring model tags are correct across development environments.
Benefits of connecting Open WebUI MCP
Gain full model visibility by using list_models to see every available LLM endpoint, whether it’s running on Ollama or a cloud provider.
Build powerful knowledge bases by letting the agent process web content via process_web_url, automatically indexing external data into your collections.
Run controlled chat sessions and full conversation flows using create_new_chat to ensure proper history tracking and context management.
Test local inference directly with Ollama tools. You can use ollama_generate or ollama_embed to validate local model performance instantly.
Keep your RAG pipelines moving by checking file readiness using get_file_status, ensuring no time is wasted waiting on document ingestion.
Open WebUI MCP use cases
Updating the Company Handbook
A Knowledge Manager needs to update internal FAQs. Instead of manually downloading PDFs, they simply ask their agent to process a batch of new web URLs and use add_file_to_collection to add them all to the 'HR Policies' knowledge base.
Benchmarking Local LLMs
An AI Engineer needs to compare Llama 3 vs. Mistral on a specific query. They use ollama_generate for both models side-by-side, allowing them to programmatically benchmark performance without manual API calls.
Handling Live Customer Feedback
A support team wants to analyze recent blog posts. They ask the agent to process a live URL and then use chat_completions against that newly indexed data, getting instant insights into customer pain points.
Debugging Chat Flows
A developer needs to ensure their complex multi-step chat flow works. They use create_new_chat and monitor the session via send_message to verify that context is passed correctly between steps.
Open WebUI MCP tradeoffs
What to watch out for, and the recommended way to handle each one.
Assuming model availability
The user tries to generate content using a specific model name, but doesn't know if it was properly loaded or tagged in the local Ollama environment.
First, run list_models and then check available tags using ollama_tags. This confirms your agent has access to the correct endpoint before attempting generation via ollama_generate.
Treating files as instant context
The user uploads a massive document and immediately asks questions about it, but the system hasn't finished indexing or processing the file.
After uploading content with upload_file, always check the ingestion status first. Use get_file_status to confirm the data is fully processed before querying.
Mixing API standards
Trying to mix proprietary chat endpoints with standard OpenAI calls without proper context management.
When working on structured conversations, always start a fresh session using create_new_chat. This ensures the agent manages the message history correctly, regardless of whether you use send_message or chat_completions.
When to use Open WebUI MCP
Use this MCP if your primary job involves managing and connecting multiple LLM sources—local Ollama instances alongside cloud endpoints like OpenAI or Anthropic. It's the control layer for complex RAG systems, making sure data gets indexed before it's asked about. Don't use this if you just need a single chat interface; then, a basic messaging MCP is enough. Also, don't use it if your primary task is simply writing code—you'll want an IDE-focused tool instead. This MCP excels when the job requires coordinating data ingestion (files/URLs), model selection (list_models), and structured output generation.
Frequently asked questions about Open WebUI MCP
How do I check if my uploaded files are ready for use with Open WebUI MCP? +
You use the get_file_status tool. This function checks the processing status of your documents, letting you know exactly when the data is fully indexed and available for retrieval.
Can I list models running on Ollama using Open WebUI MCP? +
Yes, use ollama_tags to retrieve a list of all currently tagged and available models in your local Ollama environment. This confirms which specific models you can generate completions with.
Is this MCP only for OpenAI-style chats? +
No, it supports multiple standards. You can use chat_completions for OpenAI compatibility or send_message if your workflow requires Anthropic's specific message generation format.
What is the best way to get new knowledge into my collection using Open WebUI MCP? +
For web content, use process_web_url. If you have physical files like PDFs or TXT documents, it's better to use upload_file first.
How do I start a structured conversation flow with Open WebUI MCP? +
Use the create_new_chat tool. This initiates a backend-controlled chat session, which is ideal for multi-step processes where history and context need strict management.