# LocalAI MCP

> LocalAI lets you run powerful AI models—including text chat, image generation, audio transcription, and face analysis—entirely on your own hardware. It provides a standard API endpoint compatible with OpenAI and Anthropic protocols, letting any client connect to private local models without sending sensitive data to the cloud.

## Overview
- **Category:** ai-frontier
- **Price:** Free
- **Tags:** self-hosted, llm-inference, image-generation, audio-processing, openai-compatible, local-models

## Description

This MCP lets you bring advanced artificial intelligence capabilities right into your local environment. Instead of relying on third-party services for every single task, you can run powerful multimodal models directly from your own infrastructure. This means keeping all your sensitive data private while still accessing top-tier AI performance.

Whether you need to generate complex images from text prompts, convert recorded speech into searchable text, or analyze faces for identity verification, this connector handles it locally. You connect your preferred agent through Vinkius and gain access to a comprehensive set of tools that span everything from basic chat completions using `chat_completions` to advanced functions like generating vector embeddings with `create_embeddings`. It's about giving you full control over where the AI processing happens, ensuring speed and privacy are always priorities.

## Tools

### anthropic_messages
Generates multi-turn chat messages using local models compatible with Anthropic’s API structure.

### apply_model
Installs a new AI language or media model from the available gallery.

### chat_completions
Generates conversational text responses using local models compatible with OpenAI’s API structure.

### create_embeddings
Converts blocks of text into numerical vector embeddings for advanced search and indexing.

### detect_objects
Scans an image and returns a list of identified objects along with their locations.

### face_analyze
Provides demographic or characteristic analysis on human faces found in images.

### face_identify
Compares a face to previously registered individuals to determine who the person is (1:N comparison).

### face_register
Enrolls and securely stores a new individual's facial data for future identification.

### face_verify
Confirms if an unknown face matches a known identity by comparing it one-to-one.

### generate_image
Creates entirely new visual content based on your text prompts, supporting negative prompts to filter out undesirable elements.

### get_auth_status
Checks the current authentication status and lists available identity providers.

### get_auth_usage
Displays usage metrics for personal API tokens or access keys.

### get_system_info
Retrieves general operational details and backend information about the local AI instance.

### get_version
Returns the specific version number of the LocalAI software running on the infrastructure.

### list_models
Retrieves a list of all AI models that are currently installed and ready for use by your agent.

### open_responses
Generates open-ended, unstructured text responses when specific chat protocols aren't required.

### rerank_documents
Refines search results by reordering documents based on how closely they relate to your specific query.

### text_to_speech
Converts plain text into an audio file using high-quality synthetic voice generation (TTS).

### transcribe_audio
Transcribes recorded speech files or paths, converting the spoken word back into editable text.

## Prompt Examples

**Prompt:** 
```
List all models available on my LocalAI instance.
```

**Response:** 
```
I've retrieved the list of models. You have 'llama-3-8b', 'stablediffusion', and 'whisper-1' currently active and ready for use.
```

**Prompt:** 
```
Generate a chat response using the 'llama-3' model about the benefits of local AI.
```

**Response:** 
```
Using the 'llama-3' model: Local AI offers significant benefits including enhanced data privacy, reduced latency, and the ability to operate without an internet connection...
```

**Prompt:** 
```
Create an image of a futuristic library using the 'stablediffusion' model.
```

**Response:** 
```
I've initiated the image generation for a 'futuristic library'. The process is complete, and you can view the generated image at the provided local URL.
```

## Capabilities

### Run Chat and Text Generation
You generate text responses for chat or completions using local language models that support both OpenAI and Anthropic standards.

### Create Visual Media
You prompt the system to synthesize unique images from scratch, even allowing you to define negative prompts to exclude unwanted elements.

### Process Audio Files
You convert spoken audio into written text using transcription or generate natural-sounding speech files from plain text.

### Identify and Analyze Faces
You verify a person's identity by comparing faces one-to-one, enroll new individuals, or detect objects within an image for analysis.

### Improve Data Retrieval
You generate vector embeddings to index text and use those vectors to improve search results based on a specific query.

## Use Cases

### Compliance Auditing for Biometrics
An HR department needs a tool that verifies employee identities using photos taken at different sites. Instead of sending images offsite, they connect the MCP and use `face_verify` to perform 1:1 biometric checks entirely within their private network.

### Creating Localized Marketing Assets
A marketing team needs dozens of unique product mockups for a campaign. They send a text description to the agent, which then uses `generate_image` to output high-res visuals without incurring massive cloud API costs.

### Building Internal Call Summaries
A sales team records client calls on internal VoIP systems. They connect the MCP and use `transcribe_audio` immediately, then pass the resulting text to `chat_completions` to generate structured follow-up summaries for CRM entry.

### Improving Knowledge Base Search
A legal firm has thousands of documents. Instead of just searching by keyword, they use `create_embeddings` across their entire corpus and then employ `rerank_documents` to ensure the agent retrieves the single most contextually relevant passage for a query.

## Benefits

- Data Privacy: By running everything locally, you eliminate the risk of sending proprietary or sensitive data to any third-party cloud vendor. This is non-negotiable for compliance and internal tools.
- Control Over Models: You maintain full control over which AI model runs your workflows. Need to test a new open-source LLM? Just apply it locally with `apply_model` and start using it immediately.
- Full Media Pipeline: This MCP covers the whole stack. Generate images with `generate_image`, transcribe audio with `transcribe_audio`, and then convert summaries back into voice using `text_to_speech`—all without an internet dependency.
- Advanced Search: Go beyond basic keyword searches. Use `create_embeddings` to index your documents, and then use `rerank_documents` to guarantee the most contextually relevant answers for RAG workflows.
- Biometric Capabilities: Handle identity management securely. You can run specific tools like `face_register` or `face_verify` to process sensitive biometric data entirely on private hardware.

## How It Works

The bottom line is that you treat your private, locally hosted AI instance exactly like a cloud API endpoint from anywhere in the Vinkius catalog.

1. Subscribe to this MCP, providing your LocalAI Base URL (e.g., http://localhost:8080) and an optional API Key.
2. Your AI client connects using the provided credentials, establishing a secure link to your local models.
3. You interact with the system through your agent, triggering actions like text generation or image synthesis as if it were any other online service.

## Frequently Asked Questions

**How do I start using LocalAI with chat_completions?**
You first connect your client to this MCP and ensure you have a local LLM installed via `apply_model`. Then, your agent can call the `chat_completions` tool just like it would any other API.

**Can I run image generation if my data needs to stay private?**
Yes. By using the MCP, you leverage local models for media creation. You simply call `generate_image`, and the visual content is processed entirely on your own hardware.

**What's the difference between face_identify and face_verify?**
Face verification (`face_verify`) confirms if a single unknown face matches a known person (1:1). Face identification (`face_identify`) determines who a person is by comparing their face against many registered identities (1:N).

**Does LocalAI help me search my documents better?**
Absolutely. Instead of basic keyword searches, you use `create_embeddings` to build searchable vectors from your documents and then use `rerank_documents` to improve the relevance of retrieved results.

**How do I make sure my audio files are processed correctly?**
You must first pass the file path or raw data through the `transcribe_audio` tool. This converts the speech into text, which you can then use with any of the other chat tools.