# ElevenLabs MCP

> ElevenLabs MCP generates lifelike speech from text using neural voice synthesis. It lets you work with a library of standard and custom-cloned voices and manage your entire audio history programmatically through any AI client.

## Overview
- **Category:** ai-frontier
- **Price:** Free
- **Tags:** elevenlabs, text-to-speech, voice-cloning, tts-api, audio-generation, neural-audio, multilingual-voices, mcp

## Description

This connector gives your agent control over professional-grade audio production. You can send raw text and get back high-fidelity audio files, whether you're creating video voiceovers or building an automated notification system. It handles the complex parts: selecting the right model, ensuring consistent branding by accessing cloned voices, and keeping a detailed record of every file generated. When you connect this through Vinkius, your agent becomes a dedicated studio producer for all things audio.

## Tools

### delete_history_item
Removes a specific recorded audio file from the generation history.

### delete_voice
Permanently deletes a custom-cloned voice profile you created.

### get_download_link
Provides a direct, temporary URL to download any specific audio file.

### get_history_item
Fetches the detailed metadata for one specific entry in your generation history log.

### get_subscription_info
Checks your current usage metrics, remaining credits, and billing plan details.

### get_account_info
Retrieves general details about your user account and subscription status.

### get_voice_settings
Reads the fine-tuning parameters that control how a voice sounds.

### get_voice
Retrieves detailed information about a specific voice profile's characteristics.

### list_audio_history
Lists all recorded audio generation events, providing an overview of what has been created.

### list_models
Shows the currently available neural audio models for selection (e.g., Multilingual v2).

### list_voices
Provides a comprehensive list of all voices, both standard and custom-cloned.

### text_to_speech
Converts specified text content into an audio file using the chosen voice and model.
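
As a rough illustration of what an AI client sends when it invokes one of these tools, the sketch below builds an MCP `tools/call` request as a JSON-RPC 2.0 message. The envelope follows the MCP specification; the argument names (`text`, `voice_id`, `model_id`) and the `build_tool_call` helper are assumptions for illustration, so check the tool's published input schema before relying on them.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects a unique id per request.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call(
    "text_to_speech",
    {
        # Hypothetical argument names; the real schema may differ.
        "text": "Welcome to the world of neural audio",
        "voice_id": "YOUR_VOICE_ID",
        "model_id": "eleven_multilingual_v2",
    },
)
print(json.dumps(request, indent=2))
```

In practice the AI client builds this message for you; the sketch only shows the shape of the call.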

## Prompt Examples

**Prompt:** 
```
Convert this text to speech using 'Bella' voice: 'Welcome to the world of neural audio'.
```

**Response:** 
```
Audio generation triggered! I've converted your text using the 'Bella' voice (ID: EXAV...vXU). You can find the record in your history or I can retrieve the download link for you.
```

**Prompt:** 
```
List all my available voices in ElevenLabs.
```

**Response:** 
```
I've retrieved your voice library. You have access to 10 standard voices and 2 custom cloned voices ('CEO Voice' and 'Narrator'). Which one would you like to use?
```

**Prompt:** 
```
Check my remaining character limits for this month.
```

**Response:** 
```
Scanning subscription... You have 15,420 characters remaining out of your 30,000 monthly limit. Your next reset is in 12 days. Would you like to view your generation history?
```
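
The figures in a response like the one above can be derived from a subscription payload with simple arithmetic. The sketch below assumes `get_subscription_info` returns the fields `character_count`, `character_limit`, and `next_character_count_reset_unix`; that shape is an assumption, and the sample values are chosen to match the example response.

```python
from datetime import datetime, timedelta, timezone


def summarize_usage(subscription: dict, now: datetime) -> dict:
    """Derive remaining characters and days until reset from a subscription payload."""
    used = subscription["character_count"]
    limit = subscription["character_limit"]
    reset_at = datetime.fromtimestamp(
        subscription["next_character_count_reset_unix"], tz=timezone.utc
    )
    return {
        "remaining": max(limit - used, 0),
        "percent_used": round(100 * used / limit, 1) if limit else 0.0,
        "days_until_reset": max((reset_at - now).days, 0),
    }


# Assumed payload: 14,580 characters used out of 30,000, reset in 12 days.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
sample = {
    "character_count": 14_580,
    "character_limit": 30_000,
    "next_character_count_reset_unix": int((now + timedelta(days=12)).timestamp()),
}
usage = summarize_usage(sample, now)
print(usage)
```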

## Capabilities

### Generate Audio from Text
Convert any block of text into an audio file using multiple neural models and voices.

### Manage Voice Library
Access and list standard and custom-cloned voice profiles, and delete custom-cloned ones you no longer need, to maintain brand consistency.

### Track Usage and History
Retrieve detailed records of all past audio generations and monitor your remaining character count and subscription limits.

### Get Download Links
Fetch direct URLs for any previously generated audio file, bypassing manual download steps.

### Configure Voice Settings
Read the fine-tuning parameters of a voice and review the available models before generating audio.

## Use Cases

### Updating product tutorials for new clients
A technical writer needs to update a complex guide. They send the raw text and ask their agent to use the 'CEO Voice' profile, triggering `text_to_speech`. The resulting audio is then passed directly into the documentation build pipeline.

### Building an automated IVR system
A developer needs a voice for customer service. They first call `list_voices` to check the available voices, then use `get_voice_settings` to review the chosen voice's fine-tuning parameters, before passing the script through `text_to_speech`.
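
The IVR flow can be sketched as a small planning function that resolves a voice name against the output of `list_voices` and returns the follow-up tool calls in order. The `plan_ivr_calls` helper, the `name` and `voice_id` fields, and the argument names are hypothetical; the real payloads may differ.

```python
def plan_ivr_calls(script: str, voice_name: str, voices: list[dict]) -> list[tuple[str, dict]]:
    """Return the ordered (tool, arguments) pairs to run after `list_voices`."""
    voice = next(
        (v for v in voices if v["name"].lower() == voice_name.lower()), None
    )
    if voice is None:
        raise ValueError(f"No voice named {voice_name!r} in the library")
    voice_id = voice["voice_id"]
    return [
        # Inspect the voice's parameters before generating audio.
        ("get_voice_settings", {"voice_id": voice_id}),
        # Then convert the IVR script with the same voice.
        ("text_to_speech", {"voice_id": voice_id, "text": script}),
    ]


# Placeholder library, standing in for a `list_voices` result.
voices = [
    {"name": "Narrator", "voice_id": "voice-001"},
    {"name": "CEO Voice", "voice_id": "voice-002"},
]
plan = plan_ivr_calls("Thanks for calling. Press 1 for support.", "ceo voice", voices)
for tool, arguments in plan:
    print(tool, arguments)
```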

### Auditing audio assets post-campaign
A marketing team needs to know how many credits they spent last month. They call `list_audio_history`, which shows all past generations, followed by a check with `get_subscription_info` for the final balance.

## Benefits

- Instantly create voiceovers: Instead of manually copy-pasting text into a web editor, your agent sends the content directly to the `text_to_speech` tool and gets an audio file back.
- Maintain brand consistency: Use `list_voices` and `get_voice` to manage your complete library. Pointing every request at the same cloned voice keeps generated audio sounding like your company's spokesperson.
- Keep track of credits: Use `get_subscription_info` to check remaining character limits or `get_account_info` for an overview of your account, without leaving your workspace.
- Full audit trail: You can use `list_audio_history` and then `get_history_item` to review exactly what was generated, when it happened, and who triggered it.
- Clean up assets easily: If a voice or an old recording is no longer needed, you can use `delete_voice` or `delete_history_item` to clear out clutter.

## How It Works

In short, you tell your AI client which voice and text to use, and it handles the backend API calls needed for generation.

1. First, you subscribe to this MCP connection and retrieve your API Key from your ElevenLabs account.
2. Next, instruct your AI agent to perform a task, like converting text or listing available voices. The MCP uses the key to communicate with the service.
3. Finally, your agent returns structured data, which could be an audio file link, voice metadata, or usage stats.
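
To make the last step concrete, here is a minimal sketch of how a client might unpack a tool result. The `content` list and `isError` flag follow the MCP specification for `tools/call` results; the `extract_text` helper and the sample payload are illustrative assumptions, not output captured from this connector.

```python
def extract_text(result: dict) -> str:
    """Join the text parts of an MCP tool result, raising if the tool reported an error."""
    parts = [
        item["text"]
        for item in result.get("content", [])
        if item.get("type") == "text"
    ]
    text = "\n".join(parts)
    if result.get("isError"):
        raise RuntimeError(text or "tool call failed")
    return text


# Hypothetical result for a `list_models` call.
sample_result = {
    "content": [{"type": "text", "text": "eleven_multilingual_v2"}],
    "isError": False,
}
print(extract_text(sample_result))
```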

## Frequently Asked Questions

**How do I make sure my brand sounds consistent using ElevenLabs MCP?**
Use `list_voices` to see all available profiles. If you have a custom voice, specify its voice ID on every `text_to_speech` call so that all generated audio uses the same voice.

**Can I check my credit usage with ElevenLabs MCP?**
Yes. You call `get_subscription_info` to get an immediate readout of your remaining character count and billing cycle details, all within your agent's response.

**What is the difference between list_audio_history and get_download_link?**
`list_audio_history` gives you a summary log (a list of what was done). `get_download_link`, however, provides the actual URL needed to grab the finished audio file.
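
The two tools are typically chained: pick an entry from the history list, then request its link. The sketch below assumes each history entry carries `history_item_id` and `date_unix` fields and that `get_download_link` accepts a `history_item_id` argument; all three names are assumptions.

```python
def latest_download_request(history: list[dict]) -> dict:
    """Build the `get_download_link` call for the most recent history entry."""
    if not history:
        raise ValueError("history is empty")
    latest = max(history, key=lambda item: item["date_unix"])
    return {
        "name": "get_download_link",
        "arguments": {"history_item_id": latest["history_item_id"]},
    }


# Placeholder entries, standing in for a `list_audio_history` result.
history = [
    {"history_item_id": "item-a", "date_unix": 1_700_000_000},
    {"history_item_id": "item-b", "date_unix": 1_700_086_400},
]
call = latest_download_request(history)
print(call)
```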

**How do I delete an old voice using ElevenLabs MCP?**
You must first confirm which profile you want to remove. Then, use `delete_voice` and specify the exact ID of the voice you are deleting.
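
Because deletion is permanent, a client may want a guard in front of the call. This sketch refuses to build the request unless the caller repeats the voice name, and it skips voices marked as premade. The `build_delete_request` helper and the `category`, `name`, and `voice_id` fields are assumptions about what `get_voice` returns.

```python
def build_delete_request(voice: dict, confirmed_name: str) -> dict:
    """Build a `delete_voice` call only after an explicit name confirmation."""
    if voice.get("category") == "premade":
        raise ValueError("Standard voices cannot be deleted")
    if voice["name"] != confirmed_name:
        raise ValueError("Confirmation does not match the voice name")
    return {"name": "delete_voice", "arguments": {"voice_id": voice["voice_id"]}}


# Placeholder voice record, standing in for a `get_voice` result.
old_voice = {"name": "Narrator", "voice_id": "voice-001", "category": "cloned"}
delete_call = build_delete_request(old_voice, confirmed_name="Narrator")
print(delete_call)
```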

**When I use `list_models`, how do I choose the best audio quality for my content?**
The agent presents a list of the available neural models so you can select one based on your needs. For instance, if your content is multilingual, a multilingual model such as `eleven_multilingual_v2` is generally the better fit for consistent tone across languages.

**What happens with `text_to_speech` if I input text that exceeds my character limit?**
The tool won't fail silently. Instead, your AI client reports an explicit error stating that your character quota has been exceeded. You can then call `get_subscription_info` to see how many characters remain and when your usage resets.
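
A client can also avoid the error by checking the text length before calling `text_to_speech`. The sketch below is a simple pre-flight check that assumes usage is counted one unit per character; actual accounting may vary by model and plan, and the `check_quota` helper is hypothetical.

```python
def check_quota(text: str, remaining_characters: int) -> tuple[bool, int]:
    """Return (fits, overflow): whether the text fits and by how many characters it exceeds."""
    overflow = len(text) - remaining_characters
    return overflow <= 0, max(overflow, 0)


# 15,420 characters remaining, as in the subscription example.
fits, overflow = check_quota("a" * 16_000, remaining_characters=15_420)
if not fits:
    print(f"Text is {overflow} characters over the remaining quota")
```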

**I need to remove a specific audio generation record; what does `delete_history_item` do?**
This tool permanently deletes one specified entry from your audit log. It's useful for maintaining privacy or cleaning up records for content you no longer need visible in your history.

**How can I access voice fine-tuning options using `get_voice_settings`?**
You use this tool to read the parameters stored for a voice, such as stability, similarity boost, and style, before running a new text conversion.

**How do I find my ElevenLabs API Key?**
Log in to your account, click your profile icon (bottom left), and navigate to the API Key section to generate or copy your token.

**Which model should I use for multiple languages?**
The `eleven_multilingual_v2` model is recommended for high-quality speech generation in 29 languages.

**Can I get a direct download link for a past generation?**
Yes! Use the `get_download_link` tool with a history item ID to retrieve a temporary URL for the audio file.