# AudioStack MCP for AI Agents

> AudioStack lets your AI agents run a complete audio production studio from natural conversation. It generates professional, high-quality speech using over 700 synthetic voices and handles complex mixing and mastering for content creators and ad agencies alike.

## Overview
- **Category:** image-video
- **Price:** Free
- **Tags:** ai-voice, text-to-speech, audio-production, synthetic-media, audio-mixing

## Description

Need to build audio assets? This MCP connects your agent directly to AudioStack, turning plain text commands into finished audio tracks. You can generate studio-quality voiceovers in dozens of languages from a library of more than 700 synthetic voices. It goes well beyond basic text-to-speech: you describe what you want (a story, an ad, or a complex soundscape) and the system assembles the whole piece, which suits content creators who need assets quickly. By connecting AudioStack via Vinkius, your AI agent gains access to professional mixing and mastering tools that handle everything from voice tracks to background music templates. You talk through the project goals, and the system produces audio files ready for distribution.

## Tools

### get_voice_details
Retrieves specific details about a chosen synthetic voice for confirmation and usage planning.

### list_media_files
Shows you all the audio files you've uploaded or generated through your account history.

### list_sound_templates
Provides an inventory of available music and sound design templates ready for use in a project mix.

### list_voices
Searches the entire voice library, allowing you to filter by language, gender, or specific provider characteristics.
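
At the protocol level, an MCP client invokes any of these tools by sending a JSON-RPC 2.0 `tools/call` request. The sketch below shows roughly what a `list_voices` call could look like. The argument names `language` and `gender` are assumptions made for illustration; this listing does not document the tool's input schema, so check the schema your client reports before relying on them.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical filter arguments; the real names come from the tool's schema.
request = build_tool_call(1, "list_voices", {"language": "es", "gender": "female"})
print(json.dumps(request, indent=2))
```

In practice your AI client builds this message for you; the sketch only makes the shape of the request visible.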

### text_to_speech
Converts any given text string into spoken audio using a selected AI voice model.

### create_audioform
Assembles and generates a fully mixed audio piece by combining multiple elements like music, voices, and sound effects.

### create_mix
Applies professional mixing and mastering techniques to existing or newly generated audio tracks automatically.

### create_story
Builds a complete, long-form narrative audio piece optimized for podcasting or serialized content.

### get_audioform
Checks the status and ultimately retrieves the final URL for an audio production you requested.
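
Because `get_audioform` reports status before it returns a final URL, a caller will usually poll it. The following is a minimal sketch of that loop, not AudioStack's documented behavior: the `audioform_id` argument, the `status` and `url` result fields, and the `"processing"`, `"complete"`, and `"failed"` values are all hypothetical, and `call_tool` stands in for whatever function your MCP client exposes.

```python
import time


def wait_for_audioform(call_tool, audioform_id, poll_seconds=2.0, max_attempts=30):
    """Poll `get_audioform` until the production finishes, then return its URL."""
    for _ in range(max_attempts):
        # Hypothetical argument and result field names.
        result = call_tool("get_audioform", {"audioform_id": audioform_id})
        status = result.get("status")
        if status == "complete":
            return result["url"]
        if status == "failed":
            raise RuntimeError(f"audioform {audioform_id} failed")
        time.sleep(poll_seconds)
    raise TimeoutError(f"audioform {audioform_id} not ready after {max_attempts} polls")


def make_fake_client(pending_polls):
    """Stand-in for a real MCP client: reports 'processing' N times, then completes."""
    remaining = {"n": pending_polls}

    def call_tool(name, arguments):
        if remaining["n"] > 0:
            remaining["n"] -= 1
            return {"status": "processing"}
        return {"status": "complete", "url": "https://example.com/audio.mp3"}

    return call_tool


print(wait_for_audioform(make_fake_client(2), "af_123", poll_seconds=0))
```

Capping the number of attempts keeps a stalled production from blocking the agent indefinitely.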

### get_usage_analytics
Provides a metric breakdown of your account's usage history to track costs and consumption.

## Prompt Examples

**Prompt:** 
```
I need an ad for my new coffee blend. Use the 'Friendly' voice in Spanish. Make it 45 seconds long.
```

**Response:** 
```
**Ad Script: El café perfecto.** 

*(Sound of gentle espresso machine running)* 
**Voice:** ¿Cansado del mismo sabor? Descubre nuestro Blend Élite, el toque de energía que tu mañana necesita. Desde el primer sorbo hasta el último, te acompaña con notas profundas y un aroma irresistible.

*[Mixing applied: Gentle jazz background music fades in and out.]*

**Status:** Complete. Audio file available here. Use `create_mix` if you need to change the genre or tempo.
```

**Prompt:** 
```
Can you list all the sound templates related to corporate video intros?
```

**Response:** 
```
**Available Sound Templates: Corporate Intro Pack**

| Template ID | Style | Duration (sec) | Usage Notes |
| :--- | :--- | :--- | :--- |
| `T901` | Inspiring Cinematic | 20 | Best for company mission statements. |
| `T902` | Modern Minimalist | 15 | Use with quick cuts and sharp voice delivery. |
| `T903` | Uplifting Orchestral | 30 | Excellent choice for major product launches. |

Please let me know which one you want to use in your next audioform.
```

**Prompt:** 
```
What are the best voices for a serious, documentary-style narration about history?
```

**Response:** 
```
I filtered the voice library based on depth and professionalism. Here are three top recommendations:

* **Voice ID: v312** (Deep Male): Excellent gravitas; perfect for historical documentaries.
* **Voice ID: v709** (Mature Female): Clear, authoritative tone; ideal for academic lectures.
* **Voice ID: v450** (Neutral/Universal): Highly flexible and consistent; works well regardless of the topic.

Would you like a sample readout for any of these?
```

## Capabilities

### Generate High-Quality Speech
Produce realistic speech recordings using a deep library of over 700 synthetic voices across multiple languages.

### Compose Complex Audio Productions
Build multi-layered audio files that combine voice, music, and sound effects into one cohesive unit.

### Automate Mixing and Mastering
Apply professional industry standards to mixed audio tracks, handling equalization, compression, and final polish automatically.

## Use Cases

### Localizing Global Ad Campaigns
An ad agency needs to run a campaign in five different countries. Instead of booking voice actors, they prompt the agent: 'Generate the same script in all 5 languages using professional male voices.' The system makes one `text_to_speech` call per locale and aims for a consistent tone across all of them.
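
The fan-out in this scenario can be sketched as one planned `text_to_speech` call per locale. This is an illustration under stated assumptions: the argument names `text`, `voice`, and `language` are hypothetical, the voice IDs are placeholders, and the scripts are assumed to be already translated.

```python
def plan_localized_calls(scripts_by_locale, voice_by_locale):
    """Return one `text_to_speech` tool call per locale."""
    calls = []
    for locale, script in scripts_by_locale.items():
        calls.append({
            "name": "text_to_speech",
            # Hypothetical argument names; check the tool's input schema.
            "arguments": {
                "text": script,
                "voice": voice_by_locale[locale],
                "language": locale,
            },
        })
    return calls


# Placeholder scripts and voice IDs for illustration only.
scripts = {
    "en": "Discover our new coffee blend.",
    "es": "Descubre nuestro nuevo café.",
    "fr": "Découvrez notre nouveau café.",
}
voices = {"en": "voice_en_male", "es": "voice_es_male", "fr": "voice_fr_male"}

for call in plan_localized_calls(scripts, voices):
    print(call["arguments"]["language"], "->", call["arguments"]["voice"])
```

Planning the calls as plain data before sending them makes it easy to review which voice is paired with which locale.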

### Creating Educational Course Material
A curriculum designer needs a module on particle physics. They ask the agent to 'Create an audio story about quantum entanglement.' The system uses `create_story`, pulling in appropriate sound effects and background music from templates, resulting in a ready-to-use podcast chapter.

### Developing Interactive Video Content
A video editor needs a trailer that mixes voiceover with specific ambient sounds. They instruct the agent to 'Mix this script (voice) with the forest ambience template.' The system calls `create_audioform` and delivers a polished, single-file asset.

### Testing Voice Variations for Characters
A game developer needs five distinct character voices. They use `list_voices` to browse options, then run quick tests using `text_to_speech` on a sample line ('Welcome to the village') with each voice until they find the right fit.

## Benefits

- Scale content output instantly. Instead of spending hours recording voiceovers, you use `text_to_speech` to generate thousands of words across multiple languages in minutes.
- Produce broadcast-quality assets. Every call to `create_mix` applies automated mixing and mastering that would usually require a dedicated studio engineer.
- Build complex media without code. Describe the piece you want, and `create_audioform` combines voices, music templates, and effects into a single, cohesive asset.
- Manage your assets efficiently. You can use `list_voices` to quickly find the perfect voice for a script or `list_sound_templates` to pull background music ideas.
- Keep track of everything you make. Use `get_usage_analytics` and `list_media_files` to maintain a clean, auditable record of every piece of content generated.

## How It Works

You skip the manual studio work: tell your AI what to make, and it handles the entire production chain.

1. Subscribe to the AudioStack MCP and enter your API key in your preferred AI client.
2. Prompt your agent with a natural language request, such as 'Generate a 60-second ad for coffee using a friendly male voice in Spanish.'
3. The agent executes the necessary steps—voice selection, audioform creation, mixing, and mastering—and returns the final audio file URL.
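
The chain in step 3 can be sketched as a sequence of tool calls. The tool names come from the Tools section above, but the call order, every argument name, and every result field here are assumptions for illustration; `call_tool` and `FakeClient` are hypothetical stand-ins for a real MCP client.

```python
def produce_ad(call_tool, script, language):
    """Run a voice -> audioform -> mix -> fetch chain and return the final URL."""
    # Voice selection: this sketch simply takes the first match.
    voices = call_tool("list_voices", {"language": language})
    voice_id = voices["voices"][0]["id"]

    # Audioform creation from the script and the chosen voice.
    audioform = call_tool("create_audioform", {"script": script, "voice_id": voice_id})

    # Mixing and mastering of the generated production.
    call_tool("create_mix", {"audioform_id": audioform["id"]})

    # Retrieve the finished file's URL.
    final = call_tool("get_audioform", {"audioform_id": audioform["id"]})
    return final["url"]


class FakeClient:
    """Records tool calls and returns canned responses in place of a real server."""

    def __init__(self):
        self.calls = []

    def __call__(self, name, arguments):
        self.calls.append(name)
        canned = {
            "list_voices": {"voices": [{"id": "voice_1"}]},
            "create_audioform": {"id": "af_1"},
            "create_mix": {"id": "mix_1"},
            "get_audioform": {"status": "complete", "url": "https://example.com/ad.mp3"},
        }
        return canned[name]


client = FakeClient()
print(produce_ad(client, "Try our new coffee blend.", "es"))
print(client.calls)
```

In a real session the agent decides this sequence itself from your prompt; the sketch only shows which tools are likely involved.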

## Frequently Asked Questions

**How do I use AudioStack MCP to generate voiceovers in multiple languages?**
You simply instruct your agent with the script and the target language. The system handles calling `text_to_speech` for each locale, ensuring consistent quality and tone across all versions.

**Can AudioStack help me mix my own recorded audio files?**
Yes. You can upload your tracks and use `create_mix` to apply professional mastering techniques. It levels out volume, removes background noise, and applies EQ so everything sounds cohesive.

**Is AudioStack better than just using a basic text-to-speech generator?**
Definitely. Basic generators only handle speech. With `create_audioform`, this MCP lets you combine voices with music, effects, and templates into one finished asset.

**What kind of content can I create using AudioStack MCP?**
You can build almost anything: educational courses, localized marketing ads, narrative podcasts, or even interactive audio dramas. It’s an end-to-end studio in one place.

**How do I manage all the voices and templates? Is it hard to find what I need?**
The MCP provides asset management tools. Use `list_voices` to search the voice library and `list_sound_templates` to browse music and sound design templates, so you always know which assets are available for your project.

**If I'm a developer, how do I integrate this into my app?**
You connect the MCP to your development workflow. Your agent can then use natural language commands to trigger audio generation and pull the resulting media file directly into your application logic.