# Lovo AI (Genny) MCP

> Lovo AI (Genny TTS & Voice Synthesis API) connects text to lifelike speech. Your agent uses this server to generate high-quality, controllable voiceovers for videos, podcasts, or ad copy using hundreds of premium voices and granular style controls.

## Overview
- **Category:** productivity
- **Price:** Free
- **Tags:** text-to-speech, voice-synthesis, ai-audio, genny, lovo-ai

## Description

**Lovo AI (Genny) TTS Server - Voice Synthesis API** lets your agent transform plain text into lifelike speech. You'll use this server to generate high-quality, controllable voiceovers for anything—videos, podcasts, or ad copy—by leveraging hundreds of premium voices and granular style controls.

To get started, you first need a handle on what's available. Run `list_voices` to see the full catalog of available AI speakers. You can filter the list by language or gender to narrow your search right away.
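
Here is a minimal sketch of how a helper on the agent side might assemble the filter arguments for `list_voices`. The argument names `language` and `gender` and the sample values are assumptions for illustration; check the input schema your MCP client reports for the real names.

```python
from typing import Optional


def build_list_voices_args(language: Optional[str] = None,
                           gender: Optional[str] = None) -> dict:
    """Build arguments for a `list_voices` call, leaving out unset filters.

    `language` and `gender` are assumed argument names, not confirmed ones.
    """
    args = {}
    if language:
        args["language"] = language
    if gender:
        args["gender"] = gender
    return args


# No filters: ask for the full catalog.
print(build_list_voices_args())
# Narrow to one language and gender (illustrative values).
print(build_list_voices_args(language="en-US", gender="female"))
```

Leaving unset filters out of the payload, rather than sending empty strings, keeps the unfiltered call identical to a plain catalog request.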

Once you know which voices exist, you need to pick the right one. Use `get_speaker` when you want detailed metadata on a specific speaker ID. It returns that voice's capabilities and parameters, so you can nail the tone for your content before committing to an audio file.

When you're ready, you submit the actual text using `create_tts_job`. You pass in the source material along with specific synthesis parameters—things like desired speed or emotional style. This job submission doesn't give you the audio; it gives you a unique Job ID. Keep that ID safe because you need it next.
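
As a rough sketch, the submission payload could be assembled like this. The optional `speed` and `style` parameters come from this page; the key names `text` and `speaker_id`, the placeholder speaker ID, and the sanity checks are assumptions, not the server's confirmed schema.

```python
from typing import Optional


def build_tts_job_args(text: str, speaker_id: str,
                       speed: Optional[float] = None,
                       style: Optional[str] = None) -> dict:
    """Assemble `create_tts_job` arguments.

    `text` and `speaker_id` are assumed key names; `speed` and `style`
    are optional and only sent when set.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    # Assumed sanity check: a non-positive speed makes no sense.
    if speed is not None and speed <= 0:
        raise ValueError("speed must be a positive number")
    args = {"text": text, "speaker_id": speaker_id}
    if speed is not None:
        args["speed"] = speed
    if style is not None:
        args["style"] = style
    return args


# "SPEAKER_ID_PLACEHOLDER" stands in for an ID returned by `list_voices`.
args = build_tts_job_args("Welcome to the future of AI",
                          "SPEAKER_ID_PLACEHOLDER",
                          speed=1.1, style="cheerful")
print(args)
```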

Because generating high-quality speech takes time, you don't get the file right away. Call `get_tts_job` with your saved Job ID. This tool checks the status of the job: whether the audio is still processing in the background, whether something failed and needs a retry, or whether it's finally ready for download.
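
The polling step might look like the sketch below. It takes the status check as an injected function so it runs offline; the response fields `status` and `url`, the `failed` status value, and the default interval and timeout are all assumptions. Status strings are compared case-insensitively because their exact spelling is not confirmed here.

```python
import time
from typing import Callable


def wait_for_job(get_job: Callable[[str], dict], job_id: str,
                 interval: float = 5.0, timeout: float = 300.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll `get_job(job_id)` until the job completes, fails, or times out."""
    waited = 0.0
    while True:
        job = get_job(job_id)
        status = str(job.get("status", "")).lower()
        if status in ("completed", "complete"):
            return job
        if status == "failed":  # assumed failure status
            raise RuntimeError(f"TTS job {job_id} failed: {job}")
        if waited >= timeout:
            raise TimeoutError(f"TTS job {job_id} still {status!r} after {timeout}s")
        sleep(interval)
        waited += interval


# Stand-in for the real tool call: two pending checks, then success.
_responses = iter([
    {"status": "PROCESSING"},
    {"status": "PROCESSING"},
    {"status": "COMPLETED", "url": "https://example.invalid/audio.wav"},
])
result = wait_for_job(lambda _id: next(_responses), "job_789abc",
                      sleep=lambda _s: None)
print(result["status"])
```

Passing `sleep` in as a parameter lets you test the loop without real delays.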

The whole process flows like this: first, run `list_voices` to browse the field; next, use `get_speaker` on the ID of the voice you picked to fine-tune your requirements; then, feed everything into `create_tts_job`; and last, monitor the outcome with `get_tts_job`. You're managing the entire voice production pipeline conversationally through your agent.

For instance, say you need a very specific sound, like an upbeat corporate announcer in Mandarin. You first run `list_voices` to find available profiles. Then, you pull up the most promising profile using `get_speaker` so you can confirm its emotional range and what kind of stylistic controls it supports. Once confirmed, you submit your script via `create_tts_job`, setting the speed parameter and any emotional style tags you need. The API returns a Job ID immediately. You'll then keep calling `get_tts_job` until the status field flips from 'PROCESSING' to 'COMPLETED', at which point your agent gets the final link to download the finished audio file.

## Tools

### create_tts_job
Submits a text-to-speech synthesis job, returning a unique Job ID.

### get_tts_job
Checks the status of a previously submitted job using its Job ID, providing the final audio link when done.

### get_speaker
Retrieves detailed metadata for one specific voice ID to check its capabilities and parameters.

### list_voices
Provides an exhaustive list of all available voice profiles, allowing filtering by language or gender.

## Prompt Examples

**Prompt:** 
```
List all available English voices in Lovo AI.
```

**Response:** 
```
I've retrieved the list of voices. You have access to speakers like 'Sophia' (ID: 632...), 'Mateo' (ID: 634...), and many others. Would you like to filter them by style or gender?
```

**Prompt:** 
```
Create a TTS job for the text 'Welcome to the future of AI' using speaker ID 632c... with a cheerful style.
```

**Response:** 
```
The TTS job has been submitted successfully. The Job ID is 'job_789abc'. I will monitor the status for you. Would you like me to check if it's finished now?
```

**Prompt:** 
```
Check the status and get the audio link for job ID job_789abc.
```

**Response:** 
```
The job 'job_789abc' is now complete! You can access your audio file here: [Audio URL]. The synthesis used the 'cheerful' style as requested.
```

## Capabilities

### Discover available voices
Run `list_voices` to get a catalog of all available AI speakers, filtering by language or gender.

### Check voice details
Use `get_speaker` to pull deep metadata on a specific speaker ID so you can nail the perfect tone for your content.

### Create audio files
Run `create_tts_job` to submit text and parameters (like speed or emotional style) and get an active Job ID back.

### Track job status
Use `get_tts_job` with a Job ID. This tells you if the audio is still processing, failed, or ready to download.

## Use Cases

### Building a multilingual podcast series
The user needs five versions of a script in different languages. Instead of making manual API calls, they ask their agent to first run `list_voices` to confirm that voices exist for every required language. Then, the agent loops through and runs `create_tts_job` for each version, managing multiple jobs concurrently.
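
A sketch of that loop, assuming a generic `call_tool(name, arguments)` function supplied by your MCP client. The argument keys `text` and `speaker_id`, the `job_id` response field, and the placeholder speaker IDs are hypothetical.

```python
from typing import Callable, Dict


def submit_translations(call_tool: Callable[[str, dict], dict],
                        scripts: Dict[str, str],
                        speakers: Dict[str, str]) -> Dict[str, str]:
    """Submit one `create_tts_job` per language; return language -> Job ID."""
    missing = sorted(set(scripts) - set(speakers))
    if missing:
        raise ValueError(f"no speaker chosen for: {', '.join(missing)}")
    job_ids = {}
    for language, text in scripts.items():
        response = call_tool("create_tts_job",
                             {"text": text, "speaker_id": speakers[language]})
        job_ids[language] = response["job_id"]  # assumed response field
    return job_ids


# Fake tool caller so the sketch runs offline; it hands out sequential IDs.
_counter = iter(range(1, 100))


def fake_call_tool(name: str, arguments: dict) -> dict:
    assert name == "create_tts_job"
    return {"job_id": f"job_{next(_counter)}"}


jobs = submit_translations(
    fake_call_tool,
    scripts={"en": "Hello", "es": "Hola"},
    speakers={"en": "SPEAKER_EN", "es": "SPEAKER_ES"},
)
print(jobs)
```

Because each submission returns right away with a Job ID, a plain loop is enough to get several jobs running at once on the server side.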

### Automating ad copy generation
A marketing team needs 20 different audio versions of an ad script using a professional male voice. They run the text through `create_tts_job` and then continuously poll status using `get_tts_job`, getting confirmation when all 20 assets are ready for download.
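
Tracking a batch like this could be sketched as follows. The status check is injected so the example runs offline; the `status` field, the `failed` value, and the round limit are assumptions.

```python
import time
from typing import Callable, Dict, Iterable


def wait_for_all(get_job: Callable[[str], dict], job_ids: Iterable[str],
                 interval: float = 5.0, max_rounds: int = 60,
                 sleep: Callable[[float], None] = time.sleep) -> Dict[str, dict]:
    """Poll every pending job once per round until all reach a final state."""
    pending = list(job_ids)
    finished: Dict[str, dict] = {}
    for _ in range(max_rounds):
        still_pending = []
        for job_id in pending:
            job = get_job(job_id)
            status = str(job.get("status", "")).lower()
            if status in ("completed", "complete", "failed"):
                finished[job_id] = job
            else:
                still_pending.append(job_id)
        pending = still_pending
        if not pending:
            return finished
        sleep(interval)
    raise TimeoutError(f"jobs still pending: {pending}")


# Offline stand-in: each job reports PROCESSING once, then COMPLETED.
_seen = set()


def fake_get_job(job_id: str) -> dict:
    if job_id in _seen:
        return {"status": "COMPLETED"}
    _seen.add(job_id)
    return {"status": "PROCESSING"}


results = wait_for_all(fake_get_job, [f"job_{n}" for n in range(20)],
                       sleep=lambda _s: None)
print(len(results))
```

Finished jobs drop out of the pending list, so later rounds only re-check the jobs that still need it.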

### Creating character dialogue for a game
A developer needs to ensure the voice used for the 'villain' NPC has an overly dramatic tone. They first call `get_speaker` with the candidate voice's speaker ID to verify its emotional range, then use that data when calling `create_tts_job`.

### Verifying speech options before committing
Before submitting a massive job, the user wants to know which voices are available in Mandarin. They start by running `list_voices`, filtering strictly for that language, preventing them from selecting an unsupported profile.

## Benefits

- You don't just get generic audio. By checking the speaker details with `get_speaker`, you ensure the voice has the precise tone (e.g., cheerful, sad) needed for your script.
- The process is controlled. You initiate a job using `create_tts_job` and then use `get_tts_job` to track it until the final audio URL appears. No guessing games here.
- It gives you full voice control. Use `list_voices` to browse hundreds of premium speakers, letting you select exactly the language or style your content needs.
- You bypass manual file management. Everything happens in a single flow: send text, get ID, poll status. Your agent handles the whole queue.
- It works across platforms. You can generate narration for videos, podcasts, or ad copy directly from your script-writing interface via any MCP client.

## How It Works

The bottom line is: you tell your agent what to say and which voice to use, and it tells you when the job's done, all through simple API calls.

1. First, run `list_voices` to browse and select the voice you want.
2. Next, use your agent to call `create_tts_job`, supplying the text, speaker ID, and desired emotional style.
3. Finally, wait a minute or two, then check status using `get_tts_job` until it confirms the final audio URL is available.
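
The three steps above can be sketched end to end as below. It assumes a generic `call_tool(name, arguments)` function from your MCP client and runs against an in-memory fake; every argument key and response field (`voices`, `id`, `job_id`, `status`) is hypothetical.

```python
from typing import Callable


def narrate(call_tool: Callable[[str, dict], dict], text: str,
            language: str, style: str, max_checks: int = 30) -> dict:
    """Run the three steps: pick a voice, submit the job, poll until done."""
    voices = call_tool("list_voices", {"language": language})["voices"]
    if not voices:
        raise LookupError(f"no voices found for {language!r}")
    speaker_id = voices[0]["id"]  # a real agent would choose more carefully

    job_id = call_tool("create_tts_job", {
        "text": text, "speaker_id": speaker_id, "style": style,
    })["job_id"]

    for _ in range(max_checks):
        job = call_tool("get_tts_job", {"job_id": job_id})
        status = str(job["status"]).lower()
        if status in ("completed", "complete"):
            return job
        if status == "failed":  # assumed failure status
            raise RuntimeError(f"job {job_id} failed")
        # A real client would sleep between checks here.
    raise TimeoutError(f"job {job_id} not finished after {max_checks} checks")


class FakeServer:
    """In-memory stand-in for the MCP server so the sketch runs offline."""

    def __init__(self) -> None:
        self.checks = 0

    def __call__(self, name: str, arguments: dict) -> dict:
        if name == "list_voices":
            return {"voices": [{"id": "SPEAKER_ID_PLACEHOLDER"}]}
        if name == "create_tts_job":
            return {"job_id": "job_demo"}
        if name == "get_tts_job":
            self.checks += 1
            done = self.checks >= 3
            return {"status": "COMPLETED" if done else "PROCESSING"}
        raise ValueError(f"unknown tool: {name}")


final = narrate(FakeServer(), "Welcome to the future of AI", "en", "cheerful")
print(final["status"])
```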

## Frequently Asked Questions

**How do I find the right voice using Lovo AI (Genny TTS & Voice Synthesis API)?**
Run `list_voices` first. This gives you a full catalog. Once you've narrowed it down to one speaker, use `get_speaker` to verify its specific capabilities and tones before committing.

**What happens if my TTS job fails after using create_tts_job?**
The server will return an error code or status message. You'll need to review the error details provided by `get_tts_job` and adjust your source text or parameters.

**Can I change the voice after running create_tts_job?**
No, you can't edit a job in progress. You have to run `create_tts_job` again with the new speaker ID and updated text.

**How do I make sure my agent uses American English voices?**
Use `list_voices` and filter by language or check a specific voice's metadata using `get_speaker`. This confirms regional support before generating content.

**What do I need to provide when using the `create_tts_job` tool?**
Besides the text and speaker ID for the job itself, you must supply your Lovo Genny API Key so the agent can connect. This key authenticates all requests, allowing the system to process the text-to-speech job and assign a unique Job ID.

**How do I check if my audio file is ready using `get_tts_job`?**
The response status will change from 'PROCESSING' to 'COMPLETED'. Once the job is marked completed, the tool returns a direct URL. You use this link to access and download your finalized audio asset.

**Are there rate limits when running multiple jobs using `create_tts_job`?**
Yes, Lovo AI enforces usage caps on job submissions. If you exceed the allowed volume, your agent will receive an error code (429). You'll need to pause and wait for the quota to reset.
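
One common way to pause and retry is exponential backoff, sketched below. `RateLimitError` is a hypothetical exception that your own client wrapper would raise on a 429; the delays and attempt count are illustrative, since this page does not say when the quota resets.

```python
import time
from typing import Callable


class RateLimitError(Exception):
    """Hypothetical error a client wrapper raises on a 429 response."""


def submit_with_backoff(submit: Callable[[], dict], max_attempts: int = 5,
                        base_delay: float = 1.0,
                        sleep: Callable[[float], None] = time.sleep) -> dict:
    """Retry `submit` on RateLimitError, doubling the wait each time."""
    for attempt in range(max_attempts):
        try:
            return submit()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")


# Offline stand-in: rejected twice with a rate limit, then accepted.
_attempts = {"count": 0}


def flaky_submit() -> dict:
    _attempts["count"] += 1
    if _attempts["count"] <= 2:
        raise RateLimitError("429")
    return {"job_id": "job_demo"}


delays = []
response = submit_with_backoff(flaky_submit, sleep=delays.append)
print(response, delays)
```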

**What specific metadata can I retrieve using the `get_speaker` tool?**
You get more than just a name. The tool returns detailed specs like supported emotional styles, language codes, and pitch ranges. This data helps your agent choose the perfect speaker for niche content.

**How can I find the right voice ID for my project?**
Use the `list_voices` tool. It returns a comprehensive list of speakers including their IDs, names, and supported styles. You can then use `get_speaker` with a specific ID to see more detailed information.

**Can I adjust the emotion or speed of the generated voice?**
Yes! When using `create_tts_job`, you can provide an optional `speed` (number) and `style` (e.g., 'cheerful', 'sad', 'normal') to customize the output to your needs.

**How do I get the final audio file once the job is submitted?**
After creating a job, use the `get_tts_job` tool with the returned Job ID. Once the status is 'COMPLETED', the response will include the URLs to download your synthesized audio.