Lovo AI (Genny) MCP. Generate professional audio from text in a conversation.
Works with every AI agent you already use, and any MCP-compatible client
Just plug in your AI agents and start using Vinkius.
Lovo AI (Genny TTS & Voice Synthesis API) connects text to lifelike speech. Your agent uses this server to generate high-quality, controllable voiceovers for videos, podcasts, or ad copy using hundreds of premium voices and granular style controls.
What your AI agents can do
Create tts job
Submits a text-to-speech synthesis job, returning a unique Job ID.
Get speaker
Retrieves detailed metadata for one specific voice ID to check its capabilities and parameters.
Get tts job
Checks the status of a previously submitted job using its Job ID, providing the final audio link when done.
Run list_voices to get a catalog of all available AI speakers, filtering by language or gender.
Use get_speaker to pull deep metadata on a specific speaker ID so you can nail the perfect tone for your content.
Run create_tts_job to submit text and parameters (like speed or emotional style) and get an active Job ID back.
Use get_tts_job with a Job ID. This tells you if the audio is still processing, failed, or ready to download.
Lovo AI (Genny) TTS & Voice Synthesis API: 4 Tools
Manage the entire text-to-speech lifecycle—from listing voices to submitting jobs and retrieving final audio links.
create_tts_job
Submits a text-to-speech synthesis job, returning a unique Job ID.
get_speaker
Retrieves detailed metadata for one specific voice ID to check its capabilities and parameters.
get_tts_job
Checks the status of a previously submitted job using its Job ID, providing the final audio link when done.
list_voices
Provides an exhaustive list of all available voice profiles, allowing filtering by language or gender.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on every call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with Lovo AI (Genny TTS & Voice Synthesis API), then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 4,700+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
What you can do with this MCP connector
Lovo AI (Genny) TTS Server - Voice Synthesis API lets your agent transform plain text into lifelike speech. You'll use this server to generate high-quality, controllable voiceovers for anything—videos, podcasts, or ad copy—by leveraging hundreds of premium voices and granular style controls.
To get started, you first need a handle on what's available. Run list_voices to see the full catalog of available AI speakers. You can filter the list by language or gender to narrow your search right away.
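As a rough illustration of what a filtered list_voices call could look like on the wire, here is a minimal sketch that builds an MCP `tools/call` request. The envelope follows the MCP JSON-RPC format; the argument names `language` and `gender` and the value format (`zh-CN`, `female`) are assumptions, since this page does not document the tool's input schema.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Wrap a tool name and arguments in an MCP `tools/call` JSON-RPC request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def list_voices_request(language=None, gender=None, request_id=1):
    """Build a list_voices call, sending only the filters that were set."""
    arguments = {}
    if language is not None:
        arguments["language"] = language  # assumed argument name
    if gender is not None:
        arguments["gender"] = gender  # assumed argument name
    return build_tool_call("list_voices", arguments, request_id)


# Example values are illustrative; the real filter values may differ.
request = list_voices_request(language="zh-CN", gender="female")
print(json.dumps(request, indent=2))
```

In practice your MCP client builds this envelope for you; the sketch only shows which pieces of the call you control.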
Once you know which voices exist, you have to pick the right one. Use get_speaker when you want deep metadata on a specific speaker ID. It returns detailed information about that voice's capabilities and parameters, which is how you nail the perfect tone for your content before committing to an audio file.
When you're ready, you submit the actual text using create_tts_job. You pass in the source material along with specific synthesis parameters—things like desired speed or emotional style. This job submission doesn't give you the audio; it gives you a unique Job ID. Keep that ID safe because you need it next.
Because generating high-quality speech takes time, you don't get the file right away. Instead, call get_tts_job with your saved Job ID. It checks the status of the job and tells you whether the audio is still processing in the background, whether something failed and needs a retry, or whether it's finally ready for download.
The whole process flows like this: first, run list_voices to browse the field; next, use get_speaker on that ID to fine-tune your requirements; then, feed everything into create_tts_job; and last, monitor the outcome with get_tts_job. You're managing the entire voice production pipeline conversationally through your agent.
For instance, if you need a very specific sound—say, an upbeat corporate announcer in Mandarin—you first run list_voices to find available profiles. Then, you pull up that profile using get_speaker so you can confirm its emotional range and what kind of stylistic controls it supports. Once confirmed, you submit your script via create_tts_job, making sure you set the speed parameter precisely how you want it, along with any required emotional style tags.
The API returns a Job ID immediately. You'll then keep hitting get_tts_job until that status field flips from 'PROCESSING' to 'COMPLETED,' giving your agent the final link to download the finished audio file.
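The polling step described above can be sketched as a small loop. This is a sketch under assumptions, not the server's actual client code: `get_job` is a hypothetical callable standing in for however your agent invokes get_tts_job, the response is assumed to be a dict with a `status` field, and the failure status names are guesses. Statuses are compared case-insensitively because the exact casing may vary.

```python
import time

# Assumed status names; only "processing"/"completed" appear on this page.
DONE_STATUSES = {"completed", "complete"}
FAILED_STATUSES = {"failed", "error"}


def wait_for_job(get_job, job_id, interval=5.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_job(job_id) until the job completes, fails, or times out."""
    deadline = clock() + timeout
    while True:
        result = get_job(job_id)
        status = str(result.get("status", "")).lower()
        if status in DONE_STATUSES:
            return result  # expected to carry the final audio link
        if status in FAILED_STATUSES:
            raise RuntimeError(f"TTS job {job_id} failed: {result}")
        if clock() >= deadline:
            raise TimeoutError(f"TTS job {job_id} not ready after {timeout}s")
        sleep(interval)  # still processing: wait before asking again
```

Passing `sleep` and `clock` as parameters keeps the loop easy to exercise without real waiting; in normal use the defaults are fine.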
How Lovo AI (Genny) MCP Works
1. First, run list_voices to browse and select the voice you want.
2. Next, use your agent to call create_tts_job, supplying the text, speaker ID, and desired emotional style.
3. Finally, wait a minute or two, then check status using get_tts_job until it confirms the final audio URL is available.
The bottom line is: you tell your agent what to say, which voice to use, and when the job's done, all through simple API calls.
Who Is Lovo AI (Genny) MCP For?
Content creators who need video narrations fast. Developers integrating high-fidelity speech into apps without manual work. Marketers needing localized audio for ad campaigns. If your workflow involves turning scripts into broadcast-ready audio, you're here.
Content creators need to add professional narration to a script quickly. They use the server to run create_tts_job for voiceover tracks.
Developers integrate speech synthesis into an application's backend. They rely on the full cycle: list_voices -> get_speaker -> create_tts_job.
Marketers need to generate localized audio for social media ads. They run jobs and use get_tts_job to confirm the final output URL is ready for review.
What Changes When You Connect
- You don't just get generic audio. By checking the speaker details with get_speaker, you ensure the voice has the precise tone (e.g., cheerful, sad) needed for your script.
- The process is controlled. You initiate jobs using create_tts_job and then use get_tts_job to track each one until the final audio URL appears. No guessing games here.
- It gives you full voice control. Use list_voices to browse hundreds of premium speakers, letting you select exactly the language or style your content needs.
- You bypass manual file management. Everything happens in a single flow: send text, get ID, poll status. Your agent handles the whole queue.
- It works across platforms. You can generate narration for videos, podcasts, or ad copy directly from your script-writing interface via any MCP client.
Real-World Use Cases
Building a multilingual podcast series
The user needs five versions of a script in different languages. Instead of manual API calls, they ask their agent to first run list_voices to confirm all necessary language packs are available. Then, the agent loops through and runs create_tts_job for each version, managing multiple jobs concurrently.
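A minimal sketch of that loop, assuming a hypothetical `create_job` callable that wraps create_tts_job and returns a dict with an `id` field. The argument names `text` and `speaker` are also assumptions. The submissions here happen one after another; the jobs themselves are then processed on the service side while the agent tracks their IDs.

```python
def submit_batch(create_job, scripts, speaker_for):
    """Submit one TTS job per language and return {language: job_id}.

    scripts maps language -> script text; speaker_for maps language ->
    speaker ID chosen earlier (for example via list_voices).
    """
    # Fail before submitting anything if a language has no speaker chosen.
    missing = [lang for lang in scripts if lang not in speaker_for]
    if missing:
        raise KeyError(f"no speaker chosen for: {', '.join(missing)}")

    job_ids = {}
    for language, text in scripts.items():
        # "text", "speaker", and "id" are assumed field names.
        response = create_job({"text": text, "speaker": speaker_for[language]})
        job_ids[language] = response["id"]
    return job_ids
```

Returning a language-to-ID mapping keeps every Job ID attached to the version it belongs to, which matters once several jobs are in flight.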
Automating ad copy generation
A marketing team needs 20 different audio versions of an ad script using a professional male voice. They run the text through create_tts_job and then continuously poll status using get_tts_job, getting confirmation when all 20 assets are ready for download.
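Tracking a whole batch until every asset is ready might look like the sketch below. As before, `get_job` is a hypothetical stand-in for get_tts_job, and the status names are assumptions compared case-insensitively.

```python
import time

# Assumed names for statuses that mean "stop polling this job".
FINISHED_STATUSES = {"completed", "complete", "failed", "error"}


def poll_all(get_job, job_ids, interval=5.0, max_rounds=60, sleep=time.sleep):
    """Poll every job until each one finishes or max_rounds is reached.

    Returns (finished, pending): finished maps job_id -> last response,
    pending is the set of job IDs that were still running at the end.
    """
    pending = set(job_ids)
    finished = {}
    for _ in range(max_rounds):
        for job_id in sorted(pending):  # sorted() copies, so discarding is safe
            result = get_job(job_id)
            if str(result.get("status", "")).lower() in FINISHED_STATUSES:
                finished[job_id] = result
                pending.discard(job_id)
        if not pending:
            break
        sleep(interval)
    return finished, pending
```

Jobs that finish early are dropped from the pending set, so later rounds only ask about work that is still outstanding.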
Creating character dialogue for a game
A developer needs to ensure the voice used for the 'villain' NPC has an overly dramatic tone. They first call get_speaker using the villain's ID to verify its emotional range, then use that data when calling create_tts_job.
Verifying speech options before committing
Before submitting a massive job, the user wants to know which voices are available in Mandarin. They start by running list_voices, filtering strictly for that language, preventing them from selecting an unsupported profile.
The Tradeoffs
Treating it like a single API call
Calling create_tts_job once and then expecting the audio file immediately. The job needs time to process; if you ask for the file right away, you'll just get an error.
→
You have to run three steps: 1) Use create_tts_job. 2) Wait a moment. 3) Run get_tts_job repeatedly until the status shows the job is complete. This sequence works.
Using generic voice parameters
Just picking a speaker ID and running text, hoping for the right tone. You might end up with something that sounds flat or wrong.
→
Always run get_speaker first to pull deep metadata on the desired voice profile. That ensures you know exactly which emotional styles (like 'cheerful' vs. 'professional') are supported before running create_tts_job.
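One way to apply this check in code is sketched below. It assumes the get_speaker response is a dict with a `styles` list of style names; that field name is hypothetical, since the metadata shape is not documented on this page.

```python
def supported_styles(speaker):
    """Return the speaker's styles as a lowercase set (empty if none listed)."""
    # "styles" is an assumed field name in the get_speaker metadata.
    return {str(style).lower() for style in speaker.get("styles") or []}


def choose_style(speaker, wanted, fallback=None):
    """Return wanted if the speaker supports it, else fallback, else raise."""
    styles = supported_styles(speaker)
    if wanted.lower() in styles:
        return wanted
    if fallback is not None and fallback.lower() in styles:
        return fallback
    raise ValueError(
        f"speaker does not list style {wanted!r}; available: {sorted(styles)}"
    )
```

Raising before submission surfaces an unsupported style early, instead of after a job has already been created with the wrong tone.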
Assuming job IDs persist
If you lose the Job ID after submitting a task, you can't check on it later. The whole process stalls.
→
Make sure your agent captures and stores the Job ID returned by create_tts_job. You need that specific ID to run get_tts_job successfully.
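If the Job ID needs to outlive a single conversation, one simple option is to write it to a small JSON file. This is only a sketch: the file name, the label text, and the Job ID values are made up for illustration, and an agent may well keep IDs in its own memory instead.

```python
import json
import tempfile
from pathlib import Path


def remember_job(store_path, job_id, label):
    """Record job_id -> label in a JSON file, keeping earlier entries."""
    path = Path(store_path)
    jobs = load_jobs(path)
    jobs[job_id] = label
    path.write_text(json.dumps(jobs, indent=2), encoding="utf-8")
    return jobs


def load_jobs(store_path):
    """Load saved jobs, or an empty dict if nothing was stored yet."""
    path = Path(store_path)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


# Usage with a throwaway directory and illustrative IDs.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "tts_jobs.json"
    remember_job(store, "job-123", "intro narration")
    remember_job(store, "job-456", "outro narration")
    saved = load_jobs(store)
print(saved)
```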
When It Fits, When It Doesn't
Use this server if your primary goal is converting text into high-quality, controllable audio files. The core workflow requires three steps: discovery (list_voices/get_speaker), creation (create_tts_job), and polling (get_tts_job). Don't use it if you need to edit the source text after creating a job—you have to restart the whole thing. Also, don't rely on this for complex phonetic corrections; it handles general style and tone changes, but extreme linguistic nuance might require human review.
It’s perfect when your workflow is: 'Here's my script -> Pick a voice -> Make audio.' It fails if the process needs to be non-linear or highly stateful beyond simple status checking. Keep it for the repeatable, text-to-speech pipeline.
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Lovo AI. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on every call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 4 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Available Capabilities
Turning scripts into audio used to mean jumping between three different tools.
Before having an agent handle this, you'd write your script in one app, export it to a document, copy the text into a TTS website, and then download the resulting MP3. If you needed ten versions, that was ten sets of copy-pasting and downloading.
Now, you just give your agent the text and say, 'Use this voice for this script.' The server handles everything: it pulls the right speaker details, runs `create_tts_job`, and keeps running `get_tts_job` until the final URL is ready. You get audio output without ever leaving your chat window.
You don't have to manually switch between 'list voices,' then 'create job,' and finally checking a status dashboard. All the state management—the IDs, the waiting period, the final retrieval—happens automatically through your agent using these tools.
The result is predictable, high-fidelity audio output that matches complex emotional requirements (like 'sad' or 'professional'). It’s a finished product, not just a raw data stream.
Common Questions About Lovo AI (Genny) MCP
How do I find the right voice using Lovo AI (Genny TTS & Voice Synthesis API)?
Run list_voices first. This gives you a full catalog. Once you've narrowed it down to a speaker, use get_speaker to verify its specific capabilities and tones before committing.
What happens if my TTS job fails after using create_tts_job?
The server will return an error code or status message. You'll need to review the error details provided by get_tts_job and adjust your source text or parameters.
Can I change the voice after running create_tts_job?
No, you can't edit a job in progress. You have to run create_tts_job again with the new speaker ID and updated text.
How do I make sure my agent uses American English voices?
Use list_voices and filter by language or check a specific voice's metadata using get_speaker. This confirms regional support before generating content.
What do I need to provide when using the `create_tts_job` tool?
You must supply your Lovo Genny API Key for the agent to connect. This key authenticates all requests, allowing the system to process the text-to-speech job and assign a unique Job ID.
How do I check if my audio file is ready after calling `get_tts_job`?
The response status will change from 'Processing' to 'Complete.' Once marked complete, the tool returns a direct URL. You use this link to access and download your finalized audio asset.
Are there rate limits when running multiple jobs using `create_tts_job`?
Yes, Lovo AI enforces usage caps on job submissions. If you exceed the allowed volume, your agent will receive an error code (429). You'll need to pause and wait for the quota to reset.
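A common way to "pause and wait" is exponential backoff. The sketch below assumes your client wrapper raises a `RateLimited` exception when it sees a 429; that exception class and the `submit` callable are hypothetical, and the delays are illustrative rather than tuned to Lovo AI's actual quota window.

```python
import time


class RateLimited(Exception):
    """Hypothetical error a client wrapper raises on an HTTP 429 response."""


def submit_with_backoff(submit, payload, max_attempts=5, base_delay=1.0,
                        sleep=time.sleep):
    """Call submit(payload), retrying on RateLimited with doubling delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return submit(payload)
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of retries: let the caller decide what to do
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Only the rate-limit error is retried; other failures propagate immediately so a bad payload is not resubmitted in a loop.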
What specific metadata can I retrieve using the `get_speaker` tool?
You get more than just a name. The tool returns detailed specs like supported emotional styles, language codes, and pitch ranges. This data helps your agent choose the perfect speaker for niche content.
How can I find the right voice ID for my project?
Use the list_voices tool. It returns a comprehensive list of speakers including their IDs, names, and supported styles. You can then use get_speaker with a specific ID to see more detailed information.
Can I adjust the emotion or speed of the generated voice?
Yes! When using create_tts_job, you can provide an optional speed (number) and style (e.g., 'cheerful', 'sad', 'normal') to customize the output to your needs.
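Here is a minimal sketch of assembling those arguments so the optional fields are only sent when set. The `speed` and `style` names come from the answer above; `text` and `speaker` are assumed names for the required fields, and the rule that speed must be positive is also an assumption.

```python
def build_tts_arguments(text, speaker_id, speed=None, style=None):
    """Build create_tts_job arguments, omitting optional fields left unset."""
    if not text or not text.strip():
        raise ValueError("text must not be empty")
    # "text" and "speaker" are assumed argument names.
    arguments = {"text": text, "speaker": speaker_id}
    if speed is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        is_number = isinstance(speed, (int, float)) and not isinstance(speed, bool)
        if not is_number or speed <= 0:
            raise ValueError("speed must be a positive number")
        arguments["speed"] = speed
    if style is not None:
        arguments["style"] = style  # e.g. "cheerful", "sad", "normal"
    return arguments
```

For example, `build_tts_arguments("Welcome aboard.", "spk-1", speed=1.1, style="cheerful")` yields a dict with all four keys, while leaving out `speed` and `style` yields only the two required ones.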
How do I get the final audio file once the job is submitted?
After creating a job, use the get_tts_job tool with the returned Job ID. Once the status is 'completed', the response will include the URLs to download your synthesized audio.
Use it with your favorite AI tools
Connect this server to Cursor, Claude, VS Code, and more.