Vinkius
Play.ht

Play.ht MCP for AI. Convert Text to Professional Audio Files Fast

Works with every AI agent you already use

Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client

Play.ht (AI Voice Generation & TTS) MCP runs on:

  • Cursor AI Code Editor
  • Claude Desktop App
  • OpenAI Agents SDK
  • Visual Studio Code
  • GitHub Copilot AI Agent
  • Google Gemini AI
  • Lovable AI Development
  • Mistral AI Agents
  • Amazon AWS Bedrock

How this MCP server connects to your AI agent

Play.ht MCP Server turns plain text into professional audio files using a neural voice engine. It lets you discover the available voices, including the languages they support, and then converts any block of text into speech.

You can also track long-running jobs, so your AI client knows exactly when the final MP3 or WAV file is ready to download.

What AI agents can do with Play.ht (AI Voice Generation & TTS) Automation

get_voices (List available voices)

Retrieves a structured list of every voice Play.ht offers, including unique IDs, languages, and other metadata.

convert_tts (Convert text to speech)

Turns input text into an audio file (MP3 or WAV) using a specified voice ID and quality setting.

get_tts_status (Check conversion status)

Takes the unique request ID of a long-running TTS job and reports whether the job is finished, pending, or failed.

What AI agents can do with Play.ht (AI Voice Generation & TTS) MCP Server: 3 Tools for Audio

Use these three tools to manage the entire text-to-speech process, from discovering available voices to checking job status and generating final audio.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Play.ht (AI Voice Generation & TTS) on Vinkius

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so there is no routing to configure.

Claude AI

1. Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

2. Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

3. Start a conversation

Open a new chat. The Play.ht integration is available immediately — no restart needed.
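If you script your client setup, the endpoint URL from step 2 can be assembled rather than edited by hand. The following is a minimal sketch, assuming the token is an opaque string that belongs in the single path segment shown above; `build_endpoint` is a hypothetical helper, not part of any Vinkius tooling.

```python
from urllib.parse import quote

# URL shape copied from the setup step; the token goes in the middle segment.
ENDPOINT_TEMPLATE = "https://edge.vinkius.com/{token}/mcp"
PLACEHOLDER = "[YOUR_TOKEN_HERE]"


def build_endpoint(token: str) -> str:
    """Return the connector URL for a token (hypothetical helper)."""
    token = token.strip()
    if not token or token == PLACEHOLDER:
        raise ValueError("replace the placeholder with your real token")
    # Percent-encode so a stray slash or space cannot change the URL path.
    return ENDPOINT_TEMPLATE.format(token=quote(token, safe=""))


print(build_endpoint("example-token"))
```

Rejecting the literal placeholder catches the most common copy-paste mistake before the URL ever reaches the connector dialog.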

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

  • Import from OpenAPI, Swagger, or YAML specs
  • Create Agent Skills with progressive disclosure
  • Deploy to edge with MCPFusion framework
  • Built in DLP, auth, and compliance on every call
  • Real time usage dashboard and cost metering
  • Publish to catalog or keep private
Start building

Make Your AI Do More

Start with Play.ht (AI Voice Generation & TTS), then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

  • Use this MCP plus 5,100+ others, all in one place
  • Add new capabilities to your AI anytime you want
  • Every connection is secured and compliant automatically
  • Track usage and costs across all your servers
  • Works with Claude, ChatGPT, Cursor, and more
  • New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Play.ht. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

  • Cloud Hosted: Managed infra
  • V8 Isolated: Sandboxed per request
  • Zero-Trust Proxy: No stored credentials
  • DLP Enforced: Policy on every call
  • GDPR Compliant: EU data residency
  • Token Compression: ~60% cost reduction

Your data is protected. See how we built it.

Built on the Model Context Protocol (MCP) for Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 3 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

Getting professional voiceover shouldn't require multiple manual signups and API keys. Solved with Vinkius AI Gateway.

Right now, if you need a video narrated, you often have to jump through hoops: first, checking which voices are even available on the platform; second, finding the right endpoint for text submission; and third, figuring out how long you have to wait before the file is actually ready.

With this Play.ht MCP Server, that whole sequence gets contained. Your agent handles the three steps automatically: `get_voices` validates your choice; `convert_tts` starts the job; and `get_tts_status` monitors it until you get the final asset. It's clean.

The Play.ht (AI Voice Generation & TTS) MCP Server: 3 Tools for Audio Pipelines

Before this server, running an audio job meant managing multiple state flags and dealing with inconsistent API calls—sometimes the list of voices was separate from the conversion service, creating integration gaps.

Now, you treat voice generation as a single, reliable pipeline. You call `get_voices` for inputs, use `convert_tts` for outputs, and rely on `get_tts_status` to handle all the complexity in between. It just works.

What your AI can actually do with this

Look, this Play.ht MCP Server handles turning plain text into professional audio files using their neural voice engine. It's built to let your AI client do the heavy lifting—you just point it at the server.

To get started, you first need to check what voices are available. You'll use the get_voices tool; this calls up a structured list of every single voice Play.ht has in its library. It gives you metadata for each one, including their unique IDs and what languages they support. This is how you figure out which voice fits your project.
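As a rough illustration of that selection step, the sketch below filters a voice list by language. The sample data and the field names (`id`, `language`, `gender`) are assumptions made for the example; the real `get_voices` result may be shaped differently.

```python
from typing import Dict, List

# Hypothetical sample of a get_voices result; real field names may differ.
SAMPLE_VOICES: List[Dict[str, str]] = [
    {"id": "voice-a", "language": "en-US", "gender": "female"},
    {"id": "voice-b", "language": "pt-BR", "gender": "male"},
    {"id": "voice-c", "language": "en-GB", "gender": "male"},
]


def voices_for_language(voices: List[Dict[str, str]], prefix: str) -> List[str]:
    """Return the IDs of voices whose language code starts with prefix."""
    prefix = prefix.lower()
    return [
        voice["id"]
        for voice in voices
        if voice.get("language", "").lower().startswith(prefix)
    ]


print(voices_for_language(SAMPLE_VOICES, "en"))  # ['voice-a', 'voice-c']
```

Matching on a prefix means "en" picks up both en-US and en-GB, while a full code such as "en-GB" narrows the result to one locale.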

Once you've got the right Voice ID, you can actually generate the audio. You call convert_tts, feeding it three things: the text you want spoken, that specific Voice ID, and any parameters for quality or format. The tool then starts processing, turning that written script into either an MP3 or a WAV file.

Since these aren't instantaneous jobs, they run in the background. You don't just call convert_tts and assume you got the final file; it's a multi-step process. After submitting your text for conversion, the server gives you a unique request ID. This is key because it tells your AI client what to watch for next.

If you need to know if the audio job finished or if something went wrong, you use get_tts_status. You just pass that unique request ID into this tool, and it checks the status—it'll tell you if the job is still pending, if it's done, or if it failed. This lets your agent wait for confirmation before trying to pull down the final audio file.

It’s basically a three-step sequence: first, call get_voices for options; second, run convert_tts with the text and voice ID; and third, poll get_tts_status with the returned request ID until the finished MP3 or WAV file is ready to download.
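The three steps can be sketched as a single function. This is an illustration under stated assumptions, not the server's actual contract: `call_tool` is a hypothetical callable standing in for whatever your MCP client uses to invoke a tool, and the argument names, result fields, and status strings (`pending`, `done`, `failed`) are guesses. The ID key is written as `transcription_id` only because the FAQ on this page uses that name.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> result dict.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_tts_pipeline(
    call_tool: ToolCall,
    text: str,
    language: str,
    poll_seconds: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Pick a voice, submit the text, then poll until the job settles."""
    # Step 1: discover voices and take the first one matching the language.
    voices = call_tool("get_voices", {})["voices"]
    voice = next((v for v in voices if v["language"].startswith(language)), None)
    if voice is None:
        raise LookupError(f"no voice found for language {language!r}")

    # Step 2: submit the text; the result carries the ID used for tracking.
    job = call_tool("convert_tts", {"text": text, "voice": voice["id"]})
    job_id = job["transcription_id"]

    # Step 3: poll a bounded number of times instead of looping forever.
    for _ in range(max_polls):
        status = call_tool("get_tts_status", {"transcription_id": job_id})
        if status["status"] == "done":
            return status
        if status["status"] == "failed":
            raise RuntimeError(f"TTS job {job_id} failed")
        sleep(poll_seconds)
    raise TimeoutError(f"TTS job {job_id} still pending after {max_polls} polls")


def make_fake_call_tool(polls_until_done: int = 3) -> ToolCall:
    """In-memory stand-in for the server so the sketch runs offline."""
    seen = {"polls": 0}

    def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
        if name == "get_voices":
            return {"voices": [{"id": "voice-a", "language": "en-US"}]}
        if name == "convert_tts":
            return {"transcription_id": "job-1"}
        if name == "get_tts_status":
            seen["polls"] += 1
            state = "done" if seen["polls"] >= polls_until_done else "pending"
            return {"status": state, "transcription_id": arguments["transcription_id"]}
        raise ValueError(f"unknown tool {name!r}")

    return call_tool


result = run_tts_pipeline(make_fake_call_tool(), "Hello there", "en", sleep=lambda _s: None)
print(result["status"])  # done
```

Capping the number of polls and injecting `sleep` keeps the loop testable and prevents an agent from waiting indefinitely on a job that never leaves the pending state.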

Using this setup means you don't have to manually manage API calls. Your AI client handles the whole sequence. It grabs the list of voices first, making sure it knows all the available IDs and languages before sending a single character of text for conversion. When that job is submitted via convert_tts, your agent gets back that unique tracker ID.

You can then feed that ID into get_tts_status repeatedly. This process keeps your workflow tight because you're never guessing if the file is ready; you just ask the server, and it tells you exactly where it stands.

If you need to build out video narrations, this handles everything from script text to final audio file download. Developers can integrate realistic speech synthesis directly into their own apps without having to deal with manual API scheduling or polling. Accessibility teams find it useful because they can quickly turn large documents or reports into clear, audible speech for people who rely on that format.

It's a complete pipeline: discover voices, submit text, track status, and get the file.

Built · Hosted · Managed by Vinkius Play.ht Voice Generation - TTS MCP Server
Server ID 019e5d47-afa7-7051-87b1-663bcfc37cc3
Vinkius Inspector
Compliance Grade A+
Score 100/100

Questions you might have

How do I find out what voices are available using get_voices?

Call get_voices. This tool returns a list of all Play.ht voices, giving you details like the voice ID, language code, and gender for selection.

What is the difference between convert_tts and get_tts_status?

convert_tts starts the audio generation job and returns a request ID. get_tts_status uses that exact ID to check if the conversion finished or if it's still pending.
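Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds the two requests described in this answer. The envelope follows the MCP specification, but the argument keys (`text`, `voice`, `transcription_id`) and the sample values are assumptions for illustration; your client normally constructs these messages for you.

```python
import json
from itertools import count
from typing import Any, Dict

# JSON-RPC requests need an id that is unique within the session.
_request_ids = count(1)


def tools_call(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# First request starts the job; the second checks on it by ID.
start_job = tools_call("convert_tts", {"text": "Hello there", "voice": "voice-a"})
check_job = tools_call("get_tts_status", {"transcription_id": "job-1"})

print(json.dumps(start_job, indent=2))
print(json.dumps(check_job, indent=2))
```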

Can I use convert_tts with an unknown voice?

No. You must first run get_voices to retrieve a valid, active Voice ID. Passing an incorrect ID will cause the conversion job to fail immediately.

Does Play.ht (AI Voice Generation & TTS) MCP Server support WAV files?

Yes. When using convert_tts, you can specify your desired output format, including MP3 and WAV, giving you control over the final asset type.

How do I authenticate my connection before using `convert_tts`?

You must supply your Play.ht API Key and User ID when setting up this server. Your agent uses these credentials to authorize every call, ensuring you have permission to generate audio assets.

If a conversion fails, how do I debug the issue using `get_tts_status`?

get_tts_status tracks progress, and if an error occurs, the returned status object will contain specific failure codes. Check these details to pinpoint why the job isn't completing.

What parameters can I pass to `convert_tts` for fine-tuning the audio output?

You control quality levels (Draft through High) and speaking speed directly within the function call. This lets you precisely adjust the audio profile—like making it sound more formal or conversational—for your text.

Does `convert_tts` handle massive amounts of text, or is there a limit?

For short bursts of copy, convert_tts typically finishes quickly. If you're processing large documents or high volumes, the system may queue requests. Check on their progress by passing the unique ID returned by convert_tts to get_tts_status.

How can I find the right voice ID for my language?

Use the get_voices tool. It returns a complete list of available voices, allowing you to filter by name, language, and gender to find the perfect match for your project.

Can I control the speed and format of the generated audio?

Yes! When using convert_tts, you can specify the speed (from 0.5 to 2.0), the output_format (like mp3 or wav), and the quality level to suit your needs.
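Checking those values before the call saves a wasted round trip. Here is a minimal sketch of such a check, using the speed range and formats stated above. The `speed` and `output_format` names come from this answer; the `text` and `voice` keys and the default speed of 1.0 are assumptions. The quality setting is left out because this page does not list its accepted values.

```python
from typing import Any, Dict

ALLOWED_FORMATS = ("mp3", "wav")
MIN_SPEED, MAX_SPEED = 0.5, 2.0


def build_convert_args(
    text: str,
    voice: str,
    speed: float = 1.0,  # assumed default, not confirmed by the page
    output_format: str = "mp3",
) -> Dict[str, Any]:
    """Validate inputs and return an argument dict for convert_tts."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    fmt = output_format.lower()
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"output_format must be one of {ALLOWED_FORMATS}")
    return {"text": text, "voice": voice, "speed": speed, "output_format": fmt}


print(build_convert_args("Hello there", "voice-a", speed=1.25, output_format="WAV"))
```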

What should I do if a conversion takes a long time?

For longer texts, use the get_tts_status tool with your transcription_id. This allows you to check if the audio is still processing or ready for download.

Built & Managed by Vinkius · 30s setup · 3 tools

We've already built the connector for Play.ht. Just plug in your AI agents and start using Vinkius.

No hosting. No infrastructure. No complex setup.
All 3 tools are live and waiting. You're up and running in seconds.

Vinkius gives your AI agents access to the full catalog of app connectors, all fully managed, secure, and enterprise-ready. One subscription, every tool you need.

Zero hosting required · Full MCP catalog included · Enterprise-grade security · Auto-updated by Vinkius

Built, hosted, and secured by Vinkius. You just connect and go.