Deepgram MCP for AI. Turn Speech Into Text, and Text Into Voice.
Works with every AI agent you already use, and any MCP-compatible client
How this MCP server connects to your AI agent
Deepgram takes speech and turns it into text, or takes text and makes it sound like a person talking. This MCP connects your agent to high-speed neural networks for both transcription and synthesis.
You can feed in an audio file from any URL, get accurate transcripts with speaker separation, and automatically generate professional voiceovers using the Aura engine.
What AI agents can do with Deepgram Automation
Get project usage
Checks your current API usage and consumption limits for Deepgram projects.
List API keys
Retrieves a list of all active API keys associated with your account.
List available models
Provides a directory of the high-performance AI models supported for both transcription and synthesis.
The agent takes a link to an audio file and returns a high-fidelity, formatted text transcript.
You provide raw text, and the agent converts it into natural-sounding audio files.
The agent checks your API usage to show you exactly how many minutes or requests you’ve consumed.
You ask the agent for a list of active API keys and project identifiers.
What AI agents can do with Deepgram: 6 Tools for Audio Processing
Use these tools to control every step of the audio workflow, from listing available models and checking usage limits to transcribing files and synthesizing speech.
Make your AI actually useful.
Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.
Start using Deepgram on Vinkius
Get Project Usage
Checks your current API usage and consumption limits for Deepgram projects.
List API Keys
Retrieves a list of all active API keys associated with your account.
List Available Models
Provides a directory of the high-performance AI models supported for both...
List Deepgram Projects
Lists all Deepgram projects you currently have access to.
Convert Text To Speech
Generates natural-sounding audio files from any text input (TTS).
Transcribe Audio Url
Transcribes spoken content by accepting a direct URL to an audio file.
Security and governance baked right in.
Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the entire backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so there is no routing to configure.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built in DLP, auth, and compliance on every call
- Real time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with Deepgram, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 5,100+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Deepgram. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
- Cloud Hosted: Managed infra
- V8 Isolated: Sandboxed per request
- Zero-Trust Proxy: No stored credentials
- DLP Enforced: Policy on every call
- GDPR Compliant: EU data residency
- Token Compression: ~60% cost reduction
Built on the Model Context Protocol (MCP) for Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This connection provides 6 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.
The Headache of Audio Workflows, Solved with Vinkius AI Gateway
Today, handling audio is a mess of clicks. You record a meeting, then you download the file, open it in one program to get the transcript, copy that text into another system for cleanup, and if you need voiceovers, you have to upload the script somewhere else entirely. It's tedious context-switching just to turn sound into usable data.
With this MCP, your agent manages the whole lifecycle. Give it the audio URL, and it handles the transcription using transcribe_audio_url. Need a voiceover for that transcript? One more request, and it generates the audio with convert_text_to_speech. You get clean, usable media assets without leaving your chat interface.
Deepgram MCP: Transcribe Audio and Generate Speech
You ditch manual uploads entirely. Instead of juggling multiple tabs and services just to get a transcript, you let the agent call transcribe_audio_url with a direct link to the audio file.
The difference is control. You tell your agent precisely what needs converting—a meeting recording, a short pitch, or an entire audiobook chapter. It's done instantly, reliably, and without any friction.
What your AI can actually do with this
Need to manage complex speech workflows without touching a command line? Connect your Deepgram MCP to give your agent full control over text-to-speech (TTS) and speech-to-text (STT). Your agent handles everything from taking an audio file via URL to spitting out formatted, accurate transcripts. You can even programmatically track how much you’ve used or check which API keys are active for reporting purposes.
This means your agent acts like a dedicated media production coordinator. Instead of manual uploading or complex setup in a web portal, you simply tell it what you need—transcribe this meeting recording, or make this paragraph sound like a professional voiceover. If you're building out an agency catalog, Vinkius makes sure all these powerful audio tools are accessible to any MCP-compatible client right from one place.
Here's how it actually works
The bottom line is that you use natural conversation to trigger complex, high-speed audio processing tasks.
First, connect your client to this MCP and retrieve your Deepgram API Key.
Next, tell your agent what you want: either provide an audio URL for transcription or supply text for voice synthesis.
The agent executes the task through Deepgram's engines and delivers the resulting formatted transcript or media file.
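At the protocol level, the steps above come down to the client sending a JSON-RPC 2.0 `tools/call` request, which is how MCP invokes a tool. Here is a minimal sketch of what those requests might look like. The tool names come from this page; the argument names `url` and `text` are assumptions, since the server's input schema is not shown here (a client would normally discover it through `tools/list`).

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids; any unique value works


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# The argument names "url" and "text" are assumed, not documented on this page.
transcribe_request = tool_call(
    "transcribe_audio_url", {"url": "https://example.com/meeting.wav"}
)
speak_request = tool_call(
    "convert_text_to_speech", {"text": "Welcome to the onboarding course."}
)

print(json.dumps(transcribe_request, indent=2))
```

In practice your MCP client builds and sends these messages for you; the sketch only shows what "telling your agent what you want" turns into on the wire.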
Who is this actually for?
Anyone working with spoken word data—from content producers needing voiceovers to engineers building transcription pipelines. If your job involves turning recordings into structured text or vice versa, you need this.
Automating the process of creating subtitles and voiceovers for video assets by feeding scripts into the TTS tool.
Integrating accurate speech-to-text processing directly into an application workflow, using the agent to monitor usage via get_project_usage.
Processing large batches of interview audio recordings programmatically and getting a full transcript with speaker separation details.
What Changes When You Connect
Transcribe meetings instantly. Instead of manually downloading audio files and running them through a separate service, use transcribe_audio_url to pass any URL directly to your agent for immediate transcription.
Create polished voiceovers on the fly. The convert_text_to_speech tool lets you generate professional-grade audio from simple text input, perfect for eLearning content or video narration.
Keep budgets under control. Use get_project_usage to track your minute consumption and request counts, so you can spot unexpected billing spikes across projects early.
Understand the underlying tech. list_available_models lists the Deepgram models supported for transcription and synthesis, letting you pick the best balance between accuracy and speed for your specific content.
Simplify credential management. Rather than logging into multiple consoles, use list_api_keys through your agent to quickly retrieve identifiers needed for development work.
See it in action
Indexing Interview Data
A research team needs transcripts from 50 hours of recorded interviews. They ask their agent to run transcribe_audio_url on a batch of links, receiving fully formatted text with speaker diarization for every recording.
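What "formatted text with speaker diarization" looks like depends on what the tool returns, which this page does not specify. As a rough sketch, assume the result exposes a word list in which each entry carries a `speaker` index and a `punctuated_word`. That is the shape Deepgram's own API uses when diarization is enabled; whether this MCP tool passes it through unchanged is an assumption. Consecutive words could then be grouped into speaker turns like this:

```python
from itertools import groupby


def speaker_turns(words: list[dict]) -> list[str]:
    """Group consecutive words by speaker into 'Speaker N: ...' lines."""
    lines = []
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        # Prefer the punctuated form when present, else the raw word.
        text = " ".join(w.get("punctuated_word") or w["word"] for w in group)
        lines.append(f"Speaker {speaker}: {text}")
    return lines


# Hypothetical sample in the assumed shape, not real tool output.
sample = [
    {"word": "hello", "punctuated_word": "Hello.", "speaker": 0},
    {"word": "hi", "punctuated_word": "Hi,", "speaker": 1},
    {"word": "there", "punctuated_word": "there.", "speaker": 1},
    {"word": "welcome", "punctuated_word": "Welcome.", "speaker": 0},
]

for line in speaker_turns(sample):
    print(line)
```

`groupby` only merges adjacent words, so a speaker who talks twice gets two separate turns, which is what a readable interview transcript needs.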
Building an Onboarding System
An engineer needs an automated voice component. They use convert_text_to_speech to generate instructional audio files from technical manuals, which are then served directly within the application UI.
Auditing API Costs
A developer wants to know if their staging environment is overspending. They prompt the agent with get_project_usage, getting an instant report on minute consumption and current limits without leaving their development console.
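To make the audit concrete, here is a small sketch of the check an agent (or a script reading its output) could run on the usage data. The response shape is hypothetical: it assumes get_project_usage yields records with `hours` and `requests` fields, and the minute budget is a number you supply yourself rather than something the tool is known to return.

```python
def usage_report(records: list[dict], budget_minutes: float) -> dict:
    """Total the usage records and compare them with a minute budget."""
    minutes = sum(r["hours"] for r in records) * 60
    requests = sum(r["requests"] for r in records)
    return {
        "minutes_used": round(minutes, 1),
        "requests": requests,
        "percent_of_budget": round(100 * minutes / budget_minutes, 1),
        "over_budget": minutes > budget_minutes,
    }


# Hypothetical records; the field names are assumed for illustration.
staging = [
    {"hours": 1.5, "requests": 40},
    {"hours": 0.75, "requests": 22},
]

print(usage_report(staging, budget_minutes=120))
```

With these sample numbers the staging environment has used 135 minutes against a 120-minute budget, so the report flags it as over budget.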
Creating Dynamic Video Assets
A content team needs 20 different voiceovers for a global campaign. They use convert_text_to_speech to generate all the audio files programmatically, ensuring consistent tone and high quality across languages.
The honest tradeoffs
Manual file uploads
The user has to download a large meeting recording (e.g., a .wav or .mp4) from Google Drive, then manually upload it into the Deepgram web portal and wait for processing.
Instead, just pass the direct link to your agent using transcribe_audio_url. Your agent handles the entire process automatically.
Assuming model compatibility
The developer tries to use a new or experimental AI feature without knowing if the underlying API supports it, leading to errors and wasted time.
Always check list_available_models first. This ensures your agent only attempts tasks with confirmed, high-performance models.
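One way to picture the "check first" habit is a guard that refuses to proceed with a model the catalog does not list. The catalog shape below (an `stt` and a `tts` list of entries with a `name` field) and the sample model names are assumptions for illustration; the actual output of list_available_models may be structured differently.

```python
def pick_model(catalog: dict, kind: str, wanted: str) -> str:
    """Return `wanted` if the catalog lists it under `kind`, else raise."""
    names = {m["name"] for m in catalog.get(kind, [])}
    if wanted not in names:
        raise ValueError(
            f"{wanted!r} not listed for {kind!r}; available: {sorted(names)}"
        )
    return wanted


# Hypothetical catalog; real model names and fields may differ.
catalog = {
    "stt": [{"name": "nova-3"}, {"name": "nova-2"}],
    "tts": [{"name": "aura-asteria-en"}],
}

model = pick_model(catalog, "stt", "nova-3")
print(model)
```

Failing early with a list of what is available is cheaper than sending a request that the API rejects.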
When It Fits, When It Doesn't
Use this MCP when your core problem involves converting spoken word into structured text or vice versa. If you are dealing with audio assets (recordings, podcasts, video clips) and need to extract the content, transcribe it, or generate a voiceover based on that content, this is your tool. Don't use it if you only need simple data lookup, like retrieving account names or listing databases; for those needs, look at general database connection tools. If your task requires complex audio manipulation beyond STT/TTS (like deep video editing or visual effects), this MCP won't help—you'll need dedicated media processing software instead.
Questions you might have
How do I transcribe an audio file using the Deepgram MCP?
You use the transcribe_audio_url tool by providing a direct link to your audio or video file. Your agent then sends that URL to Deepgram's engines for high-fidelity text extraction.
Can I generate voiceovers from custom scripts with Deepgram MCP?
Yes, you use the convert_text_to_speech tool by giving it the script. The agent returns a media file that you can use anywhere in your application.
What if my audio is too long for one request with Deepgram MCP?
The system handles large files efficiently. You simply pass the URL, and the underlying Nova-3 models process it to deliver a complete transcript.
Where do I check my usage when using Deepgram MCP?
You call get_project_usage. This tool reports your minute consumption and API limits, so you can catch overages before they show up on a bill.
We've already built the connector for Deepgram. Just plug in your AI agents and start using Vinkius.
No hosting. No infrastructure. No complex setup.
All 6 tools are live and waiting.
You're up and running in seconds.
Vinkius gives your AI agents access to the full catalog of app connectors, all fully managed, secure, and enterprise-ready. One subscription, every tool you need.
Built, hosted, and secured by Vinkius. You just connect and go.