Deepgram MCP for AI. Convert Audio to Text and Vice Versa.

Works with every AI agent you already use

Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client

Connect to your AI in seconds.

Deepgram provides high-speed audio processing for your AI client. It handles speech-to-text transcription from URLs, generating accurate transcripts with speaker diarization.

You can also convert text back into natural-sounding audio using the Aura engine. This MCP lets you list models, check project usage, and review API keys, all through conversation.

What your AI can do

Transcribe Audio from a Link

Feed an audio or video URL into the MCP and receive a structured text transcript.

Generate Speech from Text

Pass plain text to the MCP, which returns a high-quality media file of spoken audio.

Check Usage Limits

Ask the MCP for current API usage, including minutes consumed and request counts, across your projects.

Manage Access Credentials

Retrieve active API key identifiers or list available Deepgram projects.

Deepgram: 6 Tools for Audio Processing

These tools let you manage projects, check usage limits, list models, and execute both transcription and speech synthesis tasks via your agent.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Get Project Usage

Checks the current API usage, including minute consumption and request counts for your Deepgram project.

List API Keys

Retrieves all currently active identifiers associated with your Deepgram projects.

List Available Models

Lists the names and details of high-performance STT and TTS models you can use for a job.

List Deepgram Projects

Retrieves a list of all Deepgram projects linked to your account.

Convert Text To Speech

Generates a natural-sounding audio file when you provide it with plain text.

Transcribe Audio URL

Converts speech from an audio or video file provided via URL into structured text.

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so there is no routing to configure yourself.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.
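If you script your setup, a small helper can build the endpoint URL and catch a forgotten placeholder. This is a minimal sketch: the helper name and the VINKIUS_TOKEN environment variable are assumptions made for illustration, and only the URL pattern comes from the step above.

```python
import os


def vinkius_endpoint(token: str) -> str:
    """Build the MCP endpoint URL for a Vinkius token."""
    token = token.strip()
    # Catch the most common mistake: pasting the URL without swapping the placeholder.
    if not token or token == "[YOUR_TOKEN_HERE]":
        raise ValueError("replace [YOUR_TOKEN_HERE] with your real token")
    return f"https://edge.vinkius.com/{token}/mcp"


if __name__ == "__main__":
    # VINKIUS_TOKEN is an assumed variable name; "demo-token" is a stand-in value.
    print(vinkius_endpoint(os.environ.get("VINKIUS_TOKEN", "demo-token")))
```

Keeping the token in an environment variable rather than in a config file makes it less likely to end up in version control.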

Start a conversation

Open a new chat. The Deepgram integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

{
  "mcp_servers": {
    "deepgram-alternative": {
      "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
    }
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Deepgram tools with full Vinkius guardrails applied.

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp.json file, inside the top-level servers object:

{
  "servers": {
    "deepgram-alternative": { "type": "http", "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp" }
  }
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

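Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint:

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius over Streamable HTTP
client = MultiServerMCPClient({
    "deepgram": {"url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp", "transport": "streamable_http"}
})
# get_tools() is a coroutine, so run it in an event loop
tools = asyncio.run(client.get_tools())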
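Once the tools are loaded, it helps to index them by name so you can hand an agent only the ones it needs. The sketch below relies on LangChain tool objects exposing a name attribute; it assumes the loaded names match the tool names listed on this page, and it uses stand-in objects so it runs without a connection.

```python
from types import SimpleNamespace


def tools_by_name(tools):
    """Index tool objects by their .name attribute."""
    return {tool.name: tool for tool in tools}


# Stand-in objects so the sketch runs offline; real tools come from the client.
demo_tools = [
    SimpleNamespace(name="transcribe_audio_url"),
    SimpleNamespace(name="get_project_usage"),
]
index = tools_by_name(demo_tools)
print(sorted(index))
```

With a real connection you would pass the loaded tools list instead of demo_tools.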

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

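from crewai import Agent
from crewai_tools import MCPServerAdapter  # pip install 'crewai-tools[mcp]'

# Connect securely to Vinkius over Streamable HTTP
server_params = {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable-http",
}

# The adapter keeps the connection open for the duration of the with block
with MCPServerAdapter(server_params) as vinkius_tools:
    # Assign to Agent (goal and backstory are required; these are placeholder values)
    researcher = Agent(
        role='Data Researcher',
        goal='Transcribe and summarize interview recordings',
        backstory='An analyst who works with audio archives.',
        tools=vinkius_tools
    )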

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Make Your AI Do More

Start with Deepgram, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Deepgram. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 6 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

Handling Voice Data Manually Is a Time Sink.

Right now, processing recorded conversations means exporting the file, uploading it to a separate transcription service, waiting for the job to finish, then downloading the resulting text file. After that, you still have to copy the text into your application and maybe run another script just to clean up timestamps.

With this MCP, your agent handles the whole chain. You give the URL, and the system automatically transcribes it with speaker diarization. The result hits your workflow as ready-to-use, structured text.
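To make "structured text with speaker diarization" concrete, here is a rough sketch of turning diarized utterances into a readable transcript. The input shape (a list of records with speaker and text fields) is an assumption for illustration; the actual structure returned by the tool may differ.

```python
from itertools import groupby


def format_diarized(utterances):
    """Merge consecutive utterances by the same speaker into one line each."""
    lines = []
    for speaker, group in groupby(utterances, key=lambda u: u["speaker"]):
        text = " ".join(u["text"].strip() for u in group)
        lines.append(f"Speaker {speaker}: {text}")
    return "\n".join(lines)


# Hypothetical sample data, not real tool output.
sample = [
    {"speaker": 0, "text": "Hi."},
    {"speaker": 0, "text": "Thanks for joining."},
    {"speaker": 1, "text": "Glad to be here."},
]
print(format_diarized(sample))
```

Grouping only consecutive utterances keeps the turn order intact, so a speaker who talks twice with someone else in between gets two separate lines.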

Generate Speech on Demand With Deepgram

Before this MCP, generating a voiceover meant writing the script, then exporting it to an expensive third-party TTS platform, paying per character, and downloading a ZIP of audio files. If you needed multiple versions, you repeated the whole cycle.

Now, your agent handles the synthesis job entirely. You pass the text, get the high-quality audio file back, and repeat that process instantly—no logins, no manual exports.

Support (24/7): support@vinkius.com

What your AI can actually do with this

Your agent needs to read audio files or generate voiceovers? Deepgram handles both sides of speech processing—transcribing audio into usable text and turning pure text back into natural-sounding speech. Forget manual uploads or juggling multiple services. Your AI client calls this MCP, and it manages the whole workflow for you.

You can take public video links and get a clean transcript back, complete with who spoke when (diarization). If you need voiceovers for videos, just send the text, and we generate the audio file. Need to know if your usage is spiking? Check the limits instantly. All this functionality lives in Vinkius, allowing your AI agent to access everything from model selection to project key retrieval using simple natural language commands.

Built, hosted, and managed by Vinkius: Deepgram MCP - Speech Transcription & Audio Generation

Server ID: 019dd0de-137a-738f-bddf-196625557e29

Vinkius Inspector: Compliance Grade A+, Score 100/100

Who is this actually for?

Engineers building agentic workflows, content teams needing scalable video assets, or research staff processing large amounts of interview data. If you deal with voice and text conversion regularly, this is for you.

DevOps Engineer

Needs to set up automated pipelines that pull speech data from external sources, check usage via get_project_usage, and audit which keys are active with list_api_keys before rotating them.

Content Creator

Needs to automate subtitles and voiceovers for global video campaigns by transcribing raw footage from URLs and generating audio from script text.

Research Scientist

Processes hours of interview recordings, using transcribe_audio_url to get detailed transcripts, list_deepgram_projects to find the right project, and get_project_usage to keep an eye on consumption.

What Changes When You Connect

Transcribe complex audio: Use transcribe_audio_url to process recordings from public URLs, getting full transcripts with speaker separation (diarization).

Automate voice generation: Convert text into speech using convert_text_to_speech, eliminating the need for manual recording or studio time.

Control your credentials: Quickly list active keys with list_api_keys and check project status by running list_deepgram_projects through simple queries.

Stay within budget: Use get_project_usage to monitor API limits before a large batch of transcriptions, stopping you from hitting unexpected rate caps.

Select the right model: Before processing anything, run list_available_models to ensure your task uses the optimal high-performance AI engine.
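Under the hood, each of these tool names travels in a Model Context Protocol tools/call request. Your AI client builds this for you, but seeing the shape can help when debugging. A minimal sketch follows; the argument name url is an assumption, since this page does not document the tools' parameter names.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name, **arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "url" is a hypothetical parameter name used only for illustration.
request = tool_call("transcribe_audio_url", url="https://example.com/interview.mp3")
print(json.dumps(request, indent=2))
```

Tools that take no input, such as a plain listing call, would simply carry an empty arguments object.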

See it in action

01

Processing a massive archive of interviews

The research team has 50 video files of recorded interviews, each hosted at its own URL. Instead of uploading them by hand, they give their agent the list of links and ask it to run transcribe_audio_url on each one. The MCP processes all 50 links and returns structured text for every session.
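If you drive a batch like this from your own code rather than through chat, it is worth collecting failures instead of stopping at the first bad link. The sketch below assumes a call_tool function that you supply (for example, a thin wrapper around your MCP client) and a hypothetical url argument name.

```python
def transcribe_all(urls, call_tool):
    """Run transcribe_audio_url for each link, keeping successes and failures apart."""
    results, failures = {}, {}
    for url in urls:
        try:
            # The {"url": ...} argument shape is an assumption for illustration.
            results[url] = call_tool("transcribe_audio_url", {"url": url})
        except Exception as exc:
            # Record the error and keep going so one broken link does not sink the batch.
            failures[url] = str(exc)
    return results, failures


def fake_call_tool(name, arguments):
    """Offline stand-in for a real MCP client call."""
    if "missing" in arguments["url"]:
        raise RuntimeError("could not fetch audio")
    return {"transcript": "placeholder text"}


done, failed = transcribe_all(
    ["https://example.com/a.mp4", "https://example.com/missing.mp4"],
    fake_call_tool,
)
print(len(done), "succeeded;", len(failed), "failed")
```

Passing the client in as a parameter keeps the loop easy to test and lets you swap in retries or rate limiting later.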

02

Creating an automated tutorial video

The content team writes a script for a new product feature. They pass the final text to convert_text_to_speech to generate the voiceover audio, then use that audio file as the narration track when they assemble the video.

03

Debugging an API workflow

The engineer notices that a transcription job has failed and suspects a credentials problem. They run list_api_keys to verify which keys are active and get_project_usage to rule out an exhausted quota before restarting the process.

04

Building a voice chatbot backend

The developer needs low-latency speech input for an agent. They first use list_available_models to find a low-latency STT model, then specify that model when transcribing audio from a URL with transcribe_audio_url.

The honest tradeoffs

Treating Deepgram like a simple file uploader

Anti-pattern

A user tries to upload an MP4 directly into their agent, assuming the MCP will handle it just because they see 'audio' in the name.

The Fix

The transcribe_audio_url tool requires a public URL pointing to the audio/video file. You must provide that link; you can't pass a local file path directly.
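A quick pre-flight check can catch local paths before they reach the tool. This sketch only verifies that the input looks like an http(s) URL; it cannot tell whether the link is actually reachable from the public internet. The function name is made up for this example.

```python
from urllib.parse import urlparse


def is_remote_url(value: str) -> bool:
    """Return True when value is an http(s) URL with a host, False for local paths."""
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


for candidate in ("https://example.com/talk.mp4", "/home/me/talk.mp4", "file:///tmp/talk.mp3"):
    print(candidate, "->", is_remote_url(candidate))
```

If the check fails, upload the file somewhere that serves it over HTTPS and pass that link instead.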

Ignoring account limits

Anti-pattern

Running dozens of large transcriptions without checking resource constraints, leading to sudden API failures.

The Fix

Always check the current status and remaining capacity first. Use get_project_usage before initiating any high-volume transcription jobs.
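One way to act on that check in code is to compare minutes already used against a budget you set yourself. This is a sketch under assumptions: the minutes_used field name is hypothetical, and the budget is your own number rather than something the tool is documented to return.

```python
def fits_budget(usage, planned_minutes, budget_minutes):
    """Return True if the planned batch keeps total minutes within the budget."""
    # "minutes_used" is an assumed field name; adapt it to the real usage report.
    used = float(usage.get("minutes_used", 0))
    return used + planned_minutes <= budget_minutes


usage_report = {"minutes_used": 900}  # illustrative numbers only
print(fits_budget(usage_report, planned_minutes=50, budget_minutes=1000))
print(fits_budget(usage_report, planned_minutes=150, budget_minutes=1000))
```

Running the guard before the batch, rather than after the first failure, means you can split or postpone the job instead of ending up with a half-finished archive.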

Assuming one model works for all tasks

Anti-pattern

Running a transcript job using default settings when the audio is heavily accented or noisy, resulting in poor accuracy.

The Fix

First, call list_available_models to see which specialized models are available. Select the highest-accuracy option for your specific content type.
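Choosing a model can be as simple as filtering the listing by task and language. The record fields below (name, type, languages) and the catalog entries are invented for illustration; the real listing may be shaped differently, so treat this as a sketch of the idea rather than the tool's schema.

```python
def pick_models(models, kind, language=None):
    """Return names of models matching a task type and, optionally, a language."""
    matches = [m for m in models if m.get("type") == kind]
    if language is not None:
        matches = [m for m in matches if language in m.get("languages", [])]
    return [m["name"] for m in matches]


# Made-up catalog entries; only the model names appear elsewhere on this page.
catalog = [
    {"name": "nova-3", "type": "stt", "languages": ["en"]},
    {"name": "aura-asteria-en", "type": "tts", "languages": ["en"]},
]
print(pick_models(catalog, "stt", language="en"))
```

In practice you would also read each model's description, since accent or noise handling is rarely captured in a single field.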

When It Fits, When It Doesn't

Use this MCP if your primary job involves converting between spoken word and written text—either transcribing audio from links or generating voiceovers from scripts. If you only need to write a script, use a basic text generation tool; don't worry about the transcription tools. If you have transcribed text but need to validate its structure against a schema, use a Pydantic-style validation MCP instead of using transcribe_audio_url. Remember: this is for audio and speech only.

Questions you might have

How do I use `transcribe_audio_url`?

You provide a public URL pointing to the audio or video. The MCP then fetches that content and converts the speech into formatted text, giving you diarization details.

What is the difference between `list_available_models` and using them?

list_available_models just shows what models exist. You run a conversion job (like transcription) and specify which model name you want to use for that specific task.

Does `convert_text_to_speech` require me to upload files?

No, it just needs the text. You pass the plain string of characters directly to your agent, and the MCP handles generating the audio media file for you.

How do I check my API quotas using `get_project_usage`?

You simply ask your agent to run get_project_usage. It returns a simple breakdown of how many minutes and requests you've already used in the current cycle.

How do I use `list_api_keys` to check my active Deepgram credentials?

It retrieves a list of all current API keys tied to your account. This is essential for security, letting you verify which identifiers are authorized and ensuring you don't accidentally run jobs using deprecated or inactive keys.

What information does `list_deepgram_projects` provide?

This function lists every project associated with your Deepgram account. You need this list to correctly reference a specific project ID when running complex operations, such as checking usage or transcribing audio for that defined scope.

When listing models using `list_available_models`, what criteria should I use?

The tool returns model names and capabilities. You must check the descriptions to select a model optimized for your specific content—for instance, picking one that handles speaker diarization or particular accents will maximize accuracy.

Can `convert_text_to_speech` handle generating multiple audio files from different inputs?

Yes. You provide the text and specify output parameters like voice type and format. By looping this call through your agent, you can efficiently generate large batches of synthetic speech assets for various use cases.
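Before looping, it can help to lay out the whole batch as a plan: one entry per script and voice. In this sketch the field names (text, voice, outfile) and the file naming scheme are assumptions, and the voice names are just examples of Aura model identifiers.

```python
def plan_voiceovers(scripts, voices, ext="mp3"):
    """List one synthesis job per (script, voice) pair with a predictable filename."""
    jobs = []
    for slug, text in scripts.items():
        for voice in voices:
            jobs.append({"text": text, "voice": voice, "outfile": f"{slug}.{voice}.{ext}"})
    return jobs


# Hypothetical scripts keyed by a short slug used in the output filename.
scripts = {"intro": "Welcome to the tour.", "outro": "Thanks for watching."}
for job in plan_voiceovers(scripts, ["aura-asteria-en", "aura-orion-en"]):
    print(job["outfile"])
```

Each planned job can then be handed to your agent or client one at a time, which keeps retries and progress tracking straightforward.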

How do I get a Deepgram API Key?

Log in to the Deepgram Console, navigate to the API Keys section, and create a new key with the necessary permissions.

What is the Nova-3 model?

Nova-3 is one of Deepgram's speech-to-text models, built for fast, accurate transcription of real-world audio. Run list_available_models to confirm it is available to your project.

Can I synthesize speech in different voices?

Yes! The convert_text_to_speech tool allows you to specify models like aura-asteria-en or aura-orion-en for different vocal profiles.
