# Speechmatics MCP

> Speechmatics brings high-accuracy audio processing to your agent. Transcribe large batches of audio files, such as podcasts or meeting recordings, into structured text. You can also convert written scripts into natural-sounding speech using voices such as Sarah, Theo, and Megan. It covers batch transcription, text-to-speech, and job management, giving you control over your audio pipelines.

## Overview
- **Category:** productivity
- **Price:** Free
- **Tags:** speech-to-text, transcription, text-to-speech, audio-processing, natural-language-processing, voice-synthesis

## Description

Dealing with raw audio is a headache for any workflow. Before this MCP, turning hours of recorded conversation or video content into usable text required specialized software and tedious manual exports. Now your agent connects directly to Speechmatics through Vinkius, so you can handle audio processing as part of a natural conversation. Give it an audio file, either as a URL or as base64-encoded data, and it starts a batch transcription job. Need voiceovers for training videos? Give it the text and tell it which voice to use. The agent tracks the job in the background until the transcript is ready to retrieve as JSON, SRT, or plain text.
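As a rough illustration of the two input paths (URL or base64), the sketch below builds an argument payload for `create_job`. The key names `audio_url`, `audio_base64`, and `language` are assumptions made for this example, not the tool's documented schema, so check the tool definition your agent exposes before relying on them.

```python
import base64
from typing import Any, Dict, Union


def build_create_job_args(source: Union[str, bytes], language: str = "en") -> Dict[str, Any]:
    """Build a hypothetical argument payload for the create_job tool.

    NOTE: the keys "audio_url", "audio_base64", and "language" are
    illustrative assumptions, not the tool's confirmed parameter names.
    """
    if isinstance(source, bytes):
        # Raw audio bytes are base64-encoded so they can travel as JSON text.
        encoded = base64.b64encode(source).decode("ascii")
        return {"audio_base64": encoded, "language": language}
    if source.startswith(("http://", "https://")):
        # A remote file is passed by reference instead of being inlined.
        return {"audio_url": source, "language": language}
    raise ValueError("source must be an http(s) URL or raw audio bytes")
```

Passing a URL keeps the request small, while base64 suits files that are not publicly reachable; base64 encoding increases payload size by about a third.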

## Tools

### create_job
Starts a new process to transcribe an audio file from a provided source.

### create_temp_key
Generates secure, temporary API keys for client-side access management.

### delete_job
Removes a transcription job from the system if it was started accidentally or is no longer needed.

### generate_tts
Converts specified text into an audio file using high-quality, natural speech voices.

### get_job
Retrieves the current status and specific details for a single transcription job ID.

### get_transcript
Pulls the final, completed text or subtitle file associated with a finished job.

### get_usage
Checks your current billing consumption and usage statistics for the service.

### list_jobs
Shows a list of all recent transcription jobs you have submitted to the system.

## Prompt Examples

**Prompt:** 
```
Transcribe the audio file at this URL: https://example.com/audio.mp3
```

**Response:** 
```
I've started a new transcription job (ID: j123abc). I'll use the default English configuration. You can check its status using `get_job`.
```

**Prompt:** 
```
Generate an audio file of Sarah saying 'Welcome to the future of speech technology'.
```

**Response:** 
```
Generating speech with Sarah's voice... Done! You can now access the synthesized audio for the text provided.
```

**Prompt:** 
```
List my 5 most recent transcription jobs.
```

**Response:** 
```
Fetching your recent jobs... I found 5 jobs. The most recent one is 'Meeting_Notes.mp3' (ID: j987xyz) which is currently 'completed'.
```

## Capabilities

### Transcribe audio files
Submit large audio recordings and receive highly accurate written transcripts.

### Generate synthetic speech
Turn plain text into high-quality, natural-sounding voice audio using multiple character voices.

### Manage transcription jobs
Keep track of every processing task, listing recent activity and checking the status of ongoing jobs.

### Retrieve completed transcripts
Pull finished transcriptions as JSON, SRT subtitles, or plain text for immediate use.

## Use Cases

### Indexing internal knowledge bases from calls
A Customer Success Manager has a pile of recorded support calls. They ask their agent to use create_job on all the audio files. The system transcribes everything, and then they pull the clean text using get_transcript, immediately feeding it into an indexed search database.

### Creating voiceovers for training materials
An e-learning developer needs to update voiceovers for a new module. They provide the script and tell their agent to use generate_tts with Megan's voice, and receive an audio file ready to drop into the module.

### Automating video subtitling
A content creator finishes recording a podcast episode. Instead of manually transcribing it, they ask their agent to use create_job on the MP3 URL and then pull the output using get_transcript in SRT format for immediate upload.
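If you want to post-process the subtitles before uploading them, SRT is simple to parse: each cue is an index line, a timing line of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`, and one or more text lines, with blank lines between cues. The sketch below is a minimal parser under that assumption; `Cue` and `parse_srt` are hypothetical helper names, and the exact SRT text returned by get_transcript may differ in detail.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    index: int
    start: str
    end: str
    text: str


# Timing line, e.g. "00:00:02,500 --> 00:00:05,000"
_TIMING = re.compile(r"(\d{2,}:\d{2}:\d{2},\d{3}) --> (\d{2,}:\d{2}:\d{2},\d{3})")


def parse_srt(srt: str) -> List[Cue]:
    """Split SRT text into cues, skipping blocks that are malformed."""
    cues: List[Cue] = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or not lines[0].strip().isdigit():
            continue
        match = _TIMING.match(lines[1].strip())
        if match is None:
            continue
        text = "\n".join(lines[2:])
        cues.append(Cue(int(lines[0].strip()), match.group(1), match.group(2), text))
    return cues
```

A parsed cue list makes it easy to, for example, drop filler lines or check that cue timings do not overlap before publishing.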

### Auditing usage costs
A team lead wants to know how much audio processing has occurred this month. They ask their agent to run get_usage, getting an instant report on account consumption without having to check a separate dashboard.

## Benefits

- Batch processing large files is simple. Use create_job to submit multiple hours of audio at once and handle the entire workload without complex scripting.
- You get professional voice quality for free. The generate_tts tool lets you turn any script into natural speech using voices like Sarah or Theo, perfect for e-learning modules.
- Monitoring is built in. list_jobs and get_job let your agent check the status of every job, so a failed or stalled job does not go unnoticed.
- Output flexibility means less cleanup time. When you pull results with get_transcript, you can choose JSON, SRT subtitles, or plain text.
- It’s secure and auditable. The create_temp_key tool lets your team manage access credentials without exposing permanent API keys.

## How It Works

You tell your agent what audio needs processing, and it handles the entire lifecycle from submission to retrieval.

1. First, your agent submits the audio file (via URL or base64) with create_job to start a new job.
2. Next, it monitors the job with list_jobs and get_job until the transcription is marked as complete.
3. Finally, it retrieves the finished text or subtitles with get_transcript so you can integrate them into your workflow.
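The three steps above can be sketched as a submit, poll, retrieve loop. This is a minimal sketch rather than the server's actual interface: `call_tool` stands in for whatever your MCP client uses to invoke a tool, and the argument keys (`audio_url`, `job_id`, `format`), the response fields (`id`, `status`, `content`), and the failure statuses are all assumptions. Only the tool names and the `completed` status come from this page. A fake caller is included so the sketch runs without network access.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: (tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def transcribe(call_tool: ToolCaller, audio_url: str, fmt: str = "srt",
               poll_seconds: float = 5.0, max_polls: int = 120) -> str:
    """Submit a job, poll until it completes, then fetch the transcript."""
    # Step 1: submit the audio (argument and field names are assumed).
    job_id = call_tool("create_job", {"audio_url": audio_url})["id"]
    for _ in range(max_polls):
        # Step 2: check the job status.
        status = call_tool("get_job", {"job_id": job_id})["status"]
        if status == "completed":
            # Step 3: retrieve the transcript in the requested format.
            result = call_tool("get_transcript", {"job_id": job_id, "format": fmt})
            return result["content"]
        if status in ("rejected", "failed"):  # assumed failure statuses
            raise RuntimeError(f"job {job_id} ended with status {status!r}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not complete after {max_polls} polls")


def make_fake_caller(polls_until_done: int = 2) -> ToolCaller:
    """In-memory stand-in for a real MCP client, for local experiments."""
    state = {"polls": 0}

    def call_tool(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
        if name == "create_job":
            return {"id": "j123abc"}
        if name == "get_job":
            state["polls"] += 1
            done = state["polls"] >= polls_until_done
            return {"id": args["job_id"], "status": "completed" if done else "running"}
        if name == "get_transcript":
            return {"content": "1\n00:00:00,000 --> 00:00:02,000\nHello\n"}
        raise ValueError(f"unknown tool: {name}")

    return call_tool
```

Capping the number of polls keeps a stalled job from blocking the workflow indefinitely; in a conversational setting the agent performs this loop for you.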

## Frequently Asked Questions

**How do I transcribe a large podcast episode with Speechmatics MCP?**
You start by using create_job, providing the audio URL or base64. Your agent monitors its status until it's complete, then you use get_transcript to pull the final text.

**Can I generate subtitles with Speechmatics MCP?**
Yes. Once a job started with create_job has finished, retrieve the transcript with get_transcript and request SRT format to get a subtitle file.

**Is there a way to track my spending on Speechmatics MCP?**
Absolutely. You use the get_usage tool anytime to check your account consumption statistics without leaving your current workflow.

**What is the difference between list_jobs and get_job in Speechmatics MCP?**
list_jobs shows a summary of all recent jobs you've run. Use get_job when you know the specific ID of one job and need detailed status updates on it.

**Do I need to manage API keys for Speechmatics MCP?**
Yes, but it’s easy. You can use create_temp_key to generate temporary credentials, keeping your main key secure while allowing controlled access for testing or specific integrations.