# Speechnotes MCP

> Speechnotes connects your AI agent to an industry-standard transcription engine. It handles everything from initiating a job using a remote URL to fetching usage statistics and exporting polished files in TXT, DOCX, or SRT formats. Manage the full lifecycle of professional audio transcription: monitor status, list history, and look up supported languages and models, all through natural conversation with your AI client.

## Overview
- **Category:** industry-titans
- **Price:** Free
- **Tags:** transcription, speech-to-text, audio-processing, text-export, ai-transcription

## Description

You know how much time dealing with audio files can eat up? This server plugs Speechnotes right into your AI agent so you can handle high-accuracy transcription jobs without switching apps or writing a single API call. It manages the whole process, from kicking off the job using just an audio URL all the way through pulling out clean, exportable text.

When you use this, your agent handles everything. You send an audio file's URL, and it starts a new transcription job immediately. If you need to see how far along the job is, you check the current processing state using its unique job ID. When the job's done, you can pull out the final text result in the format you want: plain TXT, polished DOCX, or timed SRT.

Need to keep track of what you sent? You retrieve metadata and records for every past transcription job in one shot. For managing your account, you check your current API balance instantly using `get_remaining_credits` and pull detailed usage logs with `get_usage_statistics`. Before you start anything big, the agent can also list all supported language codes so you know what languages the engine handles. You'll even see a rundown of different AI models available for running transcriptions.

For advanced setup, your agent can list every delivery endpoint you've configured to receive webhook notifications, and it can generate a cryptographic signature for any outgoing webhook payload. If you need to clean up history or manage job records, you can delete specific transcription jobs using their IDs. And if you want to know whether the connection's solid, running `test_speechnotes_auth` gives you a quick verification check.

So basically, your agent starts the job from an audio URL; it lets you track status and review history; it exports the final text in TXT, DOCX, or SRT formats; and it handles the account checks (credits and usage) along with model and language lookups. It's got the whole lifecycle covered.

## Tools

### remove_transcription_job
Deletes a specific transcription job record from your account history.

### get_remaining_credits
Checks your current API account balance and available transcription credits.

### get_transcription_export
Downloads the finished transcript text in a specified format (TXT, DOCX, SRT).

### list_transcription_models
Lists the different AI models available for running transcriptions.

### generate_webhook_signature
Creates a cryptographic signature for an outgoing webhook payload.

### get_transcription_status
Checks the current processing state of a transcription job using its unique ID.

### get_usage_statistics
Retrieves detailed logs and metrics about how your account has been used over time.

### list_transcription_history
Retrieves metadata and records for all past or completed transcription jobs.

### list_supported_languages
Fetches a list of language codes supported by the transcription engine.

### list_configured_webhooks
Lists all the delivery endpoints you have set up to receive webhook notifications.

### test_speechnotes_auth
Runs a quick check to verify that the connection and API credentials are working correctly.

### transcribe_audio_url
Sends an audio file URL to start a new, high-accuracy transcription job.

## Prompt Examples

**Prompt:** 
```
Transcribe the audio file at this URL: 'https://example.com/interview.mp3'.
```

**Response:** 
```
Transcription job started! I've triggered the process for your audio file. You can check the status using the provided Job ID: SN-789.
```

**Prompt:** 
```
Transcribe the latest team meeting recording and generate a summary with action items.
```

**Response:** 
```
Transcription complete. Meeting: "Engineering Weekly" (47 minutes, 6 participants). Accuracy: 96.2%. Summary: Sprint 13 review (8 stories completed, 2 carried over), API performance discussion (latency reduced 30%), new hire onboarding plan, Q3 roadmap preview. Action items: 1) Sarah: fix authentication bug by Friday. 2) James: prepare load testing plan by Monday. 3) Alex: schedule architecture review for database migration. 4) Lisa: update API documentation for v3 endpoints. 5) David: onboard new engineer (start date May 19). Word count: 5,670. Speakers identified: 6.
```

**Prompt:** 
```
Show me all transcriptions from the past week with their word counts and language detection.
```

**Response:** 
```
12 transcriptions last week. Total audio processed: 8.4 hours. Languages: English (9), Spanish (2), Portuguese (1). Longest: "Board Meeting" (2h 15m, 14,500 words). Shortest: "Quick Standup" (8 min, 890 words). Average accuracy: 95.8%. "Sales Call - Meridian" (45 min, 4,200 words, EN). "Customer Interview" (30 min, 3,100 words, EN). "Marketing Brainstorm" (1h, 6,800 words, EN). "Team Retrospective" (35 min, 3,600 words, EN). 5 additional transcriptions. Storage used: 234 MB. Export formats available: TXT, DOCX, SRT.
```

## Capabilities

### Start Transcription Job from URL
Sends an audio file's URL to the server, beginning a new transcription process.

### Check Job Status and History
Retrieves real-time progress on running jobs or lists metadata for completed transcriptions.

### Export Transcribed Text
Downloads the final text result, formatted as TXT, DOCX, or SRT.

### Manage Account Usage
Checks remaining API credits and retrieves detailed usage statistics for billing purposes.

### Determine Language Support
Returns a list of supported languages and their corresponding codes for accurate transcription settings.

## Use Cases

### Processing a backlog of client interviews
A marketing director has 30 audio files and needs transcripts. Instead of submitting them one by one, they tell their agent: 'Transcribe these 30 URLs.' The agent runs `transcribe_audio_url` for each link. It then waits, periodically calling `get_transcription_status`, until every job is finished. Finally, it calls `get_transcription_export` on each job to download the transcripts.
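A minimal Python sketch of that batch flow, assuming a hypothetical `call_tool(name, arguments)` helper that forwards calls to your MCP client. The argument names (`url`, `job_id`, `format`), the response fields (`job_id`, `status`, `content`), and the `completed` / `failed` status strings are illustrative assumptions, not the server's documented schema.

```python
import time
from typing import Callable

# Hypothetical helper type: call_tool(tool_name, arguments) -> response dict.
ToolCaller = Callable[[str, dict], dict]


def transcribe_batch(
    call_tool: ToolCaller,
    urls: list[str],
    export_format: str = "txt",
    poll_seconds: float = 10.0,
    max_rounds: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[dict[str, str], list[str]]:
    """Start one job per URL, poll them all, and export each finished one.

    Returns (transcripts keyed by URL, URLs that failed or timed out).
    """
    # Assumed response field: "job_id".
    pending = {
        call_tool("transcribe_audio_url", {"url": url})["job_id"]: url
        for url in urls
    }
    transcripts: dict[str, str] = {}
    unfinished: list[str] = []

    for _ in range(max_rounds):
        for job_id in list(pending):
            # Assumed status values: "completed" and "failed".
            state = call_tool("get_transcription_status", {"job_id": job_id})["status"]
            if state == "completed":
                export = call_tool(
                    "get_transcription_export",
                    {"job_id": job_id, "format": export_format},
                )
                transcripts[pending.pop(job_id)] = export["content"]
            elif state == "failed":
                unfinished.append(pending.pop(job_id))
        if not pending:
            break
        sleep(poll_seconds)

    # Anything still pending after max_rounds counts as unfinished.
    unfinished.extend(pending.values())
    return transcripts, unfinished
```

Starting every job before polling lets the jobs run concurrently on the service side, and the `max_rounds` cap keeps a stuck job from blocking the agent forever.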

### Debugging an account billing issue
An ops engineer suspects over-usage. They ask their agent: 'What did we process last month?' The agent runs `get_usage_statistics` and compares the total volume against what was expected, while also calling `list_transcription_history` to pinpoint which jobs were largest.

### Building a robust internal reporting system
A development team needs transcription results to update Jira tickets automatically. They use the agent to run `transcribe_audio_url`. When the job finishes, Speechnotes sends a webhook notification, and the agent calls `generate_webhook_signature` to produce a signature for the payload so the receiving system can confirm the data is legitimate before processing it.
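For illustration, here is a sketch of what signing and checking a webhook payload can look like. It assumes HMAC-SHA256 with a shared secret and a hex-encoded signature; this listing does not say which scheme `generate_webhook_signature` actually uses, so treat the algorithm, the encoding, and both function names as assumptions.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, payload: bytes) -> str:
    """Return a hex HMAC-SHA256 signature (assumed scheme) for the raw payload."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify_payload(secret: bytes, payload: bytes, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(secret, payload)
    return hmac.compare_digest(expected, signature)
```

The receiver should sign the raw request body exactly as it arrived; re-serializing the JSON first can change the bytes and make a valid signature fail.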

### Cleaning up old jobs and checking language needs
A researcher finished a project and needs to clear old records. They first run `list_transcription_history` to see the IDs, then use `remove_transcription_job` on the unnecessary records. Before starting new work, they confirm locale support by running `list_supported_languages`.
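The cleanup step could be sketched as below. The `call_tool` helper is hypothetical, and the response shape (a `jobs` list whose entries carry `job_id` and an ISO-8601 `created_at` with a UTC offset) is an assumption made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


def remove_jobs_older_than(
    call_tool: Callable[[str, dict], dict],
    days: int,
    now: Optional[datetime] = None,
) -> list[str]:
    """Delete every job created more than `days` days ago; return their IDs."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)

    # Assumed response shape: {"jobs": [{"job_id": ..., "created_at": ...}]}.
    history = call_tool("list_transcription_history", {})
    removed: list[str] = []
    for job in history["jobs"]:
        created = datetime.fromisoformat(job["created_at"])
        if created < cutoff:
            call_tool("remove_transcription_job", {"job_id": job["job_id"]})
            removed.append(job["job_id"])
    return removed
```

Returning the removed IDs gives the agent something concrete to report back before the records are gone for good.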

## Benefits

- **Handles the full lifecycle.** You don't need separate apps or scripts for starting, checking, and exporting. The agent calls `transcribe_audio_url` to start, polls `get_transcription_status` until the job finishes, and then calls `get_transcription_export`, all in one sequence.
- **Maintains clear accountability.** Always know your limits. Call `get_remaining_credits` before running any large jobs. Also check `get_usage_statistics` to see exactly where your credits are going.
- **Keeps data organized.** Need proof of past work? Use `list_transcription_history`. You get detailed metadata, timestamps, and speaker counts, which is more than a simple job list.
- **Supports complex workflows.** The server lets you manage webhooks (`list_configured_webhooks`) and generate signatures (`generate_webhook_signature`), allowing your agent to integrate transcription results into other systems automatically.
- **Optimizes accuracy on demand.** Before transcribing, run `list_supported_languages` and check available models via `list_transcription_models` so you can pick the language code and model that fit your content.

## How It Works

The bottom line is you talk to your AI client; the client talks to Speechnotes API tools; Speechnotes does the hard work and gives you text back.

1. Subscribe to the server and enter your Speechnotes API Key and Secret into your agent's configuration.
2. Tell your agent what you need (e.g., 'Transcribe this URL: [link]'). The agent uses `transcribe_audio_url` to initiate the job.
3. Wait for confirmation or ask the agent to check status using `get_transcription_status`. Once complete, request the export via `get_transcription_export`.
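Steps 2 and 3 amount to a start, poll, export loop. The sketch below shows that loop for a single job; `call_tool` is a hypothetical stand-in for your MCP client, and the argument names, response fields, and status strings are assumptions rather than the server's documented schema.

```python
import time
from typing import Callable


def transcribe_and_export(
    call_tool: Callable[[str, dict], dict],
    audio_url: str,
    export_format: str = "txt",
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Start a job, poll until it completes, and return the exported text."""
    # Assumed response field: "job_id".
    job_id = call_tool("transcribe_audio_url", {"url": audio_url})["job_id"]

    for _ in range(max_polls):
        # Assumed status values: "completed" and "failed".
        state = call_tool("get_transcription_status", {"job_id": job_id})["status"]
        if state == "completed":
            export = call_tool(
                "get_transcription_export",
                {"job_id": job_id, "format": export_format},
            )
            return export["content"]  # assumed response field
        if state == "failed":
            raise RuntimeError(f"Transcription job {job_id} failed")
        sleep(poll_seconds)

    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")
```

In practice the agent makes these calls for you; the sketch only makes the order of operations and the two failure paths explicit.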

## Frequently Asked Questions

**How do I know if my API key is set up correctly with Speechnotes MCP Server?**
You run the `test_speechnotes_auth` tool. This confirms your credentials work before you waste time running a full transcription job.

**What if I need to transcribe an audio file in French? Which tool do I use?**
First, run `list_supported_languages` to get the correct code. Then, use that language setting when you call `transcribe_audio_url`.
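As a rough sketch, the lookup might be done like this. The `call_tool` helper is hypothetical, and the response shape (a `languages` list of entries with `name` and `code`) is an assumption for illustration.

```python
from typing import Callable, Optional


def find_language_code(
    call_tool: Callable[[str, dict], dict],
    language_name: str,
) -> Optional[str]:
    """Return the code for a language name, or None if it is not supported."""
    # Assumed response shape: {"languages": [{"name": ..., "code": ...}]}.
    languages = call_tool("list_supported_languages", {})["languages"]
    wanted = language_name.strip().lower()
    for entry in languages:
        if entry["name"].lower() == wanted:
            return entry["code"]
    return None
```

Checking for `None` before starting the job avoids spending credits on a language the engine does not handle.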

**How can I check my account limits using Speechnotes MCP Server?**
Use `get_remaining_credits`. This reports your current API account balance and available transcription credits, so you know where you stand before starting any large process.
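A small guard like the following could run before a large batch. It assumes a hypothetical `call_tool` helper and a numeric `credits` field in the response; how credits map to audio minutes or jobs is not specified here, so the `required` value is whatever estimate you supply.

```python
from typing import Callable


def ensure_credits(call_tool: Callable[[str, dict], dict], required: float) -> float:
    """Return the current balance, or raise if it is below `required`."""
    # Assumed response field: "credits".
    balance = call_tool("get_remaining_credits", {})["credits"]
    if balance < required:
        raise RuntimeError(f"Need {required} credits but only {balance} remain")
    return balance
```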

**How do I check the real-time progress of a running job using the `get_transcription_status` tool?**
Give the `get_transcription_status` tool the job's unique ID. It tells you whether the transcription is still processing, has failed with an error, or is ready to export.

**When I use `get_transcription_export`, what are the available document formats for my text?**
The tool returns the finished transcript in the format you request: plain TXT, DOCX, or SRT.

**How do I review detailed metadata for all past jobs using `list_transcription_history`?**
Running `list_transcription_history` retrieves a full list of your completed transcriptions. This data includes timestamps and the total number of speakers identified for each job.

**How can I manage or verify my webhook endpoints using `list_configured_webhooks`?**
This tool lists all the delivery addresses connected to your account. It lets you confirm which external systems will receive automatic notifications when a transcription job finishes.

**If I need to clear out an old or failed record, how do I delete it using `remove_transcription_job`?**
You use the `remove_transcription_job` tool and provide the specific Job ID. This action deletes the job's record entirely from your history.

**How do I find my Speechnotes API credentials?**
Log in to your Speechnotes account and navigate to the API or developer section in your dashboard to find your unique API Key and API Secret.