Speechnotes MCP for AI Agents. Turn any spoken recording into structured text.
Speechnotes transcribes audio files from URLs using a high-accuracy engine. Your agent handles the whole process: initiating job requests, checking progress in real time, gathering detailed transcripts, and exporting results directly into formats like DOCX or SRT.
Give Claude and any AI agent real-world access
Initiate a transcription job by providing an external URL pointing to an audio file.
Check the real-time status of any ongoing or past transcription job using get_transcription_status.
List all previous jobs to see details, including speaker counts and timestamps, through list_transcription_history.
Download the final transcript in multiple professional formats like DOCX or SRT using get_transcription_export.
Monitor your account spending by checking available credits with get_remaining_credits, or by reviewing detailed usage logs via get_usage_statistics.
What AI agents can do with Speechnotes: 12 Tools for Audio Job Management
These tools allow your AI client to manage every aspect of the transcription pipeline—from initiating jobs on remote URLs to exporting final formatted files and checking account usage.
Make your AI actually useful.
Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.
Remove Transcription Job
Deletes a completed or pending job record from your account history.
Get Remaining Credits
Checks how many transcription credits you have left on your Speechnotes account.
Get Transcription Export
Downloads the finished transcript in a specified file format (like DOCX or SRT).
List Transcription Models
Shows you all the available AI models for optimizing your transcription quality.
Generate Webhook Signature
Creates a cryptographic signature used to verify incoming data payloads.
Get Transcription Status
Checks the current progress of an active or pending transcription job.
Get Usage Statistics
Retrieves detailed logs showing how you've used your service over time.
List Transcription History
Lists all past transcription jobs, giving details like dates and speaker counts.
List Supported Languages
Provides a list of language codes the system can transcribe.
List Configured Webhooks
Displays the current endpoints set up for receiving automated data deliveries.
Test Speechnotes Auth
Runs a quick check to confirm your API connection is working correctly.
Transcribe Audio Url
Starts the main job by taking an audio file URL and sending it for transcription.
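The Generate Webhook Signature tool above creates a signature for verifying incoming payloads, but the page does not say which scheme it uses. As a minimal sketch, assuming a hex-encoded HMAC-SHA256 over the raw request body with a shared secret (a common webhook convention, not a confirmed detail of this service), verification on the receiving side could look like this. The secret and payload are hypothetical.

```python
import hashlib
import hmac


def sign_payload(secret: str, payload: bytes) -> str:
    """Return a hex HMAC-SHA256 digest of the payload (assumed scheme)."""
    return hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()


def verify_payload(secret: str, payload: bytes, received_signature: str) -> bool:
    """Compare signatures in constant time to avoid timing leaks."""
    expected = sign_payload(secret, payload)
    return hmac.compare_digest(expected, received_signature)


# Hypothetical secret and payload, for illustration only.
secret = "example-webhook-secret"
body = b'{"job_id": "job-123", "status": "complete"}'
signature = sign_payload(secret, body)
```

Whatever the real scheme turns out to be, comparing with hmac.compare_digest rather than == is the safer habit, because it does not leak how many leading characters matched.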
Security and governance baked right in.
Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so no custom routing is required.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built-in DLP, auth, and compliance on each call
- Real-time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with Speechnotes, then connect any of our 5,200+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 5,200+ others, all in one place
- Add new capabilities to your AI anytime you want
- Connections are secured and governed automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog weekly
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Speechnotes. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
Cloud Hosted
Managed infra
V8 Isolated
Sandboxed per request
Zero-Trust Proxy
No stored credentials
DLP Enforced
Policy on each call
GDPR Compliant
EU data residency
Token Compression
~60% cost reduction
The manual struggle of turning sound into searchable text
Solved with Vinkius AI Gateway
Think about the last time you had an interview or recorded a meeting. You download the audio file. Then, you have to upload it somewhere. The service processes it, and then you wait for an email that says 'It's ready.' Finally, you download the text, open Word, delete the timestamps, remove speaker labels, and copy-paste the whole mess into your article.
With this MCP, those steps disappear. Your agent handles everything by connecting to Speechnotes. You simply point it at the audio URL; it starts the job and keeps track of it. When it's done, you ask for the export, and the text—already structured and clean—comes straight to your workflow.
Speechnotes: Getting transcripts with get_transcription_export
Before this MCP, getting a usable transcript meant juggling multiple tools and manually cleaning up messy files. You'd download the raw text, then maybe use another service just to change it from CSV format to DOCX.
Now, you manage job creation via transcribe_audio_url and finalize output using get_transcription_export. The system handles the formatting control for you, giving you precisely the type of file—be it SRT or TXT—that your final destination requires.
What your AI can actually do with this
Need to turn hours of spoken content into clean, usable text? Speechnotes lets your AI client connect directly to a professional transcription engine. You simply tell your agent where the audio is, such as a link from a podcast feed or a meeting recording, and the system handles the rest. It starts the job, tracks it until it's done, and then gives you the final text.
This MCP lets you list past jobs to see what was transcribed, check how many credits you have left with get_remaining_credits, or even initiate a brand new transcription using transcribe_audio_url. Because this entire process runs through Vinkius, your agent can manage all the complexity of audio-to-text conversion without you having to touch any dashboards.
You just chat naturally and get the finished file.
Here's how it actually works
The bottom line is: you feed it an audio link and get structured, clean text back without ever logging into a separate website.
First, connect your AI agent to the Speechnotes MCP and provide your API key credentials.
Next, tell your agent to initiate a transcription job by giving it the URL of the audio file you want processed.
Finally, wait for confirmation that the job is complete, then ask the agent to export the text in the format you need.
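Under the hood, an MCP client expresses each of the steps above as a JSON-RPC 2.0 `tools/call` request. The sketch below builds those three messages in order. The tool names come from this page, but the argument names (`url`, `job_id`, `format`), the job ID, and the audio URL are assumptions for illustration; the real input schema is whatever the server advertises.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names (url, job_id, format) are assumptions, not the documented schema.
steps = [
    tool_call("transcribe_audio_url", {"url": "https://example.com/interview.mp3"}),
    tool_call("get_transcription_status", {"job_id": "job-123"}),
    tool_call("get_transcription_export", {"job_id": "job-123", "format": "docx"}),
]

for message in steps:
    print(json.dumps(message))
```

In practice your AI client builds and sends these messages for you; the sketch only shows what "start, check, export" looks like on the wire.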
Who is this actually for?
This MCP is for journalists who process multiple interviews daily, paralegals managing depositions, and content teams turning podcasts into articles. These roles are constantly bottlenecked by manual transcription and reformatting.
Turns raw audio recordings of guest interviews into clean manuscripts for show notes and blog posts.
Processes long deposition files, listing the complete history via list_transcription_history to verify timestamps or check specific segments.
Runs weekly transcription jobs from meeting URLs and exports them using get_transcription_export so the team can immediately start drafting articles.
What Changes When You Connect
Saves time by automating the initial job start: Instead of manually uploading audio, you pass a remote link to transcribe_audio_url and the job starts right away.
Provides full oversight with history checks: list_transcription_history lets you quickly review every past project's metadata, knowing exactly how many speakers were identified or what date it was completed.
Guarantees usable output formats: The get_transcription_export tool ensures your text isn’t just a blob of characters; it comes out as clean DOCX files ready for immediate publishing.
Manages costs proactively: Check your budget and usage logs with get_remaining_credits or get_usage_statistics so you never run into billing surprises mid-project.
Handles multiple languages and models: Use list_supported_languages to confirm the engine supports Spanish, French, or whichever language you need, and list_transcription_models to see which models are available.
See it in action
Turning a multi-hour board meeting into actionable notes
The operations team gives the agent a link to an internal Zoom recording. The agent calls transcribe_audio_url, then checks get_transcription_status every few minutes until the job is done. Finally, the team asks the agent to export the resulting text as DOCX for immediate distribution.
Archiving a client interview with full metadata
A journalist wants to archive 20 past interviews. They use list_transcription_history first, then use get_usage_statistics to prove the total volume of work done and confirm which languages were processed.
Processing multiple foreign language recordings
A research team gathers depositions in Spanish, Portuguese, and English. They check list_supported_languages before initiating the first job to confirm that the system supports all three languages.
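The language check in the last scenario comes down to a set comparison. Here is a minimal sketch, assuming list_supported_languages returns short language codes; the codes shown are hypothetical stand-ins, not the service's actual list.

```python
def missing_languages(required: set[str], supported: set[str]) -> set[str]:
    """Return required language codes that are absent from the supported set."""
    normalized = {code.lower() for code in supported}
    return {code for code in required if code.lower() not in normalized}


# Hypothetical output of list_supported_languages; real codes may differ.
supported_codes = {"en", "es", "pt", "fr"}
required_codes = {"es", "pt", "en"}

gaps = missing_languages(required_codes, supported_codes)
```

An empty result means every required language is listed, so the batch can start. Anything left in the set is a language to resolve before spending credits on it.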
The honest tradeoffs
What to watch out for, and the recommended way to handle each one.
Manually checking progress
The user gets a transcript link and then has to manually visit a dashboard page every ten minutes to see if it's ready, wasting time or forgetting to check altogether.
Instead, let your agent use get_transcription_status. You ask the agent to monitor the job using its ID; you don't have to keep refreshing anything.
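Here is a sketch of what that monitoring amounts to: a polling loop with a fixed interval and an overall timeout. The `get_status` callable stands in for a call to get_transcription_status. The 'failed' status and the default timings are assumptions; only 'Processing' and 'Complete' are mentioned on this page.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[str], str],
    job_id: str,
    interval_s: float = 30.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal status or the timeout elapses."""
    terminal = {"complete", "failed"}  # "failed" is an assumed status name
    waited = 0.0
    while True:
        status = get_status(job_id).lower()
        if status in terminal:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Stand-in for get_transcription_status: "Processing" twice, then "Complete".
_fake_statuses = iter(["Processing", "Processing", "Complete"])
result = wait_for_job(lambda _job_id: next(_fake_statuses), "job-123", sleep=lambda _s: None)
```

Passing `sleep` as a parameter keeps the loop testable without real delays. The timeout matters because a job that never leaves 'Processing' would otherwise keep the agent polling forever.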
Assuming file format compatibility
The user gets a transcript and assumes it will be plain text, only to find they have to manually clean up formatting, speaker labels, and timestamps.
Always confirm the required output format using get_transcription_export. You can specify if you need TXT, SRT, or DOCX right from the start.
Forgetting to track usage
The user runs several large jobs and only realizes they hit their credit limit when the job fails, leaving them stuck.
Before starting a big batch of work, run get_remaining_credits so you know exactly how many credits you have left.
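A pre-flight check like this can be reduced to simple arithmetic. The sketch below assumes credits are charged per minute of audio, with each file rounded up to a whole minute. That pricing model, the rate, and the numbers are all hypothetical; substitute whatever your plan actually charges.

```python
import math


def credits_needed(durations_min: list[float], credits_per_min: float) -> int:
    """Estimate credits for a batch, rounding each file up to a whole minute."""
    return math.ceil(sum(math.ceil(d) for d in durations_min) * credits_per_min)


def can_run_batch(remaining: int, durations_min: list[float], credits_per_min: float = 1.0) -> bool:
    """Return True when the estimated cost fits within the remaining credits."""
    return credits_needed(durations_min, credits_per_min) <= remaining


# Hypothetical numbers: the real balance would come from get_remaining_credits.
remaining_credits = 120
batch_minutes = [42.5, 61.0, 15.2]
ok = can_run_batch(remaining_credits, batch_minutes)
```

Here the three files round up to 43, 61, and 16 minutes, which fits a balance of 120 credits exactly at the assumed rate of one credit per minute.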
When It Fits, When It Doesn't
Use this MCP if your primary input is always an audio file or URL and the goal is structured text output. You need to manage the entire lifecycle: from initiation (transcribe_audio_url) through monitoring (get_transcription_status), to final export (get_transcription_export). Don't use it if you already have perfectly clean, typed articles; in that case, a general LLM agent is enough. If your input is text and you just need a summary or tone analysis, this MCP won't help. This tool is specialized for the audio-to-text pipeline, making it ideal for media production workflows.
Questions you might have
How do I start a new transcription job with Speechnotes MCP?
You use transcribe_audio_url. You simply instruct your agent to run this tool and provide the URL of the audio file you want converted.
Can I check if my transcript is finished using Speechnotes MCP?
Yes, you can use get_transcription_status. This allows your agent to poll the system and tell you exactly when the job moves from 'Processing' to 'Complete'.
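What if I need to cancel a job? Does Speechnotes MCP have a tool for that?
The closest tool is remove_transcription_job, which lets your agent delete the record of a completed or pending job from your account history, right from your chat interface. The tool description does not say whether removing a pending job also stops processing that is already under way, so check the job's status first if that matters.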
How do I get a transcript in multiple formats with Speechnotes MCP?
Use get_transcription_export. You just tell your agent which format you need (TXT, DOCX, or SRT) and it handles the file type control.
Does Speechnotes MCP track my usage and credits?
Yes. You can check your remaining credits with get_remaining_credits, or review detailed consumption patterns using get_usage_statistics.