Sonix MCP. Process any audio or video file from start to finish.



Sonix connects media processing power—transcription, translation, and summarization—directly into your AI workflow. Submit audio or video files; get structured text transcripts (SRT, VTT, JSON), create multi-language translations, run batch summaries on whole folders, and even burn subtitles onto video for social sharing.

Your agent handles the entire media pipeline.

What your AI agents can do


Transcribe Media

You can get plain text transcripts, subtitle files (SRT or VTT), or detailed JSON that includes word-level timestamps and speaker labels.

Summarize Content

Run AI summaries on individual files or process a whole folder of media at once with batch summarization tools.

Translate Transcripts

Automatically convert transcripts into dozens of different languages using the create_translation tool.

Manage Files and Folders

Organize your media library by listing, creating, updating, or deleting folders and files (list_folders, delete_media).

Prepare Videos for Sharing

Burn subtitles directly onto video files using the create_video_burn_in tool to make content ready for social media.

Control Access and Users

Manage team collaboration by listing users, inviting new members (invite_user), or setting up share links (create_share).


Supported MCP Clients

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

+ other MCP clients

Free for Subscribers


Sonix MCP Server: 30 Tools for Media Operations

Use these tools to manage every stage of media content, from initial upload and transcription through international translation and final video export.

create_batch_summarization

Generates a summary for all media files within a specified folder.

create_folder

Creates a new, empty directory to organize your media library.

create_media_export

Initiates the process of creating and exporting media files.

create_share

Generates a secure link to share a specific media file with another user.

create_summarization

Runs an AI summary on a single, specified media file.

create_translation

Translates the transcript of a specific media file into a target language.

create_video_burn_in

Adds subtitles directly onto a video track, creating a final video export.

delete_media

Permanently removes a media file from your Sonix library.

delete_share

Removes an existing share link for a media file.

get_batch_summarization

Retrieves the status and details of a batch summarization job.

get_media

Checks the current status and basic details of any media file ID.

get_media_export

Gets the progress status for a media export job.

get_summarization

Retrieves the completed summary text and details for a single file's summarization job.

get_transcript_json

Fetches a detailed transcript in JSON format, including word-level timestamps and speaker identification.

get_transcript_srt

Retrieves the full transcript formatted as an SRT file for subtitle use.

get_transcript_text

Gets a simple, plain text version of the entire audio or video script.

get_transcript_vtt

Retrieves the full transcript formatted as a VTT file for web display.

get_translation

Checks the status and retrieves the translated text from a media file's translation job.

get_video_burn_in

Retrieves the status of a video burn-in process (subtitles being applied to the video).

invite_user

Invites a new team member by email address to your Sonix account.

list_folders

Displays a list of all folders currently in your media library.

list_media

Returns a list of all uploaded and managed media files, including their IDs.

list_shares

Shows which users or groups currently have access to a specific media file.

list_users

Retrieves a list of all user accounts associated with the Sonix workspace.

split_transcript

Automatically takes a full transcript and splits it into subtitle chunks.

submit_media

Uploads new media (audio/video) to the server queue for transcription or analysis.

update_folder

Modifies attributes of an existing folder, like renaming it.

update_media

Changes metadata associated with a specific media file.

update_transcript

Allows editing of transcript details, such as correcting words or speaker labels.

update_user

Changes the role or permissions level of a team member's account.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on every call
Real time usage dashboard and cost metering
Publish to catalog or keep private


Make Your AI Do More

Start with Sonix, then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 4,700+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

What you can do with this MCP connector

Sonix manages your entire media pipeline—everything from raw audio files you upload to polished, multilingual video content ready for sharing. Your agent handles all this heavy lifting. When you use Sonix, you're working with a system that lets you ingest new media using submit_media, and then manage the resulting assets by checking their status via get_media or listing them all through list_media.

Media Transcription and Formatting

When you process audio or video, Sonix doesn't just give you a transcript; it gives you options for every use case. You can get a simple script using get_transcript_text, which provides plain text of the entire audio or video content. If you need subtitles for a website, use get_transcript_vtt to pull a VTT file.

For compatibility with standard subtitle players, fetch an SRT format using get_transcript_srt. When word-level detail and speaker identification are critical, grab a detailed JSON transcript via get_transcript_json. You can also automatically take a full script and break it into smaller chunks using split_transcript, or if you need to fix errors in the text itself, use update_transcript.
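As a rough illustration of the choice described above, the mapping below pairs a stated need with one of the four transcript tools. The tool names come from this page; the need labels ("plain", "web_subtitles", and so on) and the helper function are hypothetical and exist only for this sketch.

```python
# Illustrative mapping from a need to the transcript tools named on this page.
# The keys are hypothetical labels, not part of any Sonix or MCP interface.
TRANSCRIPT_TOOLS = {
    "plain": "get_transcript_text",
    "web_subtitles": "get_transcript_vtt",
    "player_subtitles": "get_transcript_srt",
    "word_timestamps": "get_transcript_json",
    "speaker_labels": "get_transcript_json",
}


def pick_transcript_tool(need: str) -> str:
    """Return the transcript tool that matches a stated need."""
    try:
        return TRANSCRIPT_TOOLS[need]
    except KeyError:
        raise ValueError(f"unknown need: {need!r}") from None


print(pick_transcript_tool("speaker_labels"))  # get_transcript_json
```

In practice the agent makes this choice from your prompt; the table simply makes the trade-off explicit.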

Content Analysis: Summarization and Translation

Need a quick digest of long content? You've got two options. For single files, run an AI summary with create_summarization. If you're dealing with an entire folder full of recordings, use create_batch_summarization to process them all at once. After kicking off a job, don't sweat it; check the progress and retrieve results for individual jobs using get_summarization, or track group work through get_batch_summarization.

For translation, you call create_translation on a file’s transcript, setting your target language. You then monitor the status and grab the finished text using get_translation.
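The create-then-check pattern for summaries and translations can be sketched as a small polling loop. This is a minimal sketch under assumptions: the "status" key and its "completed"/"failed" values are hypothetical, and the fetch callable stands in for whatever your client does to call get_translation or get_summarization.

```python
import time
from typing import Callable, Dict


def wait_for_job(fetch: Callable[[], Dict], interval: float = 2.0,
                 max_checks: int = 30, sleep=time.sleep) -> Dict:
    """Call fetch() until the job reports a terminal status.

    Assumes (hypothetically) that the result has a "status" key whose
    terminal values are "completed" or "failed".
    """
    for attempt in range(max_checks):
        result = fetch()
        if result.get("status") in ("completed", "failed"):
            return result
        if attempt < max_checks - 1:
            sleep(interval)
    raise TimeoutError(f"job still running after {max_checks} checks")


# Stand-in for a get_translation call: two pending responses, then done.
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "text": "Hola"},
])
final = wait_for_job(lambda: next(_responses), sleep=lambda s: None)
print(final["status"])  # completed
```

Passing the sleep function as a parameter keeps the loop testable without real delays.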

Video Preparation and Sharing

Getting content out there is where Sonix shines. If you've created subtitles and need them burned directly onto the video track for social media, run create_video_burn_in. You can check that process's status with get_video_burn_in. When your files are ready to go, initiate a final export using create_media_export, then monitor its progress by calling get_media_export.

To share content without giving away the whole library, you generate secure links via create_share and revoke them with delete_share. You can also review the shares that already exist for a file by running list_shares.

Organizing Your Library and Managing Users

Keeping your media organized is simple. Use list_folders to see every directory you own, or create new ones using create_folder. If things change, you can modify an existing folder's attributes with update_folder. You maintain control over the files themselves; list all assets with list_media, and if a file is junk, permanently delete it using delete_media.

Similarly, you update metadata on any file using update_media.

Collaboration means controlling access. To add teammates, use invite_user with their email address. You can see who's already part of the workspace by running list_users, and if someone changes roles or permissions, you adjust it with update_user. To confirm that a file is ready before giving a new teammate access, check its status and basic details with get_media.

How Sonix MCP Works

1 First, connect your Sonix account and provide your API key to the MCP Server.
2 Next, use your agent to identify the media file (e.g., list_media) and tell it what you need—like 'get a JSON transcript for this ID.'
3 Finally, the server processes the request asynchronously. Your agent checks progress with the matching status tool (for example, get_media for a file or get_summarization for a summary) until the final output is ready.
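Under the hood, an MCP client sends each tool invocation as a JSON-RPC 2.0 request with the method tools/call. The sketch below builds two such messages for steps 2 and 3. The ID "abc123" is a placeholder, and using media_id as the argument key for get_transcript_json is an assumption; the server's tool schema is the authority on argument names.

```python
import json
from itertools import count

_ids = count(1)


def tool_call_request(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Step 2: find the file. Step 3: ask for its JSON transcript.
# "abc123" is a placeholder; the media_id key is assumed for this tool.
requests = [
    tool_call_request("list_media", {}),
    tool_call_request("get_transcript_json", {"media_id": "abc123"}),
]
print(json.dumps(requests[1], indent=2))
```

You normally never write these messages yourself; the AI client generates them from your chat request.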

The bottom line is, your AI client acts like a dedicated media assistant, handling all file processing and organization without you ever leaving your chat window.

Who Is Sonix MCP For?

This server is built for people drowning in content. If your job involves taking raw audio or video recordings—be it interviews, lectures, or user feedback calls—and turning them into structured data, multilingual text, or polished social media clips, this is for you. It cuts out the manual copy-pasting and multi-tab management.

Content Creator

Needs to take a finished video clip and instantly generate subtitles (create_video_burn_in) and summaries so it can be posted across YouTube, TikTok, and blogs.

Journalist / Researcher

Receives hours of interview audio. Uses the server to submit media for transcription, then uses get_transcript_json to quickly search through all the text for specific names or dates.

Product Manager (PM)

Collects recordings of user feedback calls. Runs batch summarization (create_batch_summarization) on the whole week's worth of files, then shares the insights via automated links.

What Changes When You Connect

Get structured data instantly. Instead of just raw text, tools like get_transcript_json give you word-level timestamps and speaker labels—essential for analysis.
Handle global content effortlessly. Ask your agent once to turn a single interview into 10 languages, and it runs create_translation for each target language.
Organize everything automatically. With tools like list_folders, create_folder, and update_media, you build a clean media library that never gets messy.
Streamline social posting. Don't just share the video; use create_video_burn_in to embed subtitles directly, making it ready for Instagram or TikTok without extra software.
Work on massive projects. Never process files one by one. Use create_batch_summarization to summarize an entire folder of research recordings in minutes.

Real-World Use Cases

Analyzing Hours of Interviews

A journalist receives 5 hours of raw interview footage. Instead of manually transcribing it, they use their agent to submit_media. Once processed, they run get_transcript_json to instantly query for specific quotes or names across the whole dataset.

Launching a Global Campaign

A content team records a video speech. They run create_translation on the transcript for each target language, then use create_video_burn_in to render the subtitles. The agent gives them subtitled videos in three languages, ready to post worldwide.

Product Feedback Review

A PM receives 20 audio calls from beta testers. They use list_media to check the files, then run create_batch_summarization on all 20. The agent collects and presents a single summary report of common pain points.

Maintaining an Archive

A corporate comms team needs to file away old assets. They use list_media to see what they have, then run create_folder and move related files into a structured folder before running update_media on the metadata.
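The archive workflow above can be sketched as a plan of tool calls. Everything about the data here is hypothetical: the shape of the list_media result, the year-prefixed file names, and the argument keys ("name", "media_id", "folder") are assumptions for illustration, not the server's documented schema.

```python
from collections import defaultdict

# Hypothetical list_media result: each item has an id and a name.
media = [
    {"id": "m1", "name": "2021-townhall.mp4"},
    {"id": "m2", "name": "2021-earnings-call.mp3"},
    {"id": "m3", "name": "2022-townhall.mp4"},
]


def plan_archive(items):
    """Group files by the year prefix in their name and list the calls to make."""
    groups = defaultdict(list)
    for item in items:
        year = item["name"].split("-", 1)[0]
        groups[year].append(item["id"])
    plan = []
    for year in sorted(groups):
        plan.append(("create_folder", {"name": f"Archive {year}"}))
        for media_id in groups[year]:
            # "folder" is a placeholder; a real call would likely use the
            # folder ID returned by create_folder.
            plan.append(("update_media",
                         {"media_id": media_id, "folder": f"Archive {year}"}))
    return plan


for tool, args in plan_archive(media):
    print(tool, args)
```

Building the plan first makes it easy to review what will change before any file is touched.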

The Tradeoffs

Copying text from browser tabs

The user manually transcribes a 30-minute podcast by listening and typing, then copies the resulting text into a spreadsheet for analysis.

→ Use submit_media to upload the file. Wait for it to process. Then use get_transcript_json so your agent gets structured data instantly—no manual transcription needed.

Trying to summarize everything at once

The user tries to pass 50 video files and a complex prompt to an LLM, causing the request to time out or generate low-quality summaries.

→ Use create_batch_summarization instead. This tool handles large collections efficiently and gives you status updates via get_batch_summarization.

Ignoring video formatting

The user gets a transcript but realizes it needs subtitles embedded for Instagram, so they have to use external video editing software.

→ Use the dedicated create_video_burn_in tool. It embeds subtitles directly onto the video file as part of the Sonix workflow.

When It Fits, When It Doesn't

You need this server if your source material is consistently media (audio, video) and you require structured output in multiple formats or languages. If your primary goal is just generating creative text, you're better off using a general-purpose LLM agent. However, if the analysis of that media—transcription, summarization, translation, or organization—is the core task, Sonix is necessary. For example, don't try to use get_transcript_text if you actually need speaker labels; run get_transcript_json. Also, remember that complex processes are asynchronous: always check status using tools like get_summarization or get_translation; never assume the result is instant.

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Sonix. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This server provides 30 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.

Available Capabilities

create_batch_summarization, create_folder, create_media_export, create_share, create_summarization, create_translation, create_video_burn_in, delete_media, delete_share, get_batch_summarization, get_media, get_media_export, get_summarization, get_transcript_json, get_transcript_srt, get_transcript_text, get_transcript_vtt, get_translation, get_video_burn_in, invite_user, list_folders, list_media, list_shares, list_users, split_transcript, submit_media, update_folder, update_media, update_transcript, update_user
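A quick way to get an overview of the capability list is to group the names by their leading verb. The snippet below is only a convenience sketch over the names listed above; it does not talk to the server.

```python
from collections import Counter

CAPABILITIES = """
create_batch_summarization create_folder create_media_export create_share
create_summarization create_translation create_video_burn_in delete_media
delete_share get_batch_summarization get_media get_media_export
get_summarization get_transcript_json get_transcript_srt get_transcript_text
get_transcript_vtt get_translation get_video_burn_in invite_user list_folders
list_media list_shares list_users split_transcript submit_media update_folder
update_media update_transcript update_user
""".split()

# Count tools by the verb before the first underscore.
by_verb = Counter(name.split("_", 1)[0] for name in CAPABILITIES)

print(len(CAPABILITIES))  # 30
for verb, n in sorted(by_verb.items()):
    print(f"{verb}: {n}")
```

The create and get groups pair up for most job types, which reflects the start-a-job, then-check-it pattern used throughout the server.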

The pain of media post-production isn't just time—it's switching context.

Right now, you record a podcast. You download the audio. Then you open one tab to transcribe it, another tab to generate subtitles (SRT format), and a third program to burn those subs onto the video. If you want it in Spanish, you have to re-export everything, then manually translate the text file, and repeat the whole process for every language.

With Sonix connected via MCP, your agent handles this entire sequence. You ask for a transcript, specify JSON format, tell it which languages to translate into, and even request subtitle burn-in—all in one conversation flow. The result is clean, structured content ready to publish.

Sonix MCP Server: Media Ops from the Chat

The manual steps that vanish include logging into separate platforms just for transcription, downloading multiple file types (VTT, SRT, TXT), and manually updating metadata across different tools. You don't have to juggle API keys in a developer console.

Your AI agent treats Sonix like a natural extension of your chat interface. It handles the heavy lifting, from `submit_media` to final export, and returns structured, actionable results in the conversation.

Common Questions About Sonix MCP

How do I get word-level timestamps using Sonix MCP Server?

Use the get_transcript_json tool. This provides a detailed JSON output that includes specific start and end times for every single word, which is critical for advanced indexing.
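To show why word-level timing helps with indexing, here is a minimal sketch that builds a lookup from each word to the moments it was spoken. The transcript shape (a flat list of words with text, start, end, and speaker fields) is hypothetical; the actual get_transcript_json output may be structured differently.

```python
from collections import defaultdict

# Hypothetical transcript shape: one entry per word with start/end seconds
# and a speaker label. The real JSON may nest these differently.
transcript = [
    {"text": "Budget", "start": 12.4, "end": 12.9, "speaker": "Alice"},
    {"text": "approved", "start": 12.9, "end": 13.5, "speaker": "Alice"},
    {"text": "Which", "start": 14.0, "end": 14.2, "speaker": "Bob"},
    {"text": "budget?", "start": 14.2, "end": 14.8, "speaker": "Bob"},
]


def build_index(words):
    """Map each lowercased word (punctuation stripped) to its occurrences."""
    index = defaultdict(list)
    for w in words:
        key = w["text"].strip(".,!?;:\"'").lower()
        if key:
            index[key].append((w["start"], w["end"], w["speaker"]))
    return index


index = build_index(transcript)
for start, end, speaker in index["budget"]:
    print(f"{speaker} said 'budget' at {start:.1f}s-{end:.1f}s")
```

With an index like this, a search hit can jump straight to the right second of the recording.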

Is Sonix MCP Server good for organizing my media files?

Yes. You can use list_media to see what you have, then run create_folder, and finally use update_media to correctly tag or move assets into the right spot.

What's the difference between VTT and SRT transcripts?

SRT (SubRip) is a plain-text subtitle format with numbered cues and timecodes, supported by most video players and editors. VTT (WebVTT) is the format browsers read through the HTML5 track element, so it is the better choice for subtitles on web pages.
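The two formats are close enough that a basic conversion is mostly mechanical: WebVTT needs a WEBVTT header line and uses a period instead of a comma before the milliseconds. The sketch below covers only that simple case and ignores styling or positioning; with Sonix you would normally just call get_transcript_vtt instead.

```python
import re

# SRT timestamps use a comma before milliseconds; WebVTT uses a period.
_TIMING = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt: str) -> str:
    """Convert SRT text to WebVTT: add the header and fix timestamp separators."""
    lines = []
    for line in srt.strip().splitlines():
        if "-->" in line:
            line = _TIMING.sub(r"\1.\2", line)
        lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"


sample = """1
00:00:01,000 --> 00:00:03,500
Welcome to the show.

2
00:00:04,000 --> 00:00:06,250
Thanks for having me.
"""
print(srt_to_vtt(sample))
```

Only lines containing the cue arrow are rewritten, so commas in the spoken text are left alone.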

How do I process multiple files at once with Sonix MCP Server?

You use create_batch_summarization. This tool accepts a folder ID and runs the summarizer across all media inside it, giving you one centralized result.

Can I update my user roles using Sonix MCP Server?

Yes. Use the update_user tool to change a team member's permissions or role within your Sonix workspace.

What credentials are required to run a task like `submit_media` using Sonix MCP Server?

You must provide your Sonix API key during setup. Your AI client uses this key internally for every operation, including submitting new media files and running any tool within the server.

How does the `create_share` tool work to manage file access?

It generates a unique share link for your media file. You control access by reviewing existing links with list_shares and removing any you no longer want with delete_share.

What is the process flow when I use the `create_video_burn_in` tool?

The tool takes a finished transcript and renders the subtitles directly onto the video frames, producing a new video file that is ready to share on social platforms. Rendering runs as a job, so check its status with get_video_burn_in.

Can I download subtitles for my videos in SRT format?

Yes! Use the get_transcript_srt tool with your Media ID. You can also customize options like speaker_display and max_characters per line.

How do I translate an existing transcript to another language?

Simply use the create_translation tool. Provide the media_id and the target language code (e.g., 'es' for Spanish) to start the automated translation process.
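When you need several languages, the request fans out into one create_translation call per language code. The sketch below prepares those calls. The media_id argument is named in the answer above; the "language" key and the placeholder ID "abc123" are assumptions for illustration.

```python
def translation_calls(media_id: str, languages: list[str]) -> list[dict]:
    """One create_translation call per target language, de-duplicated in order."""
    seen = set()
    calls = []
    for code in languages:
        code = code.strip().lower()
        if code and code not in seen:
            seen.add(code)
            calls.append({
                "tool": "create_translation",
                # "language" is an assumed argument name, not a confirmed one.
                "arguments": {"media_id": media_id, "language": code},
            })
    return calls


for call in translation_calls("abc123", ["es", "fr", "ES", "de"]):
    print(call)
```

Each call starts its own job, so each result is then retrieved separately with get_translation.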

Is it possible to summarize multiple media files at once?

Yes, use the create_batch_summarization tool. Place the files in one folder and pass that folder's ID; the tool generates AI summaries for every media file in the folder in a single operation.

Use it with your favorite AI tools

Connect this server to Cursor, Claude, VS Code, and more.

OpenAI Agents SDK (Python)

Google ADK (Python)

Pydantic AI (Python)

Vercel AI SDK (TypeScript)