D-ID MCP. Generate talking avatars and digital videos via chat.
Works with every AI agent you already use, and any MCP-compatible client
Just plug in your AI agents and start using Vinkius.
D-ID MCP Server lets you build AI video content entirely through your chat agent. Generate talking avatars from scripts, sync avatars to pre-recorded audio, and manage digital presenters without leaving your IDE.
You can also check your D-ID credit balance and track job status directly from your AI client.
What your AI agents can do
Create clip
Generates a digital clip using a stock presenter and a background. No image upload is required.
Create talk
Creates a talking avatar video using text input. The avatar lip-syncs and speaks your script with natural expressions.
Create talk audio
Creates a talking avatar video by syncing the avatar's lips to a pre-recorded audio file you provide.
The agent generates a talking avatar video. It uses your provided text, synchronizing the avatar's lips and expressions to the script.
The agent uses a pre-recorded audio file. It generates a talking avatar video that perfectly matches the pitch, timing, and boundaries of your audio.
The agent fetches a list of D-ID's stock presenters, giving you their IDs and names for immediate use.
The agent uploads a face image to D-ID servers, making it available as a custom source for your unique video content.
The agent tracks the status of a talk or clip. It reports if the job is created, started, done, or if there was an error, and provides the final result URL when complete.
The agent retrieves your current D-ID credit balance and associated plan information.
D-ID MCP Server: 10 Tools for Video Production
Manage the entire video asset pipeline—from uploading faces to generating finished clips—all through conversation with your AI agent.
create_clip
Generates a digital clip using a stock presenter and a background. No image upload is required.
create_talk
Creates a talking avatar video using text input. The avatar lip-syncs and speaks your script with natural expressions.
create_talk_audio
Creates a talking avatar video by syncing the avatar's lips to a pre-recorded audio file you provide.
delete_talk
Deletes a specific talk job that was previously created.
get_clip
Checks the status of a generated clip and provides the result URL once the process is finished.
get_credits
Retrieves your current D-ID credit balance and plan details.
get_talk
Checks the status of a talk job. It reports if it's done or gives the final result URL when the video is ready.
list_presenters
Fetches a list of all available D-ID stock presenters, including their IDs and preview images.
list_talks
Retrieves a list of all your past and current talk jobs, showing their IDs and creation timestamps.
upload_image
Uploads a face image to D-ID servers, making it available as a source for a custom avatar.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built-in DLP, auth, and compliance on every call
- Real-time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with D-ID, then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 4,700+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
What you can do with this MCP connector
You connect your D-ID account to your agent and run your whole video content workflow through chat. You'll generate talking avatars and manage digital presenters without ever leaving your IDE.
create_talk generates a talking avatar video from text input: the avatar lip-syncs and speaks your script with natural expressions. create_talk_audio makes a talking avatar video by syncing the avatar's lips to a pre-recorded audio file you provide. create_clip generates a digital clip using a stock presenter and a background, so no image upload is required.
list_presenters returns all available D-ID stock presenters with their IDs and preview images. upload_image uploads a face image to D-ID servers, making it a custom source for your own video content.
If you need to check on a job, get_talk checks the status of a talk job, reporting if it's done or giving you the final result URL when the video is ready. get_clip checks the status of a generated clip and gives you the result URL once the process wraps up.
list_talks retrieves all your past and current talk jobs, showing their IDs and creation timestamps. delete_talk deletes a specific talk job you created.
To keep tabs on your account, get_credits retrieves your current D-ID credit balance and plan details.
How D-ID MCP Works
1. Subscribe to the server and enter your D-ID API Key (found in D-ID Studio > Account Settings).
2. Tell your AI client what you want. For example: 'Create a talking avatar using the script: [Your text]'.
3. Your AI client calls the appropriate tool. The agent processes the request, generates the video, and returns the status or result URL.
The bottom line is, you manage complex video generation processes by talking to your AI client, not by visiting a website.
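Step 3 above can be sketched as the message an MCP client sends when it calls a tool. The tools/call method and the name and arguments fields come from the Model Context Protocol, which uses JSON-RPC 2.0. The script argument name is an assumption for illustration, since the server's actual input schema is not shown on this page.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requires an id per request


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "script" is an assumed argument name, not taken from the server's schema.
request = build_tool_call("create_talk", {"script": "Welcome to our product tour."})
print(json.dumps(request, indent=2))
```

In practice your AI client builds and sends this message for you; the sketch only shows what "calls the appropriate tool" amounts to on the wire.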
Who Is D-ID MCP For?
Content creators, marketing managers, and product teams need this. If you're constantly juggling video editing software, manually managing API keys, or waiting for job status on a dashboard, this is for you. You get to build out full content workflows using only natural language prompts.
Needs to quickly generate localized talking head videos for different markets. They use the agent to feed scripts and select presenters, reducing the time spent in external video editors.
Uses the agent to prototype digital human interactions quickly. They test how a digital presenter reacts to different scripts or background changes before involving a developer.
Tests and debugs the D-ID video generation pipeline. They call tools like create_talk and get_talk directly from the chat to verify outputs and map presenter IDs.
What Changes When You Connect
- Stop switching tools. You generate personalized video messages and social media content using create_talk without ever leaving your chat client.
- Control your entire video budget. Use get_credits to check your D-ID credit balance and know exactly how much video content you can generate.
- Fast content iteration. Instead of finding presenters manually, run list_presenters to get IDs and backgrounds, letting you test multiple concepts quickly.
- Build complex videos in stages. Use create_talk for the main script, then create_clip to pull out highlight moments, then upload_image for supporting visuals.
- Save time tracking jobs. Instead of refreshing a dashboard, use get_talk or get_clip to ask your agent the status of a job and get the result URL when it's ready.
- Work with your own faces. Use upload_image to feed your face into the system, creating custom avatars for truly personalized content.
Real-World Use Cases
Localizing marketing videos
A marketer needs to create the same product announcement for three different international markets. Instead of recording three separate shoots, they ask the agent to run create_talk three times with different scripts and voices. The agent handles the unique TTS requirements and generates the finished videos for each locale.
Prototyping a character interaction
A product team wants to see how a customer service bot might interact with a user. They use list_presenters to select a stock human, then use create_talk with a sample dialogue. This lets them verify the digital human's response and flow before coding the actual bot.
Archiving and managing content assets
A video editor needs to keep track of all generated content. They use list_talks to get a list of all job IDs and creation dates. Then, they use get_talk to check the status of a specific ID and retrieve the final URL for archival.
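The archival flow above (list_talks, then get_talk per ID) could look roughly like the sketch below. The call_tool function is a hypothetical stand-in for however your client invokes a tool, and the response fields (talks, id, status, result_url) are assumptions, not the server's documented output.

```python
from typing import Callable

ToolCaller = Callable[[str, dict], dict]  # hypothetical: (tool name, arguments) -> result


def collect_result_urls(call_tool: ToolCaller) -> dict:
    """Map each finished talk ID to its result URL; skip jobs that are not done."""
    archive = {}
    for talk in call_tool("list_talks", {})["talks"]:
        job = call_tool("get_talk", {"id": talk["id"]})
        if job.get("status") == "done":
            archive[talk["id"]] = job["result_url"]
    return archive


# Fake data standing in for real tool responses (assumed shapes).
_FAKE_JOBS = {
    "talk-1": {"status": "done", "result_url": "https://example.com/1.mp4"},
    "talk-2": {"status": "started"},
}


def fake_call_tool(name: str, arguments: dict) -> dict:
    if name == "list_talks":
        return {"talks": [{"id": talk_id} for talk_id in _FAKE_JOBS]}
    return _FAKE_JOBS[arguments["id"]]


archive = collect_result_urls(fake_call_tool)
print(archive)
```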
Creating video from existing audio recordings
A podcaster has a recorded interview segment and wants to turn it into a talking-head video. They use create_talk_audio, feeding the audio file to the agent. The agent generates a talking avatar that lip-syncs perfectly to the original audio.
The Tradeoffs
Doing it all on the D-ID website
The user opens the D-ID website, logs in, navigates to the 'Creator' tab, uploads the script, and clicks 'Generate'. This requires logging in, navigating tabs, and manually managing the API key in a separate environment.
→
Just tell your agent to run create_talk with the script. The agent handles the API key and the entire generation flow within the chat, skipping the manual website steps.
Forgetting job status
The user runs create_talk and leaves the chat. They assume the video is ready and try to use the result URL immediately, leading to a 404 or 'processing' error.
→
After running create_talk, follow up by asking the agent to run get_talk with the returned ID. You wait for the status 'done' and then retrieve the result URL.
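The follow-up check described above amounts to polling until the job reaches a final status. A minimal sketch, assuming get_talk is exposed as a callable that returns a dictionary with a status field and, once done, a result_url field (both field names are assumptions; the statuses created, started, done, and error are the ones this page names):

```python
import time
from typing import Callable

# Statuses at which polling should stop.
TERMINAL_STATUSES = {"done", "error"}


def wait_for_talk(
    get_talk: Callable[[str], dict],
    talk_id: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> dict:
    """Poll get_talk until the job is done or errored, or attempts run out."""
    for _ in range(max_attempts):
        job = get_talk(talk_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        time.sleep(interval_s)
    raise TimeoutError(f"talk {talk_id} did not finish after {max_attempts} checks")


# Stand-in for the real tool call; the response shape is hypothetical.
_fake_responses = iter([
    {"status": "created"},
    {"status": "started"},
    {"status": "done", "result_url": "https://example.com/video.mp4"},
])


def fake_get_talk(talk_id: str) -> dict:
    return next(_fake_responses)


result = wait_for_talk(fake_get_talk, "talk-123", interval_s=0)
print(result["status"], result["result_url"])
```

Capping the number of attempts keeps a stuck job from blocking the conversation indefinitely; the caller decides what to do on timeout.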
Uploading images manually
The user uploads a face image to the D-ID dashboard, copies the URL, and then manually pastes it into a separate video generation tool.
→
Use the upload_image tool. Pass the image to the agent, and it handles the upload and provides the usable image URL for subsequent video creation.
When It Fits, When It Doesn't
Use this server if your workflow requires generating video content based on text scripts, audio files, or custom avatars. The core value is making the video pipeline accessible through a conversation. Don't use this if your primary need is complex video editing (e.g., color grading, adding music beds, or multi-track mixing); you'll need professional video software for that. If you only need to check simple metadata (like listing names), the list_presenters tool handles that fine. But if you need to manage the whole lifecycle—from concept (text input) to output (final URL)—this is the tool you need.
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by D-ID. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
- Cloud Hosted: Managed infra
- V8 Isolated: Sandboxed per request
- Zero-Trust Proxy: No stored credentials
- DLP Enforced: Policy on every call
- GDPR Compliant: EU data residency
- Token Compression: ~60% cost reduction
Works with Claude, ChatGPT, Cursor, and more
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 10 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Available Capabilities
Building video content means jumping between a dashboard, an editor, and an API client.
Today, creating a talking-head video is a mess of steps. You start in the D-ID dashboard to write the script. Then you have to select a presenter, upload a background, and click 'generate.' If you want to change the script, you repeat the whole process. If you want to know the status, you have to open a different tab and check a job ID.
With the D-ID MCP Server, you talk to your agent. You tell it, "Make a talking video using this script and the 'Amy' presenter." It handles the presenter ID, the script input, and the entire generation flow in one chat exchange. You get back the status and the final result URL.
D-ID MCP Server: Generate video ops from chat.
You eliminate the need to manually navigate the D-ID website for every task. Listing presenters is a chat command. Checking credits is a chat command. Generating a clip is a chat command.
The whole process becomes conversational. You don't manage the UI; your agent does. It's a direct, actionable flow right where you're already working.
Common Questions About D-ID MCP
How do I check my D-ID credit balance using the get_credits tool?
Run the get_credits tool. The agent immediately returns your current credit balance and plan type. This lets you know if you have enough quota before starting a large generation job.
Can I use my own face for the create_talk tool?
Yes. First, use upload_image to upload your face photo. Then, pass the resulting image URL to the agent when you call create_talk. The agent uses your photo as the source avatar.
What if my video job is stuck? How do I check it with the get_talk tool?
Use get_talk with the job ID. The agent reports the current status: 'created', 'started', 'done', or an error. If it's 'done', it also gives you the final result URL.
How do I make a video from an audio file using create_talk_audio?
You upload the audio file and run create_talk_audio. The agent handles the sync process, ensuring the avatar's lips move perfectly to match the audio you provided.
What is the difference between create_clip and create_talk?
create_talk generates a talking avatar video from a script, optionally using a face image you uploaded as the source. create_clip generates a clip using a stock presenter and a background, so no image upload is required.
How do I list all available presenters using the list_presenters tool?
The list_presenters tool returns a roster of available digital humans. You get the presenter IDs and names, which you then use in create_clip or create_talk to specify who should appear in the video.
What should I do if I need to delete an old video using the delete_talk tool?
Use the delete_talk tool with the specific Talk ID. This removes the job from your D-ID account, freeing up resources and cleaning up your project history.
Can I check the status of all my past video jobs with the list_talks tool?
Yes, the list_talks tool retrieves a list of all your talks. It returns the IDs, current statuses, and when the jobs were created, helping you keep track of everything.
Can my agent create a talking avatar using a custom voice ID?
Yes. Use the create_talk tool and specify the TTS provider (microsoft or amazon) and the exact voice ID. The agent sends the request, and the generated avatar speaks your script in that voice.
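As a rough illustration of the voice selection described above, the sketch below validates the provider against the two named in this answer and assembles arguments for create_talk. The argument names (script, provider, voice_id) are assumptions for illustration, and the voice ID shown is only an example value.

```python
SUPPORTED_TTS_PROVIDERS = ("microsoft", "amazon")  # the two providers named in the text


def build_voice_arguments(script: str, provider: str, voice_id: str) -> dict:
    """Assemble create_talk arguments with a specific TTS provider and voice ID."""
    provider = provider.strip().lower()
    if provider not in SUPPORTED_TTS_PROVIDERS:
        raise ValueError(f"provider must be one of {SUPPORTED_TTS_PROVIDERS}, got {provider!r}")
    if not voice_id.strip():
        raise ValueError("voice_id must not be empty")
    # Argument names below are hypothetical, not the server's documented schema.
    return {"script": script, "provider": provider, "voice_id": voice_id.strip()}


args = build_voice_arguments("Hello from our team.", "Microsoft", "en-US-JennyNeural")
print(args)
```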
How do I use a custom image as the source for my talking avatar?
First, use the upload_image tool with a publicly accessible URL of your face image. The agent uploads it to D-ID and returns a new internal URL, which you can then pass as the source_url in create_talk.
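The two-step flow in this answer can be sketched as a small chain: upload first, then pass the returned URL on as source_url. The call_tool function is a hypothetical stand-in for your client's tool invocation; source_url is the parameter named above, while the url and script field names and the response shapes are assumptions.

```python
from typing import Callable

ToolCaller = Callable[[str, dict], dict]  # hypothetical: (tool name, arguments) -> result


def talk_from_custom_face(call_tool: ToolCaller, face_url: str, script: str) -> dict:
    """Upload a face image, then use the returned URL as the talk's source_url."""
    uploaded = call_tool("upload_image", {"url": face_url})
    return call_tool(
        "create_talk",
        {"source_url": uploaded["url"], "script": script},
    )


calls = []  # records each tool call so the order can be inspected


def fake_call_tool(name: str, arguments: dict) -> dict:
    calls.append((name, arguments))
    if name == "upload_image":
        return {"url": "https://example.com/internal/face.jpg"}
    return {"id": "talk-123", "status": "created"}


job = talk_from_custom_face(fake_call_tool, "https://example.com/me.jpg", "Hello there.")
print(job)
```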
Can I check my remaining D-ID credits through the agent?
Yes. Use the get_credits tool. Your agent pulls your current balance and plan info directly from D-ID, helping you manage your video generation limits and quotas through natural conversation.
Use it with your favorite AI tools
Connect this server to Cursor, Claude, VS Code, and more.