Inworld AI MCP for AI. Build characters that genuinely sound alive.

Works with every AI agent you already use

Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and any MCP-compatible client

How this MCP server connects to your AI agent

Inworld AI connects advanced voice synthesis, character routing, and cloning capabilities directly to your agent. Generate high-fidelity speech from text, clone voices using audio samples, or build complex conversational logic with LLM routers.

It's designed for creating lifelike NPCs and sophisticated multimodal agents.

What AI agents can do with Inworld AI Automation


Generate Speech from Text

Synthesize high-quality audio streams synchronously or in real time using advanced text-to-speech models.

Create and Manage Digital Voices

Clone a voice from an existing audio sample, or create a brand new voice by providing descriptive text prompts.

Orchestrate Agent Behavior

Build complex conversation flows using LLM routers to manage how the agent processes inputs and decides its next action.

Transcribe Audio Inputs

Convert audio files into plain text, making spoken user input immediately available for your agent's processing.


What AI agents can do with Inworld AI: 19 Tools

These tools cover the entire spectrum of voice processing, from cloning and synthesis to advanced conversation routing.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.


Chat Completions

Generates chat completions by running the request through a defined LLM Router.

Clone Voice

Creates a new voice profile by analyzing and replicating an existing audio sample.

Create Realtime Call

Sets up a WebRTC connection to enable real-time, bidirectional voice communication...

Create Router

Builds and initializes a new LLM Router that manages how the agent processes...

Delete Router

Removes an existing LLM Router from your workspace.

Delete Voice

Permanently deletes a voice profile you have created or cloned.

Design Voice

Generates a unique, temporary voice preview based solely on a written text description.

Get Router

Retrieves the specific details and configuration of an LLM Router by its ID.

Get Voice

Fetches all metadata for a single voice profile using its unique identifier.

List Models

Shows you all the available Large Language Models the agent can use for processing.

List Routers

Lists every LLM Router currently set up in your workspace.

List TTS Voices

A deprecated function to list Text-to-Speech voices; use 'list_voices' instead.

List Voices

Retrieves a full catalog of all voice assets currently available in your workspace.

Publish Voice

Takes a draft or preview voice and makes it a permanent, usable asset within your...

Synthesize Speech Stream

Generates speech audio in real-time chunks for streaming playback to the user.

Synthesize Speech Sync

Creates a complete, finished speech file from text that can be played back instantly.

Transcribe Audio

Converts an uploaded audio file into plain text format in a single synchronous call.

Update Router

Modifies the logic and parameters of an existing LLM Router to change its behavior.

Update Voice

Makes changes to an existing voice profile, such as updating metadata or publication status.

Security and governance baked right in.

Pick your AI client below to get set up. Create a Vinkius account, subscribe, and you're up and running. We handle the backend infrastructure, with out-of-the-box support for Streamable HTTP, SSE, and OAuth2, so no custom routing is required.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The Inworld AI integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "inworld-ai": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the Inworld AI tools with full Vinkius guardrails applied.

VS Code Copilot

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"inworld-ai": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.
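
If you configure several clients, the entry can be generated rather than typed by hand. This is a minimal Python sketch that assumes only the URL pattern and the "inworld-ai" key shown in the steps above; the helper names endpoint_url and server_entry are hypothetical and not part of any Vinkius tooling.

```python
import json

# URL pattern taken from the setup steps above.
ENDPOINT_TEMPLATE = "https://edge.vinkius.com/{token}/mcp"


def endpoint_url(token: str) -> str:
    """Build the endpoint URL, refusing an empty or placeholder token."""
    token = token.strip()
    if not token or token == "[YOUR_TOKEN_HERE]":
        raise ValueError("replace the placeholder with your real token")
    return ENDPOINT_TEMPLATE.format(token=token)


def server_entry(token: str, url_key: str = "url") -> dict:
    """Return the config entry for one client.

    Pass url_key="serverUrl" for clients whose config uses that key instead.
    """
    return {"inworld-ai": {url_key: endpoint_url(token)}}


# Example with a made-up token; paste the printed JSON into your client's MCP config.
print(json.dumps(server_entry("demo-token"), indent=2))
```

The placeholder check is there because a config that still contains [YOUR_TOKEN_HERE] looks valid in the file and only fails once the client tries to connect.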

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

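Use MultiServerMCPClient from langchain-mcp-adapters to connect to the Vinkius managed endpoint over Streamable HTTP (set "transport" to "sse" if you prefer SSE):

import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

# Connect to Vinkius
client = MultiServerMCPClient({
    "inworld-ai": {"url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp", "transport": "streamable_http"},
})
tools = asyncio.run(client.get_tools())  # get_tools() is a coroutine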

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

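from crewai import Agent
from crewai_tools import MCPServerAdapter  # requires: pip install "crewai-tools[mcp]"

# Connect securely to Vinkius
server_params = {
    "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp",
    "transport": "streamable-http",
}

# Assign to Agent (CrewAI requires role, goal, and backstory)
with MCPServerAdapter(server_params) as vinkius_tools:
    researcher = Agent(
        role='Data Researcher',
        goal='Complete tasks using the Inworld AI tools',  # placeholder goal
        backstory='An assistant connected to Inworld AI through Vinkius.',  # placeholder
        tools=vinkius_tools
    )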

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built-in DLP, auth, and compliance on every call
Real-time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with Inworld AI, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Inworld AI. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted: Managed infra
V8 Isolated: Sandboxed per request
Zero-Trust Proxy: No stored credentials
DLP Enforced: Policy on every call
GDPR Compliant: EU data residency
Token Compression: ~60% cost reduction

Your data is protected. See how we built it.

Built on the Model Context Protocol (MCP) for Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 19 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

Getting the conversation to sound natural is hard. Solved with Vinkius AI Gateway

Today, if you build an agent that needs a unique voice for each character, you're stuck in painful cycles. You manually record lines, upload them, and then write complex conditional logic to ensure Character A speaks at the right time, with the correct tone, and never exceeds its allowed dialogue length.

With this MCP, your agent manages all that complexity internally. You simply define the character voice—whether by cloning or describing it—and let the routing system handle the rest of the choreography. What you get is a living dialogue, not a script.

The `chat_completions` tool controls conversation flow.

Without this MCP's routing tools, your agent runs on one single logic path. If the user veers off-topic—saying something about history instead of combat—the basic model struggles to pivot gracefully or know which specialized knowledge base to reference.

By implementing `chat_completions` through an LLM Router, you teach the system how to think in stages. It checks for intent first, then routes the request to a specific module, ensuring the response is always context-aware and hits the right narrative beat.

Support: 24/7, support@vinkius.com

Tags: text-to-speech, voice-cloning, ai-characters, conversational-ai, speech-synthesis

What your AI can actually do with this

This MCP lets your AI client generate dynamic characters that speak and react in real time. You can create unique digital personas by cloning existing voices from simple audio files, or you can design entirely new voices just by describing them with text prompts. Beyond voice, the system manages complex character logic using routers; this lets you build conversation paths where the agent knows exactly how to behave based on context.

Need to process an incoming voice message? You can transcribe it directly into usable text for your agents. All of this is accessible through your preferred AI client on the Vinkius Marketplace. It’s built for scenarios that require more than just simple Q&A—it handles the entire spectrum, from initial speech synthesis to complex conversational branching.

Inworld AI MCP - Voice Cloning & Character Routing

Built, hosted, and managed by Vinkius

Server ID: 019e5d27-3aed-7225-b886-3b43275d3091

Vinkius Inspector: Compliance Grade F, Score 3.11/100

What Changes When You Connect

Real-time voice communication: Use create_realtime_call to enable immediate, two-way audio conversations with NPCs, making interactions feel natural.

Flexible character logic: Instead of hardcoding rules, use tools like create_router and chat_completions to let the agent dynamically decide how to respond based on complex context.

Voice cloning capability: You can clone a voice from any audio sample using clone_voice, eliminating the need for pre-recorded dialogue files. That matters when you need to produce a large volume of dialogue.

Audio input handling: Send an audio file and let your agent process it by running the transcribe_audio tool, which turns speech into text the rest of the workflow can use.

Full asset control: Manage every part of your character's voice library—from listing voices with list_voices to publishing them via publish_voice.

See it in action

01

The RPG Quest Giver

A game developer needs a quest giver NPC who speaks with a specific, deep-voiced accent. They use the design_voice tool to create a unique voice preview based on text prompts and then save it using publish_voice. The agent’s response is managed by setting up an LLM Router via create_router, ensuring that if the player asks about lore (a specific topic), the correct character logic executes.
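
The order of operations in this scenario can be sketched in Python. The tool names (design_voice, publish_voice, create_router) come from this page; everything else is an assumption made for illustration: the argument names, the shape of the returned data, and the call_tool callable that stands in for whatever your MCP client uses to invoke a tool. Check the tool schemas your client reports before relying on any of these fields.

```python
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def build_quest_giver(call_tool: ToolCaller, description: str, preview_text: str) -> Dict[str, Any]:
    """Design a voice, publish it, then create a router for the character."""
    # 1. Generate a temporary preview from a text description (argument names assumed).
    preview = call_tool("design_voice", {"prompt": description, "preview_text": preview_text})
    # 2. Promote the preview to a permanent voice asset (response field assumed).
    voice = call_tool("publish_voice", {"voice_id": preview["voice_id"]})
    # 3. Create the router that decides how the character responds (argument assumed).
    router = call_tool("create_router", {"name": "quest-giver"})
    return {"voice_id": voice["voice_id"], "router_id": router["router_id"]}


def stub_call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a real MCP call; returns canned data for illustration only."""
    canned = {
        "design_voice": {"voice_id": "preview-1"},
        "publish_voice": {"voice_id": "voice-1"},
        "create_router": {"router_id": "router-1"},
    }
    return canned[name]


result = build_quest_giver(
    stub_call_tool,
    "Deep, gravelly voice with a northern accent",
    "Greetings, traveler.",
)
```

Passing call_tool in as a parameter keeps the workflow testable with a stub like the one above, and lets you swap in a real client call later without touching the sequence.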

02

The Live Streamer

A content creator needs to automate a video where characters interact. They send an audio clip of dialogue, which is processed by transcribe_audio and then fed into the system's chat functions. The agent responds using a cloned voice via synthesize_speech_sync, allowing them to generate hours of unique character interaction automatically.

03

The Customer Support Bot

An engineer needs an internal bot that can handle calls and transcribe speech for logging. They use the create_realtime_call tool to connect the agent via WebRTC, capturing user audio in real-time. The resulting text is then passed into a router configured with specific support protocols.

04

The Collaborative Agent Team

A team wants multiple agents to debate a topic and report their findings. They use the list_models tool to select the best LLM, set up dedicated roles using create_router, and then pass text inputs through chat_completions to simulate multi-agent deliberation.

The honest tradeoffs

Using only basic synthesis.

Anti-pattern

The agent sounds flat, robotic, and doesn't fit the character's personality. It just spits out text that reads like a Wikipedia article.

The Fix

Don't stick to simple calls; use clone_voice first to establish a specific voice identity. Then, manage conversational flow with create_router so the agent knows how to speak, not just what to say.

Ignoring audio input.

Anti-pattern

The user speaks into their microphone, but the agent only replies based on text prompts because the system never processed the spoken word.

The Fix

Always run incoming audio through transcribe_audio first. This converts speech to a format your router can read and act upon.
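
One way to enforce this is to normalize every turn to text before anything else sees it. The sketch below is a hedged illustration in Python: transcribe_audio and chat_completions are the tools named on this page, while the UserTurn type, the call_tool callable, the argument names, and the "text" response field are all assumptions rather than the documented schema.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

# Hypothetical signature: call_tool(tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


@dataclass
class UserTurn:
    """One user input: typed text, a recorded audio file, or both."""
    text: Optional[str] = None
    audio_path: Optional[str] = None


def to_text(call_tool: ToolCaller, turn: UserTurn) -> str:
    """Return the turn as text, transcribing audio first when it is present."""
    if turn.audio_path is not None:
        result = call_tool("transcribe_audio", {"audio_path": turn.audio_path})
        return result["text"]
    if turn.text:
        return turn.text
    raise ValueError("turn has neither text nor audio")


def respond(call_tool: ToolCaller, turn: UserTurn, router_id: str) -> str:
    """Transcribe if needed, then send the text through the router."""
    message = to_text(call_tool, turn)
    reply = call_tool("chat_completions", {"router_id": router_id, "message": message})
    return reply["text"]


def stub_call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a real MCP call; returns canned data for illustration only."""
    if name == "transcribe_audio":
        return {"text": "where is the blacksmith"}
    return {"text": "reply to: " + arguments["message"]}


spoken_reply = respond(stub_call_tool, UserTurn(audio_path="clip.wav"), "router-1")
typed_reply = respond(stub_call_tool, UserTurn(text="hello"), "router-1")
```

Because to_text checks audio_path before text, a spoken turn can never reach the router untranscribed, which is the failure the anti-pattern describes.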

Over-relying on one model.

Anti-pattern

When the conversation gets complex, the single LLM fails because it doesn't know which specialized sub-routine (e.g., 'Lore Check' vs. 'Combat Response') to activate.

The Fix

Build a routing layer using create_router. This structure allows you to check for specific keywords or topics and route the request to the appropriate, dedicated character logic.
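
The keyword check can be prototyped locally before you configure anything on the server. The following is plain Python, not the Inworld router format (which this page does not show). The 'Lore Check' and 'Combat Response' names come from the example above; the keywords, handler functions, and fallback are invented for illustration.

```python
import re
from typing import Callable, List, Tuple

Handler = Callable[[str], str]


def lore_check(message: str) -> str:
    return f"[lore] {message}"


def combat_response(message: str) -> str:
    return f"[combat] {message}"


def small_talk(message: str) -> str:
    return f"[default] {message}"


# Ordered routes: the first route with a matching keyword wins.
ROUTES: List[Tuple[Tuple[str, ...], Handler]] = [
    (("history", "legend", "lore"), lore_check),
    (("attack", "fight", "sword"), combat_response),
]


def route(message: str) -> Handler:
    """Pick a handler by keyword, falling back to small talk."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    for keywords, handler in ROUTES:
        if words.intersection(keywords):
            return handler
    return small_talk


handler = route("Tell me the history of this castle")
reply = handler("Tell me the history of this castle")
```

Keeping the routes in an ordered list makes precedence explicit: a message that mentions both history and fighting goes to the lore handler because that route is listed first.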

When It Fits, When It Doesn't

Use this MCP if your project requires genuine voice fidelity (cloning, streaming) and complex conversational branching. Specifically, if your agents need to handle audio input (transcribe_audio) or require dynamic role-switching between different logical pathways (create_router), this is the toolset you need.

Don't use it if you just need a simple text summary (use a basic chat endpoint) or if all your characters speak in one consistent, non-variable voice (basic TTS might suffice).

The power here is combining multimodal input with structured output routing.

Questions you might have

How do I make an agent respond using a voice that sounds like me? (using clone_voice)

You use clone_voice by providing it with an audio sample of your own speech. The tool analyzes the phonetics and generates a new, unique voice ID that you can then reference when synthesizing speech.

What is the best way to handle user audio input? (using transcribe_audio)

You send the raw audio file directly to transcribe_audio. The tool handles the conversion synchronously, delivering clean, text-based data that your agent can immediately process.

How do I manage multiple character personalities? (using create_router)

You set up a dedicated LLM Router using create_router. This router acts as the central decision point, checking user input and directing the request to the appropriate logic path or personality module.

Can I change my character's voice mid-conversation? (using update_voice)

Yes. If you need to adjust a published asset, use update_voice. This allows you to modify metadata or the underlying audio profile without having to delete and recreate the entire voice.

Before building agent logic, how do I find out which LLMs are available? (using list_models)

Just call list_models. This returns a current list of all foundational models that your agents can use. It's smart to run this first so you confirm exactly which model names will work with the router before building complex flows.

If I need real-time audio feedback, should I use `synthesize_speech_stream` or `synthesize_speech_sync`?

Use synthesize_speech_stream. This method delivers the audio data in chunks as it's generated, which keeps interactions responsive and avoids noticeable delays in your agent's speech output.

How do I clean up old or unused character logic using `delete_router`?

You use the delete_router tool, providing only the router ID. This completely removes the defined conversational path and its associated context from your workspace. It's a critical cleanup step to keep your account tidy.

When my agent needs to generate text after routing through multiple steps, how does `chat_completions` finalize the reply?

The chat_completions tool executes the final logic defined by the LLM Router. It takes the entire conversational context and generates the ultimate text response. Think of it as the final step in a complex workflow, giving you the agent's intended words.

How can I create a custom voice using only a text description?

You can use the design_voice tool. Simply provide a prompt like 'Warm, friendly male voice' and a preview text. The tool will generate voice options that you can later publish to your library.

What is the difference between synchronous and streaming speech synthesis?

Use synthesize_speech_sync to receive the full audio file once processing is complete. Use synthesize_speech_stream for real-time applications where you want to receive audio chunks as they are generated for lower latency.
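
The latency difference is easiest to see with a toy model. The sketch below simulates both styles in plain Python with a fake synthesizer. It does not call the real tools, and the function names, chunk size, and byte payload are made up for illustration.

```python
from typing import Iterator, List


def fake_synthesize_chunks(text: str, chunk_size: int = 8) -> Iterator[bytes]:
    """Pretend TTS engine: yields 'audio' bytes a few at a time."""
    data = text.encode("utf-8")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def synthesize_sync(text: str) -> bytes:
    """Sync style: the caller gets nothing until every chunk is ready."""
    return b"".join(fake_synthesize_chunks(text))


def synthesize_stream(text: str) -> Iterator[bytes]:
    """Streaming style: the caller can start playback on the first chunk."""
    yield from fake_synthesize_chunks(text)


line = "Welcome back, traveler. The road north is closed."

# Sync: one complete payload, available only at the end.
full_audio = synthesize_sync(line)

# Stream: chunks arrive one by one; a real client would hand each to the audio player.
played: List[bytes] = []
for chunk in synthesize_stream(line):
    played.append(chunk)
```

Both paths end with the same bytes. What changes is when the first bytes become available, which is why streaming suits live conversation and the sync call suits pre-rendered files.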

Can I manage multiple AI characters or models through this server?

Yes. You can use list_routers and get_router to manage your orchestration layers, and list_models to see available AI models in your Inworld workspace.
