Replicate MCP. Run ML Models, From Search to Output.

Claude
ChatGPT
Cursor
Gemini
Windsurf
VS Code
JetBrains
Vercel

Works with every AI agent you already use

…and any MCP-compatible client

Just plug in your AI agents and start using Vinkius.

Replicate MCP Server connects your AI client directly to thousands of open-source machine learning models. It lets you search for, execute, and monitor complex ML predictions (like image generation or specialized LLMs) using simple text commands—all without running the code on your local hardware.

What your AI agents can do

Run Model Predictions

Starts a new model prediction by sending the required inputs and version ID to Replicate.

Monitor Prediction Status

Retrieves the current status, output, or final result of any given prediction run.

Stop Running Processes

Immediately halts and cancels a prediction that is currently running on Replicate.

Search for Models by Use Case

Scans the public catalog to find models that match a specific search query or category.

List Available Model Groups

Retrieves curated collections of related models, like 'Image-to-Text' or 'Audio Generation'.

Get Model Metadata

Pulls the full details and required parameter schema for a specific model ID.

Supported MCP Clients

OAuth 2.0 Compatible
Claude
ChatGPT
Cursor
Gemini
VS Code
JetBrains
Vercel
Zendesk
+ other MCP clients
Included with Plan

Replicate MCP Server: 12 Tools for ML Model Management

These tools let your agent manage every stage of the ML lifecycle—from searching model catalogs to running complex video and image predictions.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using Replicate on Vinkius
cancel_prediction

Stops a model prediction that is currently running on Replicate by its unique ID.

create_prediction

Starts a new model run, requiring the model version ID and all necessary input variables as JSON.

get_account

Retrieves basic details about your authenticated Replicate account for verification.

get_collection

Fetches a specific group of models using its unique collection slug (e.g., 'text-to-image').

get_model

Retrieves all details, including the required input schema, for one specific model.

get_prediction

Checks and retrieves the current status or final output of a previously started prediction run.

list_collections

Lists all curated model collections available on Replicate, like 'Image-to-Text'.

list_deployments

Shows a list of your active, deployed models and their status within Replicate.

list_hardware

Lists the GPU hardware options currently available for running model inferences on Replicate.

list_models

Provides a list of all public models that are generally available on the Replicate platform.

list_predictions

Displays a log of your recent prediction history, including status and output links.

search_models

Searches the public model catalog using keywords to find relevant open-source algorithms.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

  • Import from OpenAPI, Swagger, or YAML specs
  • Create Agent Skills with progressive disclosure
  • Deploy to edge with MCPFusion framework
  • Built in DLP, auth, and compliance on every call
  • Real time usage dashboard and cost metering
  • Publish to catalog or keep private
Start building

Make Your AI Do More

Start with Replicate, then connect any of our 4,800+ other servers whenever your AI needs more. One click, no limits.

  • Use this MCP plus 4,800+ others, all in one place
  • Add new capabilities to your AI anytime you want
  • Every connection is secured and compliant automatically
  • Track usage and costs across all your servers
  • Works with Claude, ChatGPT, Cursor, and more
  • New servers added to the catalog every week
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Replicate API. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on every call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This server provides 12 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.

Setting up local model inference used to take a day of config files and dependency hell.

Before this server, getting a new ML task running meant downloading Python environments, installing CUDA drivers, and managing complex dependencies. It was boilerplate setup that stole time from actual development—time you should spend building features, not fighting virtual machines.

Now, your agent talks to Replicate via the MCP Server. You tell it what you need ('Generate a video of a robot dancing'). The server handles all the back-end plumbing and cloud compute resources. You just get the output.

Replicate MCP Server: Run complex ML jobs from your chat.

You no longer have to switch context between a coding IDE, a model documentation website, and a separate cloud console. You ask the agent in one window, and it orchestrates the search (`search_models`), validates parameters (`get_model`), and executes the job (`create_prediction`).

The result is pure flow state. The entire complex ML lifecycle—from discovery to execution—is condensed into a single, conversational command.

What you can do with this MCP connector

Replicate MCP Server

Connect your AI client directly to Replicate for thousands of open-source machine learning models. You don't need to set up local environments or manage GPU resources yourself; your agent handles it all on the backend. It lets you use complex ML predictions—like image generation or specialized LLMs—just by sending simple text commands.

Running and Monitoring Predictions

The core function is running model predictions. You call create_prediction when you want to start a new run; this requires you to supply the exact model version ID and all necessary input variables in JSON format. To keep tabs on what's happening, use get_prediction to check the current status or grab the final output of any prediction you started earlier.
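
As a rough sketch of what starting a run might look like on the wire: MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The argument names `version` and `input`, the placeholder version ID, and the helper `build_tool_call` are assumptions made for illustration; check the tool's actual input schema in your client before relying on them.

```python
import json
from typing import Any


def build_tool_call(request_id: int, tool: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build an MCP `tools/call` request (JSON-RPC 2.0 envelope)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical argument names and values; the real tool schema may differ.
request = build_tool_call(
    1,
    "create_prediction",
    {
        "version": "<model-version-id>",  # placeholder, not a real ID
        "input": {"prompt": "a robot dancing"},
    },
)

# The payload must serialize to JSON before it can be sent.
print(json.dumps(request, indent=2))
```

In practice your MCP client builds and sends this envelope for you; the sketch only shows where the version ID and the JSON inputs end up.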

If a process runs wild or you change your mind, you can immediately halt it using cancel_prediction, which kills a running model job by its unique ID.
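
Cancelling only makes sense while a job is still active. A minimal sketch, assuming the prediction status strings Replicate's API documents at the time of writing (`starting`, `processing`, `succeeded`, `failed`, `canceled`); the helper name `is_cancellable` is hypothetical:

```python
# Statuses in which a prediction is still active and can be cancelled.
ACTIVE_STATUSES = {"starting", "processing"}


def is_cancellable(status: str) -> bool:
    """Return True if a prediction in this status can still be cancelled."""
    return status in ACTIVE_STATUSES


print(is_cancellable("processing"))  # True
print(is_cancellable("failed"))      # False
```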

Finding and Inspecting Models

If you need to find a model for a specific task, use search_models to scan the public catalog with keywords. If you know the general category, try list_collections to see curated groups of models—for example, 'Image-to-Text' or 'Audio Generation.' You can also pull a list of every available public model using list_models.

When your client finds a promising candidate model ID, it needs its specific requirements; run get_model to retrieve all the details, including the exact input schema and parameter rules for that single model. To see what's currently running or deployed within your organization, you can check out list_deployments, which shows your active models and their status.
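
To illustrate why the schema lookup matters, here is a minimal sketch of checking inputs against a schema before starting a run. The schema shape (a `properties` map plus a `required` list, as in JSON Schema) and the example field names are assumptions; the real get_model response may nest or name these differently.

```python
from typing import Any


def check_inputs(schema: dict[str, Any], inputs: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    properties = schema.get("properties", {})
    # Every required key must be present.
    for key in schema.get("required", []):
        if key not in inputs:
            problems.append(f"missing required input: {key}")
    # Every supplied key must be known to the schema.
    for key in inputs:
        if key not in properties:
            problems.append(f"unknown input: {key}")
    return problems


# Hypothetical schema, as it might appear in a get_model result.
schema = {
    "properties": {
        "prompt": {"type": "string"},
        "num_inference_steps": {"type": "integer"},
    },
    "required": ["prompt"],
}

print(check_inputs(schema, {"prompt": "a robot dancing"}))  # []
print(check_inputs(schema, {"steps": 30}))
# ['missing required input: prompt', 'unknown input: steps']
```

The check only covers key names. Type and range validation would need a full JSON Schema validator.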

Advanced Discovery and System Checks

To drill into a category, call get_collection with a specific collection slug to fetch all the models in that group. list_hardware shows which GPU hardware options are available for running inference on Replicate. To keep track of past work, list_predictions returns a log of your recent predictions, including status and links to outputs.
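
A prediction log is easier to act on once it is summarized. The sketch below assumes that list_predictions yields one record per prediction with `id` and `status` fields; the record shape and the sample data are hypothetical.

```python
from collections import Counter

# Hypothetical list_predictions result: one dict per prediction.
history = [
    {"id": "p1", "status": "succeeded"},
    {"id": "p2", "status": "failed"},
    {"id": "p3", "status": "succeeded"},
    {"id": "p4", "status": "processing"},
]

# Tally predictions by status and pick out the ones still running.
counts = Counter(item["status"] for item in history)
still_running = [
    item["id"] for item in history if item["status"] in {"starting", "processing"}
]

print(dict(counts))   # {'succeeded': 2, 'failed': 1, 'processing': 1}
print(still_running)  # ['p4']
```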

If you need basic verification of your access, run get_account; this tool retrieves essential details about your authenticated Replicate account.

Putting It All Together

Your AI client can chain these tools into a complete workflow. It starts by running search_models for 'text-to-image,' then calls get_model on the best result to confirm the required JSON structure, and finally executes create_prediction. To confirm that access is working before you start a run, call get_account; to survey what is available, check your active deployments with list_deployments or browse categories with list_collections.
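
The search, inspect, and run sequence can be sketched as a single function. Everything about the wiring here is an assumption: `call_tool` stands in for whatever your MCP client exposes, and the argument names (`query`, `model`, `version`, `input`) and response fields are hypothetical.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], Any]


def run_text_to_image(call_tool: ToolCaller, prompt: str) -> Any:
    """Search for a model, inspect it, then start a prediction."""
    models = call_tool("search_models", {"query": "text-to-image"})
    best = models[0]  # naive choice: take the first hit
    details = call_tool("get_model", {"model": best["id"]})
    return call_tool(
        "create_prediction",
        {"version": details["version"], "input": {"prompt": prompt}},
    )


# Stand-in for a real MCP client, so the sketch runs offline.
def fake_call_tool(name: str, arguments: dict[str, Any]) -> Any:
    canned = {
        "search_models": [{"id": "example/model"}],
        "get_model": {"version": "<model-version-id>"},
        "create_prediction": {"id": "pred-1", "status": "starting"},
    }
    return canned[name]


print(run_text_to_image(fake_call_tool, "a robot dancing"))
# {'id': 'pred-1', 'status': 'starting'}
```

When you use an agent, it performs this orchestration for you from a single request; the function only makes the order of calls explicit.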

This server gives your agent direct control over a massive library of open-source algorithms without needing any local setup. It's all about sending the right commands to get results.

Built · Hosted · Managed by Vinkius
Replicate MCP Server - Run Open-Source ML Models
Server ID 019d75fe-9426-7272-9964-c32556c42621
Vinkius Inspector
Compliance Grade A+
Score 100/100

Common Questions About Replicate MCP

How do I find out what models are available using search_models?

You simply ask your agent to 'Search for image generation models.' The server runs search_models and returns a list of potential model IDs you can use later.

Can I check the status of a running job using get_prediction?

Yes. If you have a prediction's ID, calling get_prediction returns its current status (on Replicate: starting, processing, succeeded, failed, or canceled), along with the output if it succeeded.

What is the difference between list_models and search_models?

list_models shows a general roster of all public models. search_models lets you filter that roster by specific keywords or use cases, which is usually more direct.

If my prediction fails, how do I cancel_prediction?

A prediction that has already failed has stopped, so there is nothing left to cancel. cancel_prediction is for jobs that are still running: give the agent the prediction's unique ID, and it calls cancel_prediction on that ID to halt the run.

Before running a model, how do I verify my API credentials using the `get_account` tool?

The get_account tool retrieves basic details about your authenticated Replicate account. A successful response confirms that your AI client is connected with valid credentials, and to the account you expect, before you start running expensive predictions.

When using `create_prediction`, what format must the input variables be in?

You must supply model parameters as a strict JSON object. The system requires key-value pairs that exactly match the schema defined by the specific model version ID you are calling.
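
A small sketch of what "a strict JSON object" means in practice: the payload must parse as JSON, and its top level must be an object rather than a list or a bare value. The helper name `parse_input_payload` and the sample keys are illustrative only.

```python
import json
from typing import Any


def parse_input_payload(raw: str) -> dict[str, Any]:
    """Parse a JSON string and insist that the top level is an object."""
    value = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(value, dict):
        raise TypeError("model input must be a JSON object of key-value pairs")
    return value


print(parse_input_payload('{"prompt": "a robot dancing", "num_inference_steps": 30}'))
# {'prompt': 'a robot dancing', 'num_inference_steps': 30}
```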

How does `list_collections` differ from simply listing all public models using `list_models`?

list_collections returns curated groups of related models (e.g., 'Audio Generation'). This helps you browse by a specific domain or use case, rather than sifting through every single model available.

If I'm planning for high-volume processing, how can I check the available GPU resources using `list_hardware`?

list_hardware lists the hardware options currently available for running models on Replicate. Use it to see which compute options you can choose from before running a prediction; it lists the options themselves and should not be read as a live capacity report.

Can the agent pass a JSON payload directly into a Replicate model?

Yes. Use the create_prediction tool and fill the payload parameter with the inputs the model's schema requires (e.g., prompt, num_inference_steps). Because input schemas differ from model to model and can change between versions, ask your assistant to fetch the schema first via get_model and verify the keys.

Does the prediction command return results instantly?

No. Replicate's API runs predictions asynchronously. The initial call returns a prediction ID; your assistant then polls get_prediction with that ID until the prediction reports a completed status, at which point the response includes the output URLs or generated text.
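
The polling loop the agent performs can be sketched roughly as follows. The `get_prediction` callable is a stand-in for the real tool call, the response fields (`status`, `output`) are assumptions, and the terminal status strings follow Replicate's documented values.

```python
import time
from typing import Any, Callable

# Statuses after which a prediction will not change again.
TERMINAL = {"succeeded", "failed", "canceled"}


def wait_for_prediction(
    get_prediction: Callable[[str], dict[str, Any]],
    prediction_id: str,
    interval: float = 1.0,
    max_polls: int = 60,
) -> dict[str, Any]:
    """Poll until the prediction reaches a terminal status or polls run out."""
    for _ in range(max_polls):
        prediction = get_prediction(prediction_id)
        if prediction["status"] in TERMINAL:
            return prediction
        time.sleep(interval)
    raise TimeoutError(f"prediction {prediction_id} did not finish in time")


# Stand-in that finishes on the third poll, so the sketch runs offline.
_responses = iter([
    {"status": "starting"},
    {"status": "processing"},
    {"status": "succeeded", "output": ["https://example.com/out.png"]},
])

result = wait_for_prediction(lambda _id: next(_responses), "pred-1", interval=0.0)
print(result["status"])  # succeeded
```

Capping the number of polls keeps a stuck job from blocking the agent indefinitely; a longer interval or a backoff schedule may suit slow models better.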

Can the AI browse trending or curated model collections?

Yes. Use the list_collections tool to browse curated groups of models organized by category — such as image generation, text-to-speech, or video. Each collection includes a slug and description so you can quickly identify the right set of models for your use case.

Built & Managed by Vinkius · 30s setup · 12 tools

We've already built the connector for Replicate. Just plug in your AI agents and start using Vinkius.

No hosting. No infrastructure. No complex setup.
All 12 tools are live and waiting. You're up and running in seconds.

Vinkius gives your AI agents access to the full catalog of app connectors, all fully managed, secure, and enterprise-ready. One subscription, every tool you need.

  • Zero hosting required
  • Full MCP catalog included
  • Enterprise-grade security
  • Auto-updated by Vinkius

Built, hosted, and secured by Vinkius. You just connect and go.