IBM watsonx MCP for AI. Control AI Model Operations and Tuning.

Q: How do I know what models are available using listmodels?

listmodels returns all foundation model IDs and their capabilities; this tells your agent exactly which versions it can run against.

Q: Is tuning a model difficult? Can I check the status using gettuningstatus?

No; you initiate the job with startmodeltuning, and then your agent monitors its progress by calling gettuningstatus. This keeps the whole process visible.

Q: Can I save my prompts using createprompt?

Yes. Calling createprompt saves a new template into the watsonx project, so you don't have to rewrite the exact prompt structure every single time.

Claude

ChatGPT

Cursor

Gemini

Windsurf

VS Code

JetBrains

Vercel

See Vinkius in Action

Works with every AI agent you already use

…and any MCP-compatible client

Connect to your AI in seconds.

IBM watsonx provides a connection to an enterprise-grade suite of AI models for running complex data operations. Use this MCP to generate text, create vector embeddings for semantic search, manage model lifecycle details, and conduct advanced prompt tuning jobs directly from your agent.

What your AI can do

Create prompt

Allows your agent to save and organize a new prompt template within watsonx for later use.

Generate chat

Generates chat completions, making it ideal for building multi-turn conversations with the AI model.

Generate embeddings

Creates numerical vector embeddings from input text, which is necessary for semantic search and clustering tasks.

+ 7 more capabilities included

Run Multi-Turn Conversations

Execute complex, ongoing chat applications by generating completions using a watsonx chat model.

Prepare Data for Search

Generate vector embeddings from text inputs. This process is necessary for semantic analysis and finding related data points in large knowledge bases.

Execute Text Generation Tasks

Create single-turn content, such as summarizing documents or writing initial drafts, using a watsonx foundation model.

Manage Model Definitions

List available foundation models, checking their IDs, capabilities, and current lifecycle status to select the right resource for a job.

Initiate Prompt Tuning

Start model tuning jobs using training data from cloud storage, refining a foundation model's behavior on specific tasks.

Ask an AI about this

IBM watsonx: 10 Available Operations

Use these ten tools to programmatically interact with IBM's AI ecosystem. You can list available resources, generate content, or run complex model tuning jobs.

Make your AI actually useful.

Add this MCP to Claude, Cursor, or Windsurf and your AI stops guessing. It gets real tools to look things up, take action, and handle the stuff you keep doing by hand.

Start using IBM watsonx on Vinkius

Create Prompt

Allows your agent to save and organize a new prompt template within watsonx for later use.

Generate Chat

Generates chat completions, making it ideal for building multi-turn conversations...

Generate Embeddings

Creates numerical vector embeddings from input text, which is necessary for semantic...

Generate Text

Generates standard text content for single-turn jobs like summarization or drafting...

Get Model Details

Retrieves specific technical specifications and metadata for a foundation model you...

Get Tuning Status

Checks the current progress or status of an ongoing prompt tuning job.

List Models

Queries and provides a list of all available foundation models in your watsonx environment, including their IDs and capabilities.

List Projects

Lists the different project containers you have set up within your watsonx account.

List Prompts

Retrieves a list of all saved prompts associated with a specific watsonx project for...

Start Model Tuning

Initiates the process of fine-tuning a foundation model by pointing it to an...

Security and governance baked right in.

Pick your AI client below to get set up. Just create a Vinkius account, subscribe, and you're instantly up and running. We handle the entire backend infrastructure, delivering out-of-the-box support for HTTPS Streamable, SSE, and OAuth2—zero messy routing required.

Claude AI

Open Claude Settings

Go to claude.ai, click your profile icon, then navigate to Customize → Connectors.

Add Custom Connector

Click the "+" button and select Add custom connector. Paste your Vinkius endpoint URL:

https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp

Replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com. For OAuth-protected servers, expand Advanced settings to add credentials.

Start a conversation

Open a new chat. The IBM watsonx integration is available immediately — no restart needed.

Antigravity

Configure Agent Environment

Open your Antigravity agent's workspace configuration or mcp-servers.json file.

Bind the Endpoint

Add the Vinkius endpoint URL to your agent's MCP connections list:

"mcp_servers": {
  "ibm-watsonx": {
    "serverUrl": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
  }
}

Provide your secure token in place of [YOUR_TOKEN_HERE] to ensure your agent requests are authenticated.

Execute

Start your Antigravity session. The agent will autonomously discover and utilize the IBM watsonx tools with full Vinkius guardrails applied.

VS Code Copilot

⚡

One-Click Install (Recommended)

In your Vinkius Dashboard, simply click the Add to VS Code button for this server. We'll automatically configure your local workspace.

Or configure manually

Open MCP Settings

Open VS Code, press Ctrl/Cmd + Shift + P, and search for GitHub Copilot: MCP Servers.

Add Server Config

Add the Vinkius endpoint configuration to your mcp-servers.json file:

"ibm-watsonx": {
  "url": "https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp"
}

Ensure you replace [YOUR_TOKEN_HERE] with your token from cloud.vinkius.com.

LangChain

Install Dependencies

Install the LangChain MCP adapters for your environment:

pip install langchain-mcp-adapters

Connect the Server

Use the SSEClient in LangChain to connect to the Vinkius managed endpoint:

from langchain_mcp_adapters.client import SSEClient

# Connect to Vinkius
client = SSEClient(url="https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp")
tools = client.get_tools()

CrewAI

Define the Tool

Load the Vinkius MCP tools into your CrewAI agents:

from crewai import Agent
from mcp_crewai import MCPTool

# Connect securely to Vinkius
vinkius_tools = MCPTool(url="https://edge.vinkius.com/[YOUR_TOKEN_HERE]/mcp")

# Assign to Agent
researcher = Agent(
    role='Data Researcher',
    tools=vinkius_tools.get_all()
)

Execute Task

Run your CrewAI process. The agent will autonomously route tasks to the Vinkius managed server.

Choose How to Get Started

Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.

Build Your Own

Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.

Import from OpenAPI, Swagger, or YAML specs
Create Agent Skills with progressive disclosure
Deploy to edge with MCPFusion framework
Built in DLP, auth, and compliance on every call
Real time usage dashboard and cost metering
Publish to catalog or keep private

Start building

Make Your AI Do More

Start with IBM watsonx, then connect any of our 5,100+ other servers whenever your AI needs more. One click, no limits.

Use this MCP plus 5,100+ others, all in one place
Add new capabilities to your AI anytime you want
Every connection is secured and compliant automatically
Track usage and costs across all your servers
Works with Claude, ChatGPT, Cursor, and more
New servers added to the catalog every week

Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by IBM watsonx. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.

VINKIUS INFRASTRUCTURE

Cloud Hosted

Managed infra

V8 Isolated

Sandboxed per request

Zero-Trust Proxy

No stored credentials

DLP Enforced

Policy on every call

GDPR Compliant

EU data residency

Token Compression

~60% cost reduction

Your data is protected. See how we built it.

Works with Claude, ChatGPT, Cursor, and more

The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.

This connection provides 10 powerful capabilities that interface natively with Claude, ChatGPT, Cursor, and other compatible AI platforms. No middleware. No custom integration required.

The problem today is manually tracking which model to use.

Right now, if your team needs a summary of documents or wants to run a chat session, they usually have to switch between different portals and API documentation pages. They're constantly checking if the foundation model supports vector inputs; are they using Model A for generation but Model B for embeddings? It’s tedious copy-pasting and manual verification.

With this MCP connection, that guesswork disappears. Your agent handles all of it. You can reliably list every available resource by calling `list_models`, giving you full visibility into the system's capacity before you even write the first line of code.

The `generate_embeddings` tool makes data searchable.

Before, semantic search was a huge pain. You had to manually chunk documents and use separate tools for indexing, which meant copy-pasting the text into one system and then retrieving it in another. The process wasn't connected; you were doing half the work yourself.

Now, generating embeddings is just one step: call `generate_embeddings`. You get the vector output directly from the MCP, allowing your agent to plug those numbers straight into a database for instant, accurate semantic retrieval.

Support 24/7 support@vinkius.com ↗

Security Vinkius Trust Center ↗

SLA Service Level Agreement ↗

Report Listing Send Report ↗

What your AI can actually do with this

You need more than just a simple chat interface; you're dealing with production-level AI work. This connection lets your agent interact with the full power of IBM watsonx, handling everything from basic text generation to deep model management. You can manage prompts, list available foundation models, and get detailed specs for any particular model version.

It’s built for engineers who need control over their data pipeline; you can generate vector embeddings for similarity searches or run multi-turn chat completions that require state tracking. When working with these complex systems, Vinkius provides the centralized platform, letting you connect your preferred AI client to this entire catalog of operations.

It means your agent doesn't just talk to an API; it manages the model itself—it can initiate tuning jobs or check the status of existing ones. It’s about making sure the output isn't just generated, but that it meets specific structural requirements.

Built · Hosted · Managed by Vinkius IBM watsonx MCP - Model Generation and Tuning Tools

Server ID 019d75b7-4300-72c7-865d-c5f3402cbd20

Vinkius Inspector

Compliance Grade A+

Score 95.83/100

Report View Report ↗

Here's how it actually works

The bottom line is: you get direct programmatic access to the full spectrum of watsonx's operational tools, making model interaction predictable and repeatable.

Tell your agent which foundation models you need to interact with by calling list_models to see available IDs and capabilities.

For content creation, specify the text input and desired output using generate_text, or for conversational flow, use generate_chat.

When data needs searching against a corpus, first generate vector embeddings via generate_embeddings, then feed those vectors into your application logic.

Who is this actually for?

Data Engineers and Machine Learning Scientists. If your job involves building production-grade AI applications that require careful monitoring of model performance or handling complex data pipelines, this MCP is for you. It addresses the pain point of manually managing prompts, tuning models, and ensuring stable output schemas across different environments.

ML Engineer

They build agents that call generate_embeddings to index data, then use those embeddings in retrieval-augmented generation (RAG) pipelines.

Data Scientist

They run list_models first to choose the appropriate model for a task; they then initiate tuning jobs using start_model_tuning to customize performance.

AI Architect

They need reliable ways to check system health, so they monitor the status of existing tuning jobs by calling get_tuning_status.

What Changes When You Connect

You eliminate guesswork about available models. By using list_models, your agent gets a definitive list of foundation model IDs, ensuring you always select the correct resource for the job.

Complex interactions no longer fail on state. The ability to use generate_chat handles multi-turn conversational contexts automatically, maintaining dialogue history across multiple calls.

Search becomes semantic, not keyword-based. Calling generate_embeddings transforms simple text into vectors, enabling true similarity search that finds contextually related documents.

Tuning is manageable, not a black box. You can initiate advanced training using start_model_tuning and then track progress via get_tuning_status, keeping your model performance predictable.

Model selection is streamlined. Instead of guessing which API endpoint to use, you first check the specs with get_model_details to guarantee the model meets your required output schema.

See it in action

01 01

Building a Custom Q&A Bot

An agent needs to build an internal knowledge bot. First, it runs generate_embeddings on all corporate PDFs; this creates the vector index. Then, when a user asks a question, the agent uses those embeddings to find relevant source chunks and passes them into generate_chat for a grounded answer.

02 02

Automating Content Pipelines

A marketing team needs weekly blog summaries. The agent calls list_prompts to retrieve the standard summary template, then uses generate_text with the raw article content to produce a polished draft.

03 03

Model Performance Validation

Before deployment, an ML engineer needs to confirm if a model can handle structured data. They call get_model_details to validate the capabilities and then use list_models to check which version is stable enough for testing.

04 04

Fine-Tuning on Proprietary Data

A financial services firm has specialized terminology. They must call start_model_tuning, pointing it to a secure cloud bucket of historical reports, and then monitor the progress using get_tuning_status until the model is ready.

The honest tradeoffs

Relying on single API calls

Anti-pattern

Assuming that a basic text generation call will be sufficient for complex, multi-step reasoning tasks.

The Fix

If the task requires conversational memory or structured output, you must use generate_chat or first check model capability using get_model_details. Don't rely on single calls for stateful logic.

Skipping data preparation

Anti-pattern

Trying to perform a similarity search by just passing raw text strings into the AI endpoint.

The Fix

You must first convert your source documents into numerical space using generate_embeddings. The vector output is what drives true semantic comparison.

Overlooking model versions

Anti-pattern

Attempting to run a job with an outdated or unsupported model ID, leading to runtime errors.

The Fix

Always start by running list_models and cross-referencing the required capabilities. This ensures you're targeting a known good state.

When It Fits, When It Doesn't

Use this MCP if your workflow requires rigorous control over model behavior, including tuning and explicit resource management. You need to know why a model failed or what its exact specs are; that’s where the value is. Don't use this if you simply want casual brainstorming—for that, a simple chat client works fine. But if your application must scale past basic text generation, remember you can list projects and models via list_projects and list_models; this gives you the governance layer required for enterprise reliability.

Questions you might have

How do I know what models are available using `list_models`? +

list_models returns all foundation model IDs and their capabilities; this tells your agent exactly which versions it can run against.

What is the difference between `generate_text` and `generate_chat`? +

'Generate text' handles single, standalone tasks like summarization. 'Generate chat' manages conversation history, making it suitable for multi-turn dialogue where context matters.

Is tuning a model difficult? Can I check the status using `get_tuning_status`? +

No; you initiate the job with start_model_tuning, and then your agent monitors its progress by calling get_tuning_status. This keeps the whole process visible.

Can I save my prompts using `create_prompt`? +

Yes. Calling create_prompt saves a new template into the watsonx project, so you don't have to rewrite the exact prompt structure every single time.

How do I use `generate_embeddings` for similarity search or clustering? +

It creates vector embeddings from your input text. You take these vectors and run them against a database to find texts that are semantically similar, even if the words aren't identical.

What information can I get about a specific model using `get_model_details`? +

This tool provides detailed specifications for any foundation model. You check it to confirm things like its supported capabilities, required inputs, and optimal use cases before writing code.

What is the purpose of running `list_projects`? +

It displays all the distinct watsonx projects within your account. You run this command first to confirm the correct operational scope for any model management or data task you intend to perform.

What prerequisites are needed when calling `start_model_tuning`? +

You must provide a cloud storage URL pointing directly to your training data. The tuning job cannot begin until the foundation model can access and read the content at that specific link.

Connect to your AI in seconds.

Create prompt

Generate chat

Generate embeddings

IBM watsonx: 10 Available Operations

Make your AI actually useful.

Create Prompt

Generate Chat

Generate Embeddings

Generate Text

Get Model Details

Get Tuning Status

List Models

List Projects

List Prompts

Start Model Tuning

Security and governance baked right in.

Claude AI

Open Claude Settings

Add Custom Connector

Start a conversation

Claude Code

Open your terminal

Add the MCP Server

Start coding

Cursor

One-Click Install (Recommended)

Open Cursor Settings

Add New Server

Use in Composer

Antigravity

Configure Agent Environment

Bind the Endpoint

Execute

VS Code Copilot

One-Click Install (Recommended)

Open MCP Settings

Add Server Config

Windsurf

One-Click Install (Recommended)

Open Windsurf Settings

Add Server Endpoint

LangChain

Install Dependencies

Connect the Server

CrewAI

Define the Tool

Execute Task

Choose How to Get Started

Build Your Own

Make Your AI Do More

Works with Claude, ChatGPT, Cursor, and more

The problem today is manually tracking which model to use.

The `generate_embeddings` tool makes data searchable.

What your AI can actually do with this

Here's how it actually works

Who is this actually for?

What Changes When You Connect

See it in action

Building a Custom Q&A Bot

Automating Content Pipelines

Model Performance Validation

Fine-Tuning on Proprietary Data

The honest tradeoffs

Relying on single API calls

Skipping data preparation

Overlooking model versions

When It Fits, When It Doesn't

Questions you might have