Dataiku DSS MCP. Control data pipelines and models via natural chat.
Works with every AI agent you already use, and any MCP-compatible client
Just plug in your AI agents and start using Vinkius.
This server connects your Dataiku DSS instance directly to your AI client. You get full control over enterprise data science workflows (listing projects, checking dataset schemas, monitoring model performance, and running pipeline jobs), all through natural conversation.
It's the API layer for your entire data science stack.
What your AI agents can do
You pass a dataset name and get a list of its columns and data types.
You check the current state, timing, and outputs of specific data pipeline jobs.
You retrieve the saved ML model's details, including the algorithm used and its performance metrics.
You get a list of accessible DSS projects and all connected data sources (APIs, databases).
You retrieve the exact configuration and settings for a specific data recipe (Python, SQL, or Visual).
You trigger specific automation scenarios, like rebuilding a pipeline or retraining a model.
- dataset_schema: Gets the column names and data types for a specific dataset.
- get_job: Retrieves the status, timing, and outputs of a specific pipeline job.
- get_model: Gets the metadata, algorithm, and performance scores for a saved model.
- get_project: Retrieves metadata, settings, and tags for a specific project.
- get_recipe: Retrieves the configuration and settings for a data transformation recipe.
- list_connections: Lists all data sources connected to DSS, including databases and cloud storage.
- list_datasets: Lists every dataset available within a specific project.
- list_jobs: Lists all pipeline jobs within a project (builds or training runs).
- list_models: Lists all saved and deployed ML models in a project.
- list_plugins: Lists all installed extensions and plugins for the DSS platform.
- list_projects: Lists every DSS project accessible by your API key.
- list_recipes: Lists all data transformation recipes in a project.
- list_scenarios: Lists the defined automation scenarios in a project.
- run_scenario: Triggers a defined automation scenario, such as rebuilding a pipeline or retraining a model.
Choose How to Get Started
Build a custom MCP for your own tools, or connect a ready-made integration from our catalog.
Build Your Own
Turn any API into an MCP. Import a spec, define Agent Skills, or deploy with MCPFusion.
- Import from OpenAPI, Swagger, or YAML specs
- Create Agent Skills with progressive disclosure
- Deploy to edge with MCPFusion framework
- Built-in DLP, auth, and compliance on every call
- Real-time usage dashboard and cost metering
- Publish to catalog or keep private
Make Your AI Do More
Start with Dataiku DSS, then connect any of our 4,700+ other servers whenever your AI needs more. One click, no limits.
- Use this MCP plus 4,700+ others, all in one place
- Add new capabilities to your AI anytime you want
- Every connection is secured and compliant automatically
- Track usage and costs across all your servers
- Works with Claude, ChatGPT, Cursor, and more
- New servers added to the catalog every week
What you can do with this MCP connector
Connecting your Dataiku DSS instance to your AI client lets you run your data science workflow through conversation. You can list every project you have access to and check all the data sources connected to DSS, including databases and cloud storage.
Need to scope out a dataset? Pass a dataset name, and the agent returns its columns and data types. You can also list every dataset within a specific project. To inspect the setup, list all the data transformation recipes in a project, or all the extensions and plugins installed on the DSS platform.
For execution, list all the pipeline jobs in a project with list_jobs, then get the status, timing, and outputs of any one of them with get_job. List all the saved and deployed ML models in a project with list_models, then pull the metadata, algorithm, and performance scores for a specific model with get_model.
List the automation scenarios defined in a project with list_scenarios, and trigger one with run_scenario.
For deep dives, get_recipe returns the exact configuration and settings for a specific data recipe.
To survey the whole environment, list every DSS project with list_projects and check any dataset's schema with dataset_schema.
How Dataiku DSS MCP Works
1. First, subscribe to the Dataiku DSS server and provide your Dataiku instance URL and API key (personal, project, or global).
2. Next, you ask your AI client to perform an action, like listing projects or checking a schema.
3. The server runs the specific tool, returns the structured data, and your AI client interprets it for you.
The bottom line is, you manage your entire data science stack using natural conversation, without ever leaving your AI agent.
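To make the last step of that flow concrete, here is a minimal sketch of the message an MCP client sends when it invokes a tool. MCP tool calls are JSON-RPC 2.0 requests with the method `tools/call`; the argument names `project_key` and `dataset_name` and the project key `SALES` are assumptions for illustration, not this server's documented input schema.

```python
import json
from itertools import count
from typing import Any, Dict

# Monotonic request ids; JSON-RPC only needs each id to be unique per session.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# ASSUMPTION: "project_key" and "dataset_name" are illustrative argument names,
# and "SALES" is a made-up project key.
message = build_tool_call(
    "dataset_schema", {"project_key": "SALES", "dataset_name": "raw_logs"}
)
print(message)
```

In practice your AI client builds and sends this message for you; the sketch only shows what "runs the specific tool" amounts to on the wire.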
Who Is Dataiku DSS MCP For?
The data scientist who needs to check a schema or run a model without opening the Dataiku UI. The data engineer tracking pipeline failures or verifying recipe logic. The MLOps team that needs to trigger a model retraining run on demand. Or the analytics manager who needs a quick inventory of all connected data sources.
Data scientist: checks dataset schemas and monitors model training runs to stay in the research flow.
Data engineer: tracks pipeline jobs and verifies recipe configurations using natural language queries.
MLOps team: triggers automation scenarios and monitors deployed models in real time for production issues.
Analytics manager: audits project metadata and verifies data connections across the whole organization.
What Changes When You Connect
- See model performance metrics instantly. Instead of navigating to the model tab, you just ask for it using `get_model`. You get the algorithm and the current performance scores right in your chat window.
- Verify data lineage logic. Need to know if a recipe changed? Run `get_recipe` to pull the exact configuration structure for Python, SQL, or Visual recipes. No more guessing about data flow.
- Manage job runs without leaving your agent. Use `list_jobs` to see if the build tasks finished, or `get_job` to check the exact timing and output of a specific pipeline run.
- Audit your environment quickly. Run `list_connections` to get an immediate inventory of all connected databases and APIs. It's your single source of truth for data sources.
- Automate complex actions. Don't manually rebuild pipelines. Use `list_scenarios` to find the right automation and then `run_scenario` to trigger the build or retraining instantly.
- Explore the entire data landscape. Use `list_projects` and `list_datasets` together to map out every project and dataset in your DSS environment.
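The last item, combining `list_projects` and `list_datasets`, might look like the sketch below when scripted. The `call_tool` parameter stands in for whatever MCP client you use, and the field names `project_key` and `name` are assumed result shapes, not the server's documented output; the fake caller returns made-up data so the example runs on its own.

```python
from typing import Any, Callable, Dict, List

ToolCaller = Callable[[str, Dict[str, Any]], Any]


def map_datasets(call_tool: ToolCaller) -> Dict[str, List[str]]:
    """Build a {project key: sorted dataset names} inventory."""
    inventory: Dict[str, List[str]] = {}
    for project in call_tool("list_projects", {}):
        key = project["project_key"]  # ASSUMPTION: field name is illustrative
        datasets = call_tool("list_datasets", {"project_key": key})
        inventory[key] = sorted(dataset["name"] for dataset in datasets)
    return inventory


# Made-up data standing in for a real DSS instance.
_FAKE_DATASETS = {"SALES": ["orders", "customers"], "FRAUD": ["raw_logs"]}


def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Any:
    """Stand-in for a real MCP client; returns canned results."""
    if name == "list_projects":
        return [{"project_key": key} for key in _FAKE_DATASETS]
    if name == "list_datasets":
        return [{"name": n} for n in _FAKE_DATASETS[arguments["project_key"]]]
    raise ValueError(f"unexpected tool: {name}")


print(map_datasets(fake_call_tool))
```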
Real-World Use Cases
Checking a Dataset's Structure
A data scientist needs to confirm that the 'user_email' field is still a string type before starting a new model. Instead of opening the dataset in Dataiku and clicking through tabs, they ask their agent: "What is the schema for the 'raw_logs' dataset?" The agent calls dataset_schema and returns the exact column types, letting the scientist validate the data structure instantly.
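If you wanted to automate that check rather than read the answer by eye, it could be sketched as follows. The list-of-dicts shape with `name` and `type` keys is an assumed representation of a `dataset_schema` result, and the sample schema is invented for illustration.

```python
from typing import Dict, List, Optional


def column_type(schema: List[Dict[str, str]], column: str) -> Optional[str]:
    """Return the declared type of `column`, or None if the column is absent."""
    for entry in schema:
        if entry["name"] == column:
            return entry["type"]
    return None


# ASSUMPTION: a hypothetical dataset_schema result for 'raw_logs'.
raw_logs_schema = [
    {"name": "user_email", "type": "string"},
    {"name": "event_time", "type": "date"},
]

if column_type(raw_logs_schema, "user_email") != "string":
    raise ValueError("user_email is no longer a string column")
print("user_email is still a string")
```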
Troubleshooting a Failed Pipeline Run
The MLOps engineer sees a job failure. Instead of logging into the dashboard and clicking through status pages, they tell their agent: "Check the job status for the 'Fraud-Detection-Live' pipeline." Because get_job needs a specific job ID, the agent first uses list_jobs to find the failed run, then runs get_job on it, providing the status, timing, and failure points in a single response so the engineer can fix it faster.
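A rough sketch of that lookup, assuming `get_job` needs a job ID that has to come from `list_jobs` first. The field names `state`, `start_time`, and `id`, the `FAILED` value, and the argument names are all hypothetical; `fake_call_tool` returns made-up data in place of a real MCP client.

```python
from typing import Any, Callable, Dict, Optional

ToolCaller = Callable[[str, Dict[str, Any]], Any]


def latest_failed_job(call_tool: ToolCaller, project_key: str) -> Optional[Dict[str, Any]]:
    """Find the most recent failed job via list_jobs, then fetch it with get_job."""
    jobs = call_tool("list_jobs", {"project_key": project_key})
    failed = [job for job in jobs if job["state"] == "FAILED"]
    if not failed:
        return None  # nothing to troubleshoot
    newest = max(failed, key=lambda job: job["start_time"])
    return call_tool("get_job", {"project_key": project_key, "job_id": newest["id"]})


def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Any:
    """Stand-in for a real MCP client; returns canned, made-up data."""
    if name == "list_jobs":
        return [
            {"id": "job_1", "state": "DONE", "start_time": 100},
            {"id": "job_2", "state": "FAILED", "start_time": 200},
            {"id": "job_3", "state": "FAILED", "start_time": 150},
        ]
    if name == "get_job":
        return {"id": arguments["job_id"], "state": "FAILED", "duration_s": 42}
    raise ValueError(f"unexpected tool: {name}")


print(latest_failed_job(fake_call_tool, "FRAUD"))
```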
Retraining a Model on Demand
The data team decides to retrain the 'Sales-Forecasting' model with new data. Instead of manually following a deployment checklist, they ask the agent to 'Retrain the sales model now.' The agent uses list_scenarios to find the right automation, then calls run_scenario, triggering the build and retraining process immediately.
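The two-step pattern here (look up the scenario, then trigger it) can be sketched as below. The field names, argument names, and the `RETRAIN_SALES` scenario are assumptions for illustration. The sketch refuses to run anything unless exactly one scenario matches, since triggering the wrong automation is worse than asking again.

```python
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Any]


def run_matching_scenario(call_tool: ToolCaller, project_key: str, keyword: str) -> Any:
    """Trigger the single scenario whose name contains `keyword`."""
    scenarios = call_tool("list_scenarios", {"project_key": project_key})
    matches = [s for s in scenarios if keyword.lower() in s["name"].lower()]
    if len(matches) != 1:
        # Ambiguous or missing: do not guess which automation to run.
        raise LookupError(
            f"expected exactly one scenario matching {keyword!r}, found {len(matches)}"
        )
    return call_tool(
        "run_scenario",
        {"project_key": project_key, "scenario_id": matches[0]["id"]},
    )


def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Any:
    """Stand-in for a real MCP client; returns canned, made-up data."""
    if name == "list_scenarios":
        return [
            {"id": "RETRAIN_SALES", "name": "Retrain sales forecast"},
            {"id": "REBUILD_PIPELINE", "name": "Rebuild pipeline"},
        ]
    if name == "run_scenario":
        return {"scenario_id": arguments["scenario_id"], "triggered": True}
    raise ValueError(f"unexpected tool: {name}")


print(run_matching_scenario(fake_call_tool, "SALES", "retrain"))
```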
Inventorying Data Assets
A new analytics manager joins the team and needs to know every project and data source. They ask the agent to 'List all projects and connections.' The agent runs list_projects and list_connections, giving the manager a comprehensive, immediate overview of the entire DSS environment.
The Tradeoffs
Manual UI Navigation
A user manually clicks through the Dataiku web UI: Project > Dataset > Schema Tab. This takes multiple clicks and context switching, and it's impossible to log the exact sequence of views.
→
Just ask your agent: 'What is the schema for dataset X?' The agent uses dataset_schema to pull the exact column structure instantly. This avoids the clicks and gets you the data you need immediately.
Guessing Pipeline State
A user checks the dashboard and sees 'Running...' but doesn't know if it's stuck or progressing. They have to wait or manually check multiple status endpoints, wasting time.
→
Ask your agent to check the job status. The agent runs get_job to provide the job's status, timing, and any output messages, confirming whether the pipeline is actually progressing.
Verifying Recipe Logic by Reading Code
A developer has to open a recipe, scroll through potentially hundreds of lines of Python or SQL, and visually verify a small configuration change, which is error-prone and slow.
→
Use get_recipe. This tool pulls the explicit configuration structure for the recipe, letting you verify the data logic in a clean, structured output, regardless of how complex the underlying code is.
When It Fits, When It Doesn't
Use this if you need to interact with Dataiku DSS's backend functions (schemas, jobs, models, recipes) without opening the browser. This is for engineers who need to script, audit, or programmatically interact with the data science platform's state. Don't use it if you are just trying to visualize data in a chart; for that, you still need the Dataiku UI. If your goal is simple data ingestion from an external source, you might use a general data integration tool instead of relying on list_connections alone.
Independent Platform Disclaimer: Vinkius is an independent platform and is not affiliated with, endorsed by, sponsored by, verified by, or otherwise authorized by Dataiku. All third-party trademarks, logos, and brand names are the property of their respective owners. Their use on this website is strictly for informational purposes to identify service compatibility and interoperability.
VINKIUS INFRASTRUCTURE
- Cloud Hosted: Managed infra
- V8 Isolated: Sandboxed per request
- Zero-Trust Proxy: No stored credentials
- DLP Enforced: Policy on every call
- GDPR Compliant: EU data residency
- Token Compression: ~60% cost reduction
The Model Context Protocol standardizes how applications expose capabilities to LLMs. Instead of operating in isolation, your AI gains direct access to external platforms, live data, and real-world actions through secure, standardized connections.
This server provides 14 capabilities that interface natively with Claude, ChatGPT, Cursor, and any MCP client. No middleware. No custom integration required.
Sifting through Dataiku tabs to find one simple schema detail.
Today, checking a dataset's schema means navigating the Dataiku UI. You click into the project, select the dataset, then look for the 'Schema' tab. If you're on a large project, it's easy to get lost in the menu structure, and you waste time clicking around just to confirm a column type.
With this MCP server, you just talk to your agent. You ask, 'What is the schema for dataset X?' and the agent uses `dataset_schema` to pull the exact column structure immediately. You get the data, not a confusing menu.
Dataiku DSS MCP Server: Model Monitoring
Before, checking model performance required jumping between the 'Model' tab and the 'Metrics' dashboard, often needing to filter by date range or specific metric. It was a multi-step process that always felt manual and disconnected.
Now, you just ask your agent to check the model. The agent runs `get_model`, giving you the performance metrics and algorithm details right in the chat. It’s one step, zero clicks, and it's always current.
Common Questions About Dataiku DSS MCP
How do I use the `dataset_schema` tool?
The dataset_schema tool requires you to specify the target dataset name. It then returns a list of column names and their corresponding data types for that dataset.
Can I use `run_scenario` to rebuild my pipeline?
Yes. The run_scenario tool executes predefined automation flows. You must first use list_scenarios to find the exact scenario name (e.g., 'REBUILD_PIPELINE') before triggering the run.
What is the difference between `list_jobs` and `get_job`?
list_jobs gives you a list of all jobs (builds, training runs) in a project. get_job requires a specific job ID and gives you the detailed status, timing, and outputs for just that one job.
Does `list_connections` list all data types?
No. list_connections lists the established data sources (databases, cloud storage, APIs) that Dataiku can access. It doesn't list the data types within those connections.
How do I check the settings for a recipe using `get_recipe`?
You must provide the specific recipe name and the project ID. The tool then extracts and returns the full configuration structure, letting you audit the data logic.
What information does `list_connections` provide about data sources?
It lists every data connection configured in your DSS instance. You'll see the connection type (SQL, Cloud Storage, API) and the name assigned to it. This helps you audit which external sources your projects rely on.
How do I use `list_models` to check model performance?
The list_models tool returns metadata for the saved ML models in a project. For performance, use get_model, which retrieves the detailed metrics and the algorithm used for a specific model.
Can I check the metadata for a project using `get_project`?
Yes, get_project retrieves the project's metadata, including general settings and tags. You can use this to understand the project's scope without opening the UI.
Can my agent trigger a Dataiku automation scenario?
Yes. Use the run_scenario tool and provide the project key and the scenario ID. The agent triggers a new execution run of that scenario, for example a pipeline rebuild or a model retraining.
How do I check the schema of a specific dataset via chat?
Provide the project key and dataset name to the dataset_schema tool. Your agent returns the dataset's column names and types.
Can I monitor the performance of saved ML models?
Yes. Use the get_model tool. Your agent retrieves the model's metadata and performance metrics, so you can audit model quality and drift without opening the DSS UI.
Use it with your favorite AI tools
Connect this server to Cursor, Claude, VS Code, and more.