# Modelbit MCP

> `get_inference` calls any deployed Modelbit machine learning model directly from your AI agent. You pass structured data, such as JSON arrays or named parameters, and receive computed predictions. This removes the need to build custom wrapper code just to test proprietary ML logic inside an LLM chat.

## Overview
- **Category:** developer-tools
- **Price:** Free
- **Tags:** machine-learning, mlops, inference, model-deployment, python-models

## Description

With `get_inference`, your AI client calls any deployed Modelbit machine learning model directly from your agent. You pass structured data, whether that is an array of pixel values or a set of named parameters, and get computed predictions back in a clean, usable format. You can run proprietary ML logic inside an LLM chat without writing custom wrapper code just for testing.

This server executes production-grade models built with a range of frameworks. Whether your model was written in PyTorch, Scikit-learn, or plain Python, the `get_inference` tool runs it through a single call. You can test data science ideas directly within your chat flow, treating the ML model like any other function available to your agent.

Passing structured data is key here. You're not sending vague text prompts; you're giving the model exact inputs. You can send complex JSON objects or entire arrays of values, and the tool processes that structure directly. This capability means if your workflow requires analyzing a specific set of coordinates or processing multiple related data points simultaneously, the agent handles it by feeding those structured payloads straight into the deployed model.
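
A minimal sketch of what "structured" means in practice, assuming the client serializes inputs as JSON. The helper name `as_structured_payload` is hypothetical and not part of Modelbit's API; it only checks that the input is a JSON object or array before it is handed to the tool.

```python
import json

def as_structured_payload(data):
    """Return `data` as a JSON string if it is an object or array.

    Hypothetical helper for illustration; not part of Modelbit's API.
    """
    if not isinstance(data, (dict, list)):
        raise TypeError("expected a JSON object (dict) or array (list)")
    # json.dumps raises TypeError for values JSON cannot represent.
    return json.dumps(data)

# A set of coordinates sent as one array of related data points.
coordinates = [{"lat": 40.71, "lon": -74.0}, {"lat": 34.05, "lon": -118.24}]
payload = as_structured_payload(coordinates)
print(payload)
```

Plain text is rejected up front, which mirrors the point above: the model receives exact inputs, not free-form prompts.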

Versioning supports reproducibility. When you call `get_inference`, you can specify a model version or tag, such as 'v2' or 'latest'. Pinning a fixed version like 'v2' keeps results consistent and predictable, because the underlying model definition does not change between calls. A moving tag like 'latest' follows the newest version instead, so its output can change when a new version ships.

When the computation finishes, the result comes straight back to the agent. The agent doesn't get a messy block of text; it gets the final, calculated output in a structured format that your client can read and act on immediately. This direct access to clean data means you can build complex decision-making paths within your conversational AI, making those predictions part of the ongoing dialogue.

The `get_inference` tool covers execution across models built with plain Python, PyTorch, or Scikit-learn, and accepts highly structured inputs like JSON arrays. Each call can target a specific version and returns a clean, computable output ready for your agent to use in its next step. The model becomes an active component of your workflow rather than something the agent only talks about.

This means if your application needs deep ML analysis—say, predicting stock movement based on historical JSON inputs or classifying images using a PyTorch-trained network—you don't need a separate API layer. You simply let your agent call `get_inference`, pass the structured payload, and get the computed prediction right back into the chat session. It cuts out layers of integration complexity, letting you focus solely on the logic that needs to happen.

## Tools

### get_inference
Calls a deployed Modelbit machine learning model with specific input parameters, returning structured predictions or computed outputs.

## Prompt Examples

**Prompt:** 
```
Call the 'sales_forecast' model with data: {'region': 'north', 'month': 12}.
```

**Response:** 
```
I've sent the request to the 'sales_forecast' deployment. The model predicts a revenue of $450,000 for the North region in December.
```

**Prompt:** 
```
Get an inference from 'image_classifier' version 'v2' for this input array of pixel values.
```

**Response:** 
```
Using version 'v2' of 'image_classifier', the model has identified the object as 'high-resolution satellite imagery' with 98% confidence.
```

**Prompt:** 
```
Run the 'fraud_detection' model on the latest transaction data.
```

**Response:** 
```
I've executed the `get_inference` tool for 'fraud_detection'. The model flagged the transaction as 'low risk' (score: 0.02).
```

## Capabilities

### Execute Production Models
The tool runs models built with various frameworks (Python, PyTorch, Scikit-learn) through a single call.

### Pass Structured Data
You send complex JSON objects or arrays directly to the model for processing.

### Enforce Version Control
You can specify an exact model version or tag (e.g., 'v2' or 'latest'). Pinning a fixed version keeps results reproducible.

### Receive Computed Results
The agent gets the model's final computed output in a structured format, ready to use in its next step.

## Use Cases

### Detecting Fraud in Transactions
A user asks, 'What's the risk score on this transaction?' The agent runs `get_inference` using the 'fraud_detection' model. It passes the transaction details (amount, time, region) as a JSON object and gets back a clear risk assessment, like 'low risk (score: 0.02).'

### Classifying Satellite Imagery
You need to know what's in an image. The agent uses `get_inference` on the 'image_classifier' model, passing an array of pixel values. It returns a specific identification and confidence level (e.g., 'high-resolution satellite imagery with 98% confidence').

### Forecasting Sales Revenue
A PM asks the agent to predict December sales for the North region. The agent calls `get_inference` on the 'sales_forecast' model, feeding it {'region': 'north', 'month': 12}. It replies with the predicted revenue: '$450,000.'

### Validating Product Data
You give the agent raw data and ask it to validate it against your proprietary schema. The agent uses `get_inference` on a validation model, passing the JSON record. It returns whether the data is valid or lists exactly which fields failed.

## Benefits

- No custom wrapper code. You connect your agent to production models and run them immediately via `get_inference`. This saves time building boilerplate API integration layers.
- Reproducible outputs. By pinning a specific model version or tag (e.g., 'v2'), you keep results from changing unexpectedly between calls, which is critical for testing.
- Handles complex data natively. You can pass structured inputs—like arrays of pixel values or multi-field JSON objects—straight to the model, letting it do the math.
- Supports major frameworks. Whether your model uses PyTorch, Scikit-learn, or pure Python, Modelbit exposes it through one unified endpoint for `get_inference`.
- Test proprietary logic instantly. You can showcase custom ML algorithms inside an AI chat interface without having to deploy a separate web service just for testing.

## How It Works

In short, your AI client runs complex ML logic without custom API wrappers or external code execution.

1. Subscribe to the Modelbit server and provide your workspace credentials.
2. Tell your AI client (your agent) to run `get_inference`, specifying the model name, an optional version tag, and the input data payload.
3. The agent executes the call, and you receive the computed prediction or result directly in the chat interface.
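
Under the hood, step 2 corresponds to an MCP tool call, which the protocol expresses as a JSON-RPC `tools/call` request. The sketch below builds such a request by hand so its shape is visible. The `version` and `data` argument names come from this page; the `model` argument name is an assumption, and a real client library would normally build and send this message for you.

```python
import json

def make_tool_call(request_id, arguments):
    """Build a JSON-RPC 2.0 request that invokes the get_inference MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_inference", "arguments": arguments},
    }

request = make_tool_call(
    1,
    {
        "model": "fraud_detection",  # hypothetical argument name
        "version": "v2",
        "data": {"amount": 120.5, "region": "north"},
    },
)
print(json.dumps(request, indent=2))
```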

## Frequently Asked Questions

**How do I set up get_inference for the first time?**
You subscribe to the server and enter your Modelbit Workspace name. If your models are private, you'll also need to provide your API Key in the setup panel.

**Does get_inference support different model types (PyTorch vs Scikit-learn)?**
Yes. The server can call models built with any framework (plain Python, PyTorch, Scikit-learn, and so on), as long as the model is deployed on Modelbit.

**What data format must I use with get_inference?**
You must pass structured data. This means using JSON objects or arrays for the input payload when calling the tool, not just plain text.

**Can I test a model version before deploying it?**
The `get_inference` tool calls deployed models, so a version has to be deployed to Modelbit before you can call it. Once it is, you can specify a tag (like 'v2') to test against that known, stable model iteration.

**What happens if an ML model fails or encounters bad data when I use get_inference?**
The agent receives a structured error message. Modelbit reports specific failure codes and stack traces, telling you exactly which part of the input failed. This lets your AI client retry the call with corrected parameters.
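
A rough sketch of that retry loop. The error shape used here (an `error` key with `field` and `message`) is an assumption for illustration, not Modelbit's documented format, and `fake_call` is a stand-in for the real tool invocation.

```python
def call_with_retry(call, arguments, fix, max_attempts=3):
    """Call `call(arguments)`; on a structured error, apply `fix` and retry.

    `call` stands in for whatever invokes get_inference. The error shape
    ({"error": {...}}) is an assumption, not Modelbit's documented format.
    """
    result = None
    for _ in range(max_attempts):
        result = call(arguments)
        if "error" not in result:
            return result
        arguments = fix(arguments, result["error"])
    return result  # still an error after max_attempts

def fake_call(arguments):
    """Stand-in for the real tool: rejects a non-integer month."""
    if not isinstance(arguments["data"]["month"], int):
        return {"error": {"field": "month", "message": "expected integer"}}
    return {"prediction": "ok"}

def coerce_field(arguments, error):
    """Convert the field named in the error to an integer."""
    data = dict(arguments["data"])
    data[error["field"]] = int(data[error["field"]])
    return {**arguments, "data": data}

bad_input = {"data": {"region": "north", "month": "12"}}
result = call_with_retry(fake_call, bad_input, coerce_field)
print(result)
```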

**How do I secure my model calls when using get_inference in production?**
You must use a private Modelbit API Key for secure deployments. By entering this key, you restrict access to your specific workspace and models. This keeps proprietary logic protected from unauthorized client connections.

**Are there limitations on the size of data I can pass to get_inference?**
While Modelbit handles complex JSON and arrays, input size depends on the model's specific requirements and general platform limits. For extremely large datasets, consider chunking the data or using a dedicated data pipeline before calling `get_inference`.
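
One way to chunk a large input on the client side, as a sketch. The batch size is arbitrary, and whether your deployment accepts a batch of rows per call depends on its input schema.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

rows = list(range(10))
batches = list(chunked(rows, 4))
print(batches)
```

Each batch would then be sent as the input of a separate `get_inference` call.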

**What factors affect the latency when I run get_inference?**
Latency is determined by three things: network speed, Modelbit's processing time, and the model itself. Complex models or massive input arrays will naturally take longer to compute than simple predictions.

**Can I specify which version of a model to use for inference?**
Yes. When using the `get_inference` tool, you can provide an optional `version` string (e.g., 'v1', 'latest', or a specific tag) to target a precise deployment.
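
A small sketch of how the optional `version` string might be attached to a call's arguments. The `version` and `data` names come from this page; the `model` key and the `pin_version` helper are assumptions for illustration.

```python
def pin_version(arguments, version=None):
    """Return a copy of `arguments`, adding `version` only when one is given."""
    pinned = dict(arguments)
    if version is not None:
        pinned["version"] = version
    return pinned

base = {"model": "image_classifier", "data": [0, 128, 255]}  # `model` key is hypothetical
print(pin_version(base, "v2"))  # targets the 'v2' version
print(pin_version(base))        # no version key is sent
```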

**What format should the input data be in?**
The `get_inference` tool accepts a `data` parameter which should be a JSON object or array, matching the input schema expected by your Modelbit deployment.

**Is an API Key required for all models?**
The `MODELBIT_API_KEY` is optional. It is only required if your Modelbit deployment is private. Public deployments only require the `MODELBIT_WORKSPACE` name.
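
The rule above can be sketched as a small configuration loader. The two variable names come from this page; the function name and the returned dictionary shape are hypothetical.

```python
import os

def load_modelbit_config(env=None):
    """Read the workspace (required) and API key (optional) from the environment."""
    env = os.environ if env is None else env
    workspace = env.get("MODELBIT_WORKSPACE")
    if not workspace:
        raise ValueError("MODELBIT_WORKSPACE is required")
    # The API key is only needed for private deployments.
    return {"workspace": workspace, "api_key": env.get("MODELBIT_API_KEY") or None}

# A plain dict stands in for os.environ so the example runs anywhere.
config = load_modelbit_config({"MODELBIT_WORKSPACE": "my-workspace"})
print(config)
```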