Compatible with every major AI agent and IDE
Create experiment on Braintrust
Create a new experiment to record the results of LLM pipeline tests
Create project on Braintrust
Create a new project environment for tracking AI evaluations and datasets
Get dataset on Braintrust
Retrieve a specific dataset, including the schema that defines its expected LLM outputs
Get prompt on Braintrust
Retrieve a prompt's template variables and literal text template
Insert dataset row on Braintrust
Append new test cases to a dataset used for specific evaluations
List datasets on Braintrust
List the ground-truth datasets used for automated evaluation scoring
List env vars on Braintrust
List the Braintrust AI Gateway environment variables that store model API keys securely
List experiments on Braintrust
Retrieve all evaluation experiments with their model test scores and metrics
List projects on Braintrust
Retrieve the list of all AI evaluation projects in Braintrust
List prompts on Braintrust
Retrieve the version-controlled system prompts stored in Braintrust
How Vinkius protects your data
Can I set different limits for each virtual assistant on my team?
Absolutely. You have full control in our command center. You can create an AI agent that only "reads" data so the support team can answer questions, and another superpowered agent that can "edit" and "create" information exclusively for your operations team. Each AI gets exactly the level of access you allow.
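As a rough illustration of the read-only versus read-write split described above, the sketch below models a per-agent policy check. It is not Vinkius's implementation. The tool names other than get_prompt are assumed from the tool titles on this page, and AgentPolicy is a hypothetical class.

```python
from dataclasses import dataclass

# Assumed tool names, derived from the tool titles listed on this page.
READ_TOOLS = {
    "get_dataset", "get_prompt", "list_datasets", "list_env_vars",
    "list_experiments", "list_projects", "list_prompts",
}
WRITE_TOOLS = {"create_experiment", "create_project", "insert_dataset_row"}


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical per-agent access policy."""
    name: str
    can_read: bool = True
    can_write: bool = False

    def allows(self, tool: str) -> bool:
        if tool in READ_TOOLS:
            return self.can_read
        if tool in WRITE_TOOLS:
            return self.can_write
        return False  # unknown tools are denied by default


# A read-only agent for support and a read-write agent for operations.
support = AgentPolicy("support-agent")
operations = AgentPolicy("operations-agent", can_write=True)
```

With these policies, support.allows("insert_dataset_row") is False while operations.allows("insert_dataset_row") is True.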
Can it retrieve the original prompt definitions stored in Braintrust?
Yes. The get_prompt command returns the version-controlled prompt exactly as it is stored in Braintrust, including its parameters and literal text template.
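For readers curious what a get_prompt call might look like on the wire, here is a minimal sketch of an MCP tools/call request, which is a JSON-RPC 2.0 message. The argument name prompt_id and its value are assumptions for illustration. The real input schema is whatever the server advertises for this tool.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# `prompt_id` is a hypothetical argument name, not a documented parameter.
request = build_tool_call(1, "get_prompt", {"prompt_id": "example-prompt-id"})
```

An MCP client normally builds this message for you. The sketch only shows the shape of the request.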
How does the AI access my passwords and credentials?
It simply doesn't. On Vinkius, your passwords, API keys, and login details are kept in a secure vault. The AI (like ChatGPT or Claude) merely "asks" Vinkius to perform the task. Vinkius opens the door, does the work, and hands the result back to the AI. Your credentials are never seen, read, or learned by the artificial intelligence.
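The pattern described above, where the AI asks and a broker acts, can be sketched as follows. This is a simplified illustration under stated assumptions, not Vinkius's actual code: CredentialBroker, fake_backend, and the key value are all hypothetical.

```python
from typing import Callable, Dict


class CredentialBroker:
    """Hypothetical broker: holds secrets and performs calls for the agent."""

    def __init__(self, vault: Dict[str, str],
                 backend: Callable[[str, dict, str], dict]):
        self._vault = vault      # service name -> API key, never returned
        self._backend = backend  # function that performs the real API call

    def run(self, service: str, tool: str, arguments: dict) -> dict:
        api_key = self._vault[service]  # looked up inside the broker only
        return self._backend(tool, arguments, api_key)


def fake_backend(tool: str, arguments: dict, api_key: str) -> dict:
    # Stand-in for a real API call; the response carries no credential.
    return {"tool": tool, "authorized": api_key == "secret-key", "items": []}


broker = CredentialBroker({"braintrust": "secret-key"}, fake_backend)
result = broker.run("braintrust", "list_projects", {})
```

The agent supplies only the tool name and arguments. The key stays inside the broker and does not appear in result.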
What if the AI ends up reading customer data or confidential information?
We have a built-in digital "bodyguard" called DLP (Data Loss Prevention). If a tool fetches data and the response contains social security numbers, credit card numbers, or personal customer info, Vinkius automatically detects and redacts that information before the response is delivered to the AI. The AI works only with what is strictly necessary, and your sensitive data never leaks.
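To make the idea of redaction concrete, the sketch below masks two kinds of sensitive values before a response would reach the AI. It is a deliberately simplified example, not Vinkius's DLP engine. The patterns cover only US-style social security numbers and 16-digit card numbers, and the placeholder strings are assumptions.

```python
import re

# Simplified patterns for illustration; production DLP needs far more coverage.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")


def redact(text: str) -> str:
    """Replace matched sensitive values with placeholders."""
    text = SSN_PATTERN.sub("[REDACTED SSN]", text)
    return CARD_PATTERN.sub("[REDACTED CARD]", text)


sample = "Customer SSN 123-45-6789 paid with card 4111 1111 1111 1111."
cleaned = redact(sample)
# cleaned == "Customer SSN [REDACTED SSN] paid with card [REDACTED CARD]."
```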
What can AI Agents do with Braintrust?
We map standard API endpoints to agent-compatible instructions. Connect Braintrust to execute these core functional operations.
Connecting AI evaluation with Cursor
The Braintrust toolkit provides structured tools for AI evaluation. It enables conversational interfaces like Claude Code to query and modify data within your Braintrust infrastructure.
LLM Orchestration for LLM benchmarking
Build automated workflows for LLM benchmarking by connecting Braintrust. It gives Claude and ChatGPT direct API hooks into your Braintrust ecosystem.
Braintrust. Runs on everything.
From IDE to framework. Every connection governed by Vinkius.
Anthropic's native desktop app for Claude with built-in MCP support.
AI-first code editor with integrated LLM-powered coding assistance.
GitHub Copilot in VS Code with Agent mode and MCP support.
Purpose-built IDE for agentic AI coding workflows.
Autonomous AI coding agent that runs inside VS Code.
Anthropic's agentic CLI for terminal-first development.
Python SDK for building production-grade OpenAI agent workflows.
Google's framework for building production AI agents.
Type-safe agent development for Python with first-class MCP support.
TypeScript toolkit for building AI-powered web applications.
TypeScript-native agent framework for modern web stacks.
Python framework for orchestrating collaborative AI agent crews.
Leading Python framework for composable LLM applications.
Data-aware AI agent framework for structured and unstructured sources.
Microsoft's framework for multi-agent collaborative conversations.