Braintrust MCP Server: Integrate Braintrust with Claude, Cursor, Chatbots & AI Agents

Automate AI evaluations with Braintrust: organize projects, test model datasets, run benchmarks, and manage prompts from any AI agent.

GDPR Free for Subscribers

Compatible with every major AI agent and IDE

Claude, ChatGPT, Cursor, Gemini, Windsurf, VS Code, JetBrains, Vercel, and other MCP clients

create

Create experiment on Braintrust

Create a new experiment to record the results of LLM pipeline tests

create

Create project on Braintrust

Create a new project for tracking AI evaluations and datasets

get

Get dataset on Braintrust

Retrieve a specific dataset, including the schema that constrains LLM outputs

get

Get prompt on Braintrust

Retrieve a prompt's template variables and literal template text

insert

Insert dataset row on Braintrust

Append new test cases to a dataset used for specific evaluations

list

List datasets on Braintrust

List the ground-truth datasets used for automated evaluation scoring

list

List env vars on Braintrust

List the environment variables configured for the Braintrust AI Gateway, which manages model API keys securely

list

List experiments on Braintrust

Retrieve all evaluation experiments with their model test scores and metrics

list

List projects on Braintrust

Retrieve the list of all AI evaluation projects in Braintrust

list

List prompts on Braintrust

Retrieve the version-controlled system prompts stored in Braintrust

Security & Code Integrity Audit

Every tool in the Braintrust MCP Server is continuously audited by the Vinkius Security Engine. We guarantee zero-trust payload isolation, strict data boundaries, and deterministic execution for enterprise-grade AI agents.

A+ (Score: 100)

How Vinkius protects your data

Can I set different limits for each virtual assistant on my team?

Absolutely. You have full control in our command center. You can create an AI agent that only "reads" data so the support team can answer questions, and another superpowered agent that can "edit" and "create" information exclusively for your operations team. Each AI gets exactly the level of access you allow.
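As a rough illustration of per-agent limits, the sketch below maps each agent to the tool verbs it may use. It is not Vinkius's implementation: the agent names, the policy table, and the `verb_noun` tool naming (for example `create_project`) are assumptions modeled on the `get_prompt` tool name and the create, get, insert, and list verbs in the tool catalog.

```python
# Verbs taken from the tool catalog: get/list only read, create/insert write.
READ_VERBS = frozenset({"get", "list"})
WRITE_VERBS = frozenset({"create", "insert"})

# Hypothetical agent profiles; real limits are configured in the Vinkius UI.
POLICIES = {
    "support-agent": READ_VERBS,
    "operations-agent": READ_VERBS | WRITE_VERBS,
}


def is_allowed(agent: str, tool_name: str) -> bool:
    """Return True if the agent's policy permits the tool's leading verb."""
    verb = tool_name.split("_", 1)[0]
    # Unknown agents get an empty policy, so every tool is denied.
    return verb in POLICIES.get(agent, frozenset())


print(is_allowed("support-agent", "get_prompt"))         # read tool
print(is_allowed("support-agent", "create_project"))     # write tool
print(is_allowed("operations-agent", "create_project"))  # write tool
```

Denying by default keeps an unregistered agent from calling anything until it is given an explicit policy.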

Does it pull out original Prompt definitions stored securely?

Yes. The get_prompt tool returns a prompt's version-controlled parameters and literal template text exactly as they are stored in Braintrust.

How does the AI access my passwords and credentials?

It simply doesn't. On Vinkius, your passwords, API keys, and login details are kept in a secure vault. The AI (like ChatGPT or Claude) merely "asks" Vinkius to perform the task. Vinkius opens the door, does the work, and hands the result back to the AI. Your credentials are never seen, read, or learned by the artificial intelligence.
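The pattern described here is a credential broker: the AI sends only a tool request, and the secret is attached on the server side. The sketch below shows the idea under stated assumptions. The `Broker` and `ToolRequest` names, the in-memory vault, and the stubbed upstream call are all hypothetical and do not describe Vinkius internals.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolRequest:
    """What the AI sends: a tool name and arguments, never a credential."""
    tool: str
    arguments: dict = field(default_factory=dict)


class Broker:
    def __init__(self, vault: dict[str, str]):
        self._vault = vault  # service name -> API key, held server-side only

    def _call_upstream(self, request: ToolRequest, api_key: str) -> dict:
        # Stand-in for a real HTTPS call that would send the key in a header.
        if not api_key:
            raise PermissionError("no credential available")
        return {"tool": request.tool, "status": "ok"}

    def handle(self, service: str, request: ToolRequest) -> dict:
        api_key = self._vault[service]  # looked up here, never returned
        return self._call_upstream(request, api_key)


broker = Broker({"braintrust": "sk-example-secret"})
result = broker.handle("braintrust", ToolRequest("list_projects"))
print(result)
```

Only the result dictionary travels back to the AI, so the key never appears in anything the model can read.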

What if the AI ends up reading customer data or confidential information?

We have a built-in digital "bodyguard" called DLP (Data Loss Prevention). If a tool fetches data and the response contains Social Security numbers, credit card numbers, or personal customer information, Vinkius blocks and redacts that information before it is delivered to the AI. The AI works only with what is strictly necessary, so your sensitive data does not leak.
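To make the idea concrete, here is a minimal redaction sketch in Python. It is an assumption-laden illustration, not the Vinkius DLP engine: it covers only US-style Social Security numbers and card numbers, and it uses a Luhn checksum to avoid redacting long numbers that are not cards.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_ok(digits: str) -> bool:
    """Validate a digit string with the Luhn checksum used by payment cards."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(text: str) -> str:
    """Replace SSNs and Luhn-valid card numbers before text reaches the AI."""
    text = SSN_RE.sub("[REDACTED SSN]", text)

    def _card(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        return "[REDACTED CARD]" if luhn_ok(digits) else match.group()

    return CARD_RE.sub(_card, text)


print(redact("SSN 123-45-6789, card 4111 1111 1111 1111"))
# prints "SSN [REDACTED SSN], card [REDACTED CARD]"
```

A production filter would need many more detectors (names, addresses, national ID formats), so treat this as a sketch of the filtering step only.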

What can AI Agents do with Braintrust?

We map standard Braintrust API endpoints to agent-compatible tools. Connect Braintrust to run these core operations from your agent.
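For readers curious what a tool invocation looks like on the wire, MCP clients call tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `get_prompt`. The `prompt_id` argument name and its value are assumptions for illustration; the real input schema should be read from the server's `tools/list` response.

```python
import json
from itertools import count

_ids = count(1)  # simple generator of JSON-RPC request ids


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical argument name and value; check the tool's input schema first.
request = build_tool_call("get_prompt", {"prompt_id": "example-prompt-id"})
print(json.dumps(request, indent=2))
```

In practice the MCP client (Claude, Cursor, and so on) builds and sends this message for you; the sketch only shows its shape.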

Connecting AI evaluation with Cursor

The Braintrust toolkit provides structured tools for AI evaluation. It enables conversational interfaces like Claude Code to query and modify data within your Braintrust infrastructure.

LLM orchestration for LLM benchmarking

Build automated LLM benchmarking workflows by connecting Braintrust. It gives Claude and ChatGPT direct API hooks into your Braintrust ecosystem.