# Data Analysis Prover MCP

> Data Analysis Prover puts every AI-generated statistical claim through a simulated senior statistician's review. It checks for five core failures: flawed sampling, mistaking correlation for causation, statistics that ignore the shape of the data distribution, p-values reported without effect size, and misleading visualizations. Don't accept 'significant findings' at face value.

## Overview
- **Category:** data-science
- **Price:** Free
- **Tags:** data-analysis, statistics, sample-size, causation, p-value, effect-size, visualization

## Description

When you get an AI agent to crunch numbers, the results often look great: a clean chart, a low p-value, a confident statement. But statistically speaking, they could be garbage. This MCP doesn't just summarize data; it actively tests the analysis against established statistical principles. It forces every claim to prove its methodological integrity across five axes: sample quality, causal evidence, underlying data distribution, practical significance (effect size), and chart honesty. Instead of letting you walk away with a report full of technical jargon but no reliable conclusions, this tool tells you exactly where the analysis broke down. Users connect through Vinkius, accessing this MCP alongside thousands of others to ensure that every piece of insight your agent delivers is methodologically sound.

## Tools

### validate_data_analysis
Submits an AI-generated analysis for a five-point methodological review that simulates peer review by a senior statistician.

## Prompt Examples

**Prompt:** 
```
The data shows a significant correlation between email frequency and purchases (p<0.05). This causes higher revenue. The average revenue is $45. The chart clearly shows growth.
```

**Response:** 
```
SAMPLE_BLIND — Five failures: no N/power, causal claim from correlation, mean without distribution check, p-value without effect size, chart assumed honest.
```

**Prompt:** 
```
Sample: N=4,200, power 92%, stratified random, representative ±3%, missing 4.2% MCAR imputed. Causality: multivariate regression controlling age/income/tenure, Granger p=0.23, observational = 'associated with', dose-response monotonic. Distribution: right-skewed (1.42), median used, Shapiro-Wilk p=0.003, Mann-Whitney U. Significance: d=0.62, p=0.003, CI [2.1,8.4], ROI-positive, Bonferroni 5 tests. Visualization: Y at zero, linear, lie factor 1.02, no dual axes, 95% CI error bars.
```

**Response:** 
```
ANALYSIS_PROVEN — All five axes validated at research level.
```

**Prompt:** 
```
Sample: N=4,200, power 92%, representative. Causality: 'email frequency leads to more purchases' (observational). Distribution: 'the average is $45.' Significance: 'statistically significant p=0.02.' Visualization: chart with truncated Y-axis starting at $42.
```

**Response:** 
```
CORRELATION_CONFUSED — Sample passes. Causality FAILS: 'leads to' is causal language from observational data. Use 'associated with.' Then fix: mean without distribution shape, p without effect size, truncated Y-axis.
```

## Capabilities

### Validate Sample Quality
Checks if the data sample size (N) and collection method are adequate or biased.
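
To make the "is N adequate?" question concrete, here is a minimal sketch of a power calculation. It is not the MCP's own implementation; the function name `required_n_per_group` and the defaults (two-sided alpha of 0.05, 80% power) are assumptions, and it uses the common normal approximation for comparing two group means.

```python
import math
from statistics import NormalDist


def required_n_per_group(effect_size_d: float,
                         alpha: float = 0.05,
                         power: float = 0.80) -> int:
    """Approximate per-group N for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2
    """
    if effect_size_d <= 0:
        raise ValueError("effect_size_d must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# A medium effect (d = 0.5) needs far fewer participants than a small one (d = 0.2).
n_medium = required_n_per_group(0.5)
n_small = required_n_per_group(0.2)
```

A report that states N without a calculation like this gives the reader no way to judge whether the study could have detected the effect it claims.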

### Test Causal Claims
Distinguishes between mere correlation and actual cause-and-effect relationships in the data.

### Check Distribution Assumptions
Ensures that statistical tests are run on data distributions appropriate for their shape.
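
As an illustration of a distribution check, the sketch below computes moment skewness and picks a summary statistic accordingly. The helper names and the cutoff of 1.0 are assumptions (a common rule of thumb), not values documented for this MCP.

```python
from statistics import mean, median, pstdev


def moment_skewness(values):
    """Moment skewness: the average of the cubed z-scores."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        return 0.0  # constant data has no skew
    return sum(((x - m) / s) ** 3 for x in values) / len(values)


def typical_value(values, skew_threshold=1.0):
    """Return ("median", value) for strongly skewed data, else ("mean", value).

    skew_threshold is a hypothetical rule-of-thumb cutoff.
    """
    if abs(moment_skewness(values)) > skew_threshold:
        return "median", median(values)
    return "mean", mean(values)


# One very large order drags the mean far above what most customers spend.
order_values = [40, 42, 45, 47, 50, 300]
summary = typical_value(order_values)
```

For `order_values`, the single $300 order makes the data strongly right-skewed, so the sketch reports the median rather than the mean.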

### Require Effect Size Reporting
Forces the agent to report the practical magnitude of a finding, not just its p-value.
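
For reference, Cohen's d for two independent groups can be computed as below. This is a sketch using the pooled standard deviation; the 0.2 / 0.5 / 0.8 cutoffs are the conventional ones, while the "negligible" label and the function names are assumptions made for illustration.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; d is undefined")
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def label_effect(d):
    """Map |d| onto the conventional small / medium / large bands."""
    size = abs(d)
    if size < 0.2:
        return "negligible"  # label is an assumption, below the 'small' cutoff
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


treated = [2, 4, 6, 8, 10]
control = [1, 3, 5, 7, 9]
d = cohens_d(treated, control)
```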

### Audit Visualization Integrity
Identifies misleading charts, like truncated Y-axes or inappropriate dual scales.
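
One way to quantify a truncated axis is Tufte's lie factor: the size of the effect shown in the graphic divided by the size of the effect in the data. The sketch below assumes a simple bar chart where each bar's drawn height is its value minus the axis baseline; the function name and that chart model are assumptions, not the MCP's documented method.

```python
def lie_factor(first_value, second_value, axis_baseline=0.0):
    """Lie factor = (relative change drawn) / (relative change in the data).

    Assumes bars drawn from axis_baseline, so drawn height = value - baseline.
    A factor near 1.0 means the chart is proportional to the data.
    """
    if first_value == 0 or first_value == second_value:
        raise ValueError("need a non-zero first value and a non-zero change")
    shown_first = first_value - axis_baseline
    shown_second = second_value - axis_baseline
    if shown_first == 0:
        raise ValueError("first bar has zero drawn height at this baseline")
    data_effect = (second_value - first_value) / first_value
    graphic_effect = (shown_second - shown_first) / shown_first
    return graphic_effect / data_effect


# Revenue moves from $45 to $48. Starting the Y-axis at $42 doubles the drawn
# bar height (3 -> 6) even though the data changed by under 7%.
truncated = lie_factor(45, 48, axis_baseline=42)
honest = lie_factor(45, 48, axis_baseline=0)
```

With the axis starting at zero the factor is 1; with the axis starting at $42 it is roughly 15, which is why a truncated axis can make modest growth look dramatic.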

## Use Cases

### The Marketing Campaign Review
A marketing team runs a campaign and an AI agent reports: 'Email frequency strongly correlates with purchases (p<0.05),' then concludes that more emails cause more purchases. The manager submits this to the MCP. The result flags 'Correlation Confusion,' forcing the team to adjust their claim from causal language ('causes') to observational language ('associated with'), saving them from overpromising.

### The Academic Paper Draft
A researcher uses an AI agent on preliminary survey data and gets a chart showing dramatic growth, but the Y-axis starts at $42 instead of zero. The MCP immediately flags 'Visualization Deception' due to the truncated axis, prompting the researcher to redraw the graph with the Y-axis starting at zero.

### The Internal Operations Report
An operations analyst submits a report claiming that a survey of 100 employees (N=100) proves a new process is better. The MCP flags 'Sample Blindness,' noting the missing power analysis and the lack of evidence that the sample is representative.

### The Finance Model Check
An agent reports an average salary increase of $50K, but the data is highly right-skewed. The MCP flags 'Distribution Ignorance,' pointing out that the median should be used instead of the mean to describe the typical increase accurately.

## Benefits

- Stops the P-value Fallacy: You no longer rely on a p < 0.05 result alone. This tool forces reporting of effect sizes (like Cohen's d) to show whether findings are practically meaningful, not just statistically significant.
- Catches Misleading Charts: It audits visualizations for common deceits such as truncated Y-axes, dual scales, and disproportionate timeframes, so your charts show the data at its true scale.
- Separates Correlation from Cause: When an agent makes an 'X causes Y' claim based on survey data, this MCP flags it immediately. You get the clear language of 'associated with,' protecting your conclusions.
- Verifies Sample Integrity: It checks for sample bias and power analysis gaps (the 'N' problem). This ensures you know if your findings are representative or just a small, skewed group talking.
- Demands Distribution Awareness: Instead of using the mean on skewed data, this MCP forces consideration of medians and appropriate non-parametric tests. Your math gets smarter.

## How It Works

The bottom line is: you stop presenting findings that are technically wrong, even if they sound convincing.

1. Feed the MCP an AI agent's output: a statistical claim, chart, or interpretation of data.
2. The tool runs a deep methodological review, simulating a senior statistician's peer critique against five key criteria.
3. You get a structured report flagging every flaw, from 'Sample Blindness' and 'Correlation Confusion' to specific visualization deceptions.

## Frequently Asked Questions

**Does Data Analysis Prover check for causality?**
Yes, it actively checks causal claims. If the data is only observational (not experimental), it forces you to use language like 'associated with' instead of suggesting that one thing causes another.

**What if my p-value is small but the effect size is tiny? Does Data Analysis Prover catch that?**
Absolutely. The MCP requires reporting an effect size (like Cohen’s d). If the effect size is trivial, it flags the finding as 'not practically significant,' even if the p-value was low.

**Can I use Data Analysis Prover for survey data?**
Yes. It specifically reviews sample selection methods and checks for potential biases that might make your survey results unrepresentative of the wider population.

**What is 'Sample Blindness' when using Data Analysis Prover?**
It refers to presenting findings without showing that the sample can support them: no sample size (N), no power analysis, or a biased selection method. The MCP flags samples that are too small or unrepresentative, since either problem can make the results unreliable.

**Does Data Analysis Prover fix my charts for me?**
No, it doesn't draw the chart; it audits it. It identifies misleading design choices in existing visualizations, like truncated axes or dual scales, so you know exactly what needs correcting.

**Why is p<0.05 not enough?**
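A p-value is the probability of seeing a result at least this extreme if there were no real effect; it says nothing about how large the effect is. Cohen's d measures magnitude: 0.2 is small, 0.5 is medium, 0.8 is large. With a large enough sample, p<0.001 can come with d=0.05, which is a trivial effect. Report the effect size, its 95% CI, and the practical significance.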
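
To see how a trivial effect can still produce a very small p-value, the sketch below approximates the two-sided p-value for a two-group comparison from Cohen's d and the per-group sample size. It assumes equal group sizes and a normal (z-test) approximation; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_two_sided_p(effect_size_d: float, n_per_group: int) -> float:
    """Approximate two-sided p-value for a two-sample comparison of means.

    With equal groups of size n, the test statistic is about d * sqrt(n / 2).
    """
    z = effect_size_d * math.sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The same trivial effect (d = 0.05) at two very different sample sizes.
p_large_sample = approx_two_sided_p(0.05, 20_000)  # z = 5
p_small_sample = approx_two_sided_p(0.05, 100)
```

The effect size is identical in both calls; only the sample size changes. With 20,000 per group the p-value falls well below 0.001, while with 100 per group it is nowhere near 0.05.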

**When can I say 'causes' vs 'associated with'?**
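Randomized controlled trials (RCTs) are the standard basis for saying 'causes.' Observational studies show association. With observational data, control for confounders, test for reverse causality, and check for a dose-response relationship. Even then, say 'associated with' unless the design is experimental.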

**Why is the mean misleading on skewed data?**
Take an income example with a mean of $65K and a median of $45K: a few very high earners pull the mean up, so the median better represents the 'typical' value in right-skewed data. Test normality, for example with Shapiro-Wilk, before choosing parametric tests.