# Critical Thinking Prover MCP

> Critical Thinking Prover combines two MCP tools: `validate_critical_thinking` and `validate_task_completion`. Use this to force deep reasoning on complex problems or prove that a task is actually finished. It stops your AI agent from guessing, assuming everything works, or giving you vague conclusions. You get verifiable rigor for both thought processes and code delivery.

## Overview
- **Category:** productivity
- **Price:** Free
- **Tags:** critical-thinking, reasoning-validation, structured-reasoning, decision-pivots, cognitive-debiasing, complex-problems, meta-cognition, agentic-pipeline

## Description

When you're dealing with high-stakes decisions, whether migrating core infrastructure or designing a new business policy, you can't rely on an AI agent just saying 'it looks good.' Agents often operate by pattern completion, giving answers that sound confident but are fundamentally flawed. This MCP addresses that by forcing your agent to slow down and run deep checks before committing to a verdict. Use it to surface hidden assumptions you didn't know existed or to map out the unintended consequences of a new feature.

The other side is delivery: if an agent says the code is done, this tool makes it prove the claim by requiring specific file changes, build logs, and clear documentation of any remaining gaps. It's about accountability for both thought and execution. You connect everything through Vinkius to get a single source of truth on complex AI outputs.

## Tools

### validate_task_completion
Verifies that a task is fully done by requiring explicit proof of requirements met, file changes, and execution logs.

### validate_critical_thinking
Forces the agent to deeply analyze complex problems by surfacing assumptions, applying multiple frameworks, and bounding confidence.
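
As a rough illustration of how a client might invoke these two tools, the sketch below wraps each call in an MCP `tools/call` JSON-RPC 2.0 request. The tool names come from this page and `verificationLogs` appears in the example responses; every other argument name (`problem`, `assumptions`, `frameworks`, `confidence`, `requirements`, `fileChanges`) is a hypothetical placeholder, since the real input schema is not documented here.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC expects a distinct id per request


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Wrap a tool invocation in an MCP `tools/call` JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical argument names, chosen to mirror the tool description.
thinking_request = build_tool_call(
    "validate_critical_thinking",
    {
        "problem": "Should we migrate our monolith to microservices?",
        "assumptions": ["The monolith is the cause of low deploy frequency"],
        "frameworks": ["Conway's Law", "cost-of-delay"],
        "confidence": 0.65,
    },
)

# `verificationLogs` is named in the example response; the other keys are assumed.
delivery_request = build_tool_call(
    "validate_task_completion",
    {
        "requirements": ["<requirement text>"],
        "fileChanges": ["<path to a changed file>"],
        "verificationLogs": "<paste real build or test output here>",
    },
)

print(json.dumps(delivery_request, indent=2))
```

Check the tool's published input schema before relying on any of these field names.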

## Prompt Examples

**Prompt:** 
```
Should we migrate our monolith to microservices? Our deployment frequency is low and the codebase is getting hard to maintain.
```

**Response:** 
```
Verdict: ASSUMPTION_BLIND. Hidden assumptions unexamined. (1) You assume the monolith IS the cause of low deploy frequency — what if it's CI/CD, test coverage, or approval processes? (2) You assume 'hard to maintain' means 'needs decomposition' — what if it means 'needs better module boundaries within the monolith'? (3) You assume your team CAN operate a distributed system — microservices trade deployment complexity for operational complexity. Surface these assumptions first.
```

**Prompt:** 
```
Assumptions validated: 70% of PRs touch 3+ modules (coupling confirmed). Team has 2 distributed-systems engineers, no K8s. Migration 6-12mo minimum. Frameworks: Conway's Law (3 teams → 3 services), cost-of-delay (coupling compounds quarterly). FOR: Shopify modular monolith cut deploy time 40%. AGAINST: Segment failed, re-monolithed. Second-order: training cost, velocity drop, on-call expansion. Confidence: 65% modular monolith correct. Would reverse with K8s or coupling <30%.
```

**Response:** 
```
Verdict: REASONING_PROVEN. Assumptions exposed and validated with data (70% cross-module PRs). Multiple frameworks applied (Conway's Law, cost-of-delay). Evidence weighed impartially (Shopify FOR, Segment AGAINST). Consequences traced (training, velocity drop, on-call). Confidence bounded (65%, with reversal conditions). Conclusion is substantive: modular monolith over full microservices, given team constraints. This reasoning withstands scrutiny.
```

**Prompt:** 
```
AI will replace 80% of software engineering jobs within 5 years. The evidence clearly shows this is inevitable. There are no counterarguments.
```

**Response:** 
```
Verdict: CONFIRMATION_BIASED. Triple failure. (1) 'The evidence clearly shows' — which evidence? Every tech revolution prediction (paperless office, end of programming) has been wrong on timeline. Where is CONTRADICTING evidence? (2) 'No counterarguments' is a red flag — Jevons Paradox shows automation historically INCREASES total employment by lowering costs and expanding demand. (3) '80% within 5 years' is false precision. What would need to be true? What would falsify it?
```

**Prompt:** 
```
I'm finished with the code migration. The build completes with no errors. No additional changes needed.
```

**Response:** 
```
Verdict: UNVERIFIED_CHANGES. Contradiction: files modified but changes are not verified, yet the verdict is "DELIVERY_PROVEN". Run testing commands or build tasks to prove correctness. Assuming code works without executing it leads to broken pipelines. Please provide actual execution outputs in the verificationLogs field.
```
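
Each example response above opens with a `Verdict:` label. A minimal sketch of how a pipeline could extract that label, assuming real tool output keeps the same text prefix (it may instead be returned as a structured field). Treating `REASONING_PROVEN` and `DELIVERY_PROVEN` as the passing verdicts is an inference from the examples, and `parse_verdict` and `is_passing` are hypothetical helper names.

```python
import re
from typing import Optional

# Anchored at the start so a verdict name quoted later in the text is ignored.
VERDICT_RE = re.compile(r"^\s*Verdict:\s*([A-Z][A-Z_]*)\b")

# Assumed passing verdicts, inferred from the example responses.
PASSING_VERDICTS = {"REASONING_PROVEN", "DELIVERY_PROVEN"}


def parse_verdict(response_text: str) -> Optional[str]:
    """Return the verdict label at the start of a response, or None."""
    match = VERDICT_RE.match(response_text)
    return match.group(1) if match else None


def is_passing(response_text: str) -> bool:
    """True only when the leading verdict is one of the passing labels."""
    return parse_verdict(response_text) in PASSING_VERDICTS


rejected = (
    'Verdict: UNVERIFIED_CHANGES. Contradiction: files modified but changes '
    'are not verified, yet the verdict is "DELIVERY_PROVEN".'
)
print(parse_verdict(rejected), is_passing(rejected))
```

Anchoring the match matters here: the rejected example mentions `DELIVERY_PROVEN` in its body, and a naive substring search would misread it as a pass.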

## Capabilities

### Surface embedded assumptions
Forces the agent to identify the underlying, unstated beliefs that structure the problem.

### Apply competing mental models
Requires analyzing a decision through multiple distinct frameworks (e.g., ethical vs. economic).

### Weigh balanced evidence
Ensures the agent presents counterarguments with the same rigor as supporting data.

### Map second-order effects
Traces potential ripple effects, identifying who loses or what breaks after a change is implemented.

### Bound confidence levels
Requires the agent to state a confidence level and the evidence that would change the final conclusion.

### Verify task delivery integrity
Confirms that every requirement has been addressed and requires verifiable execution logs as proof.

## Use Cases

### Re-evaluating the monolith migration plan
A team proposes moving from a large codebase to microservices. Instead of accepting the proposal, running `validate_critical_thinking` forces them to map out hidden assumptions (e.g., 'our staff can operate K8s') and weigh counterevidence against the complexity cost.

### Closing out a major feature release
The agent claims it fixed all bugs in the payment service. Running `validate_task_completion` forces the agent to provide specific file paths, line ranges modified, and fresh build logs before declaring the task finished.

### Defining a new global policy
A business unit wants to change hiring practices. Using `validate_critical_thinking`, they are forced to analyze the decision through multiple frameworks (legal, cultural, financial) and map out who loses if the policy fails.

### Debugging an incomplete agent output
The agent provides a summary but skips key steps. Running `validate_task_completion` immediately flags 'Remaining Gaps' or 'Unverified Changes,' forcing the agent to complete its work before moving on.

## Benefits

- Eliminate false confidence. Instead of letting an agent give a simple 'it's fine,' `validate_critical_thinking` demands that it state its assumptions and the conditions under which its conclusion holds true.
- Catch scope neglect automatically. When you design features, the tool makes the agent trace second-order effects, showing who loses or what processes are disrupted when your solution goes live.
- Guarantee delivery completeness. Running `validate_task_completion` means you get a full audit trail: every single requirement is mapped to an action, and execution logs prove it worked.
- Avoid confirmation bias. The MCP forces the agent to actively search for counterevidence, so you never mistake finding supporting data for doing actual research.
- Structure your thinking. Instead of vague advice, `validate_critical_thinking` forces the agent to use named mental models, giving you a defensible and structured rationale.

## How It Works

The MCP acts as an automated quality gate, preventing your agent from delivering plausible-sounding garbage or incomplete work.

1. First, run `validate_critical_thinking` to challenge the problem's foundation. You provide the core dilemma, and the MCP forces the agent to expose its assumptions and explore competing theories.
2. Next, if the issue is code or task completion, run `validate_task_completion`. The tool demands a formal checklist, exact file changes, and build logs before accepting any 'finished' status.
3. The final output is a structured verdict. For thinking, you get a confidence level with conditions; for tasks, you get proof of execution integrity.
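
The steps above can be sketched as a two-stage gate. This is a minimal illustration, not the MCP's implementation: `run_pipeline` is a hypothetical helper, and the two callables stand in for whatever code actually calls the tools and returns a verdict label. It assumes `REASONING_PROVEN` and `DELIVERY_PROVEN` are the passing verdicts, as the prompt examples suggest.

```python
from typing import Callable, Optional


def run_pipeline(
    check_reasoning: Callable[[], str],
    check_delivery: Optional[Callable[[], str]] = None,
) -> dict:
    """Run the reasoning gate first; run the delivery gate only if it passes."""
    reasoning_verdict = check_reasoning()
    if reasoning_verdict != "REASONING_PROVEN":
        return {"passed": False, "stage": "reasoning", "verdict": reasoning_verdict}

    # Pure reasoning problems have no delivery step to verify.
    if check_delivery is None:
        return {"passed": True, "stage": "reasoning", "verdict": reasoning_verdict}

    delivery_verdict = check_delivery()
    return {
        "passed": delivery_verdict == "DELIVERY_PROVEN",
        "stage": "delivery",
        "verdict": delivery_verdict,
    }


# Stub verdicts stand in for real tool calls.
result = run_pipeline(
    check_reasoning=lambda: "REASONING_PROVEN",
    check_delivery=lambda: "UNVERIFIED_CHANGES",
)
print(result)
```

Stopping at the first failed stage keeps an agent from doing delivery work on top of reasoning that has not yet held up.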

## Frequently Asked Questions

**Does Critical Thinking Prover generate answers to complex problems?**
No. Critical Thinking Prover performs zero content generation. It forces the AI agent to structure its own reasoning into verifiable fields — assumptions, frameworks, evidence, consequences, confidence bounds — then validates that the reasoning is logically consistent. The agent does all the thinking. The tool catches blind spots.

**How is this different from Sequential Thinking?**
Sequential Thinking structures thoughts in a linear chain — step 1, step 2, step 3. It's domain-agnostic and doesn't validate reasoning quality. Critical Thinking Prover is orthogonal: it doesn't sequence thoughts, it validates that the reasoning addresses five specific cognitive failure modes — assumption blindness, mono-perspective, confirmation bias, scope neglect, and false precision. You can use both together: Sequential Thinking to decompose the problem, Critical Thinking Prover to validate the conclusion.
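
To make the five failure modes concrete, here is a toy structural check. It is only a sketch: the field names (`assumptions`, `frameworks`, `evidence_against`, `second_order_effects`, `confidence`, `reversal_conditions`) and the thresholds are assumptions made for illustration, and the real tool's validation logic is not documented on this page.

```python
from typing import List


def find_failure_modes(reasoning: dict) -> List[str]:
    """Flag which of the five failure modes a structured argument leaves open."""
    failures = []
    if not reasoning.get("assumptions"):
        failures.append("assumption blindness")
    if len(reasoning.get("frameworks", [])) < 2:  # assumed threshold
        failures.append("mono-perspective")
    if not reasoning.get("evidence_against"):
        failures.append("confirmation bias")
    if not reasoning.get("second_order_effects"):
        failures.append("scope neglect")
    # A bare number with no reversal conditions counts as false precision here.
    if reasoning.get("confidence") is None or not reasoning.get("reversal_conditions"):
        failures.append("false precision")
    return failures


# Values taken from the monolith example in the Prompt Examples section.
monolith_case = {
    "assumptions": ["70% of PRs touch 3+ modules"],
    "frameworks": ["Conway's Law", "cost-of-delay"],
    "evidence_for": ["Shopify modular monolith cut deploy time 40%"],
    "evidence_against": ["Segment failed, re-monolithed"],
    "second_order_effects": ["training cost", "velocity drop", "on-call expansion"],
    "confidence": 0.65,
    "reversal_conditions": ["K8s", "coupling <30%"],
}
print(find_failure_modes(monolith_case))
print(find_failure_modes({"evidence_for": ["The evidence clearly shows this"]}))
```

A presence check like this catches only missing fields. Judging whether the content of each field is sound is the part the tool itself is described as handling.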

**What types of problems does this apply to?**
Any complex problem where the answer is not obvious and the reasoning matters more than the conclusion. Technical architecture decisions, business strategy, policy design, ethical dilemmas, resource allocation, organizational restructuring, risk assessment, investment analysis, product prioritization. If the problem has competing frameworks, hidden trade-offs, and uncertain outcomes — this tool forces the agent to reason through them instead of pattern-matching to a confident-sounding answer.

**Can the agent still reach a 'wrong' conclusion after passing validation?**
Yes, and that's by design. Critical Thinking Prover validates reasoning PROCESS, not reasoning OUTCOMES. A conclusion can be well-reasoned and still turn out wrong; that's the nature of complex problems. What the tool guarantees is that the reasoning considered assumptions, multiple perspectives, counterevidence, consequences, and uncertainty bounds. A well-structured wrong answer can be more useful than a confidently stated right one, because you can see WHERE the reasoning might break.

**If `validate_critical_thinking` rejects my output, what does that mean for my project?**
Rejection means your reasoning has a structural blind spot. The MCP forces you to address specific flaws—like hidden assumptions or insufficient counterevidence—before moving forward. You must correct the underlying logic first.
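
One way to handle a rejection in an automated pipeline is a bounded revise-and-resubmit loop. The sketch below is an assumption about how you might wire this up, not a documented feature: `revise_until_proven` is a hypothetical helper, and `validate` and `revise` are stand-ins for a real tool call and a real agent revision step.

```python
from typing import Callable, Tuple


def revise_until_proven(
    validate: Callable[[str], str],
    revise: Callable[[str, str], str],
    draft: str,
    max_rounds: int = 3,
) -> Tuple[bool, str, int]:
    """Resubmit a draft until it passes or the round budget is spent."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    for round_no in range(1, max_rounds + 1):
        verdict = validate(draft)
        if verdict == "REASONING_PROVEN":
            return True, draft, round_no
        # Feed the rejection verdict back so the revision targets the flaw.
        draft = revise(draft, verdict)
    return False, draft, max_rounds


# Stubs: the validator only checks that assumptions were stated at all.
def stub_validate(text: str) -> str:
    return "REASONING_PROVEN" if "Assumptions:" in text else "ASSUMPTION_BLIND"


def stub_revise(text: str, verdict: str) -> str:
    return text + " Assumptions: the monolith causes low deploy frequency."


passed, final_draft, rounds = revise_until_proven(
    stub_validate, stub_revise, "We should migrate to microservices."
)
print(passed, rounds)
```

The round limit is there so that an agent that cannot repair its reasoning escalates to a human instead of looping indefinitely.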

**How do I integrate the Critical Thinking Prover MCP into my existing AI workflow?**
You connect your preferred AI client through Vinkius. This single connection gives you access to all available tools, letting you apply deep reasoning without modifying your current development environment.

**When should I use `validate_task_completion` in my agent pipeline?**
Use this tool immediately after the agent completes any task or delivers code. It forces proof by requiring specific details, like file paths and compilation logs, rather than just a 'done' statement.
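
Before calling the tool, a pipeline can run a cheap local pre-check for obviously missing proof. This is a sketch under stated assumptions: `verificationLogs` and the `UNVERIFIED_CHANGES` and `DELIVERY_PROVEN` verdicts come from the example responses, while `requirements`, `fileChanges`, `remainingGaps`, and the `REMAINING_GAPS` label are hypothetical names. It does not reproduce the tool's own checks.

```python
def precheck_completion(claim: dict) -> str:
    """Classify a completion claim by the evidence attached to it."""
    requirements = claim.get("requirements", [])
    unmet = [req for req in requirements if not req.get("met")]
    # No requirement list, unmet items, or declared gaps: not done yet.
    if not requirements or unmet or claim.get("remainingGaps"):
        return "REMAINING_GAPS"  # hypothetical label

    # Logs are assumed to be text; an empty list or None also counts as missing.
    logs = str(claim.get("verificationLogs") or "").strip()
    if claim.get("fileChanges") and not logs:
        return "UNVERIFIED_CHANGES"
    return "DELIVERY_PROVEN"


# Mirrors the "build completes with no errors" example: a claim without logs.
claim = {
    "requirements": [{"text": "<requirement text>", "met": True}],
    "fileChanges": ["<path to a changed file>"],
    "verificationLogs": "",
}
print(precheck_completion(claim))
```

A local `DELIVERY_PROVEN` here means only that the evidence fields are filled in. The tool call is still what evaluates them.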

**What kind of data does `validate_critical_thinking` require to be useful?**
It needs complex decision inputs—not simple facts. The prompt must contain enough detail to warrant weighing multiple opposing viewpoints and mapping second-order consequences.