# Migration Strategy Prover MCP

> The Migration Strategy Prover tool forces rigorous, multi-faceted risk assessment before you touch production data. It doesn't just check if your schema changed; it maps every dependency, defines exact rollback triggers (not 'just restore from backup'), validates data integrity through checksums and foreign key checks, and mandates a phased cutover plan. If you skip stakeholder alignment or forget the blast radius, this tool flags it immediately. This is for high-stakes database work where failure isn't an option.

## Overview
- **Category:** devops
- **Price:** Free
- **Tags:** migration-strategy, database-migration, cloud-migration, strangler-fig, rollback, data-integrity, cutover, sre

## Description

You know how bad big database changes can be? Most teams just guess what's gonna break—they assume data consistency and skip the rollback plan. That’s a recipe for disaster. The `validate_migration_strategy` tool forces you to treat every migration like it's mission-critical, mapping out exactly where things could go wrong before you touch production data.

When you run this check, your agent doesn't just look at the schema; it runs deep into the architecture. It starts by forcing an inventory of every single service that reads from or writes to the database. This step maps all dependencies and calculates the full potential blast radius, so you know exactly what systems are affected if something goes sideways.

The tool then demands a rock-solid rollback procedure. You can't just say, 'We'll restore from backup.' It requires specific metrics—like defining an error rate threshold or latency spike that triggers the rollback—and writes out a precise, step-by-step plan for data reconciliation after the failure.

Data integrity checks are mandatory too. The system doesn't accept just matching row counts; it verifies checksums across tables and runs foreign key integrity tests to catch dangling references. It also forces you to account for edge cases that usually get missed, like differences in timezones or non-standard Unicode characters.

For the cutover itself, you can't do a 'big bang' weekend deployment. The tool makes you choose a phased method—like Canary or Blue-Green deployments—and defines monitoring gates at every single traffic percentage increase. You’ll define what success looks like before moving to the next stage.

Finally, it forces stakeholder alignment. It generates a mandatory runbook listing every affected team, defining who's responsible for what, mapping out the official escalation path, and naming the exact go/no-go decision maker so nobody gets blindsided by customer tickets Monday morning.

## Tools

### validate_migration_strategy
Forces a structured reflection of system migration planning, checking risks, rollback plans, data integrity, cutover methods, and stakeholder alignment.

## Prompt Examples

**Prompt:** 
```
We need to migrate our PostgreSQL database from AWS RDS to Google Cloud SQL. It should be straightforward — just dump and restore. We can do it over the weekend during low traffic.
```

**Response:** 
```
RISK_UNASSESSED — 'Straightforward' and 'just dump and restore' are not risk assessments. Questions you have not answered: How many services query this database? What are the connection string dependencies? Do any use RDS-specific features (IAM auth, pg_cron, RDS Proxy)? What is the data volume? How long does the dump take? What about in-flight transactions during dump? Network latency difference between AWS and GCP for dependent services? Map every dependency before planning the migration.
```

**Prompt:** 
```
Risk: 7 services read (auth, billing, analytics, notifications, reporting, API gateway, admin panel). 2 services write (API gateway, billing). RDS-specific: pg_cron for 3 nightly jobs, RDS Proxy for connection pooling, IAM auth for 2 services. Data: 48GB, 142M rows, 23 tables, 47 foreign keys. Estimated dump: 35 min. Failure scenarios: (1) connection string misconfiguration — 40%, (2) pg_cron jobs not migrated — 30%, (3) timezone mismatch in timestamp columns — 20%. Timeline: 2 sprints including testing. Rollback: trigger at error rate >0.5% or p99 >500ms on any dependent service. Procedure: (1) stop writes to new DB, (2) re-point connection strings to old RDS via env vars — 3 min per service, (3) replay any writes from new→old via change capture log, (4) disable Cloud SQL. Duration: 25 minutes. One-way risk: none, old RDS stays warm for 7 days post-cutover. Data: row counts per table, MD5 checksums on id+updated_at for all 142M rows, FK validation script, timezone verification (all UTC). Dual-write window: 48 hours before cutover comparing writes. Edge cases: NULL handling in 3 columns, unicode in user.name, timestamp precision (microseconds vs milliseconds). Cutover: Blue-Green with pgbouncer. Phase 1: read replica on Cloud SQL, 10% read traffic, monitor for 24h. Phase 2: 50% reads, monitor 24h. Phase 3: 100% reads, monitor 48h. Phase 4: write cutover with 2-minute maintenance window, dual-write for 1h, then old RDS read-only. Go/no-go: error rate <0.1%, latency delta <50ms, zero data discrepancy in dual-write. Stakeholders: Platform team (owns migration), API team (connection strings), Billing team (write path), Analytics (nightly jobs), Support (customer notification if >2min downtime), On-call SRE (added to #migration-war-room). Runbook in Confluence, reviewed by all leads. Go/no-go decision: VP Engineering.
```

**Response:** 
```
MIGRATION_PROVEN — Production-grade migration strategy validated. Risk: 7 readers + 2 writers mapped with RDS-specific dependencies (pg_cron, RDS Proxy, IAM auth). Rollback: trigger at 0.5% error / 500ms p99, 25-min procedure, old RDS warm for 7 days. Data: 142M row checksums, FK validation, dual-write comparison, timezone/unicode/precision edge cases. Cutover: Blue-Green via pgbouncer, 4-phase with 24-48h monitoring at each, 2-min write window, go/no-go at <0.1% error + <50ms delta. Stakeholders: 6 teams in war room, Confluence runbook reviewed, VP Engineering go/no-go. This migration survives a Saturday 2 AM failure.
```

**Prompt:** 
```
Risk: Migrating monolith to 5 microservices. Dependencies: database shared by all modules, 3 third-party integrations, message queue. Blast radius: all customer-facing features. Timeline: 3 months. Rollback: if something goes wrong, we can always roll back to the monolith. Data: we will validate after the migration is complete.
```

**Response:** 
```
ROLLBACK_UNDEFINED — Risk assessment passes (dependencies mapped, blast radius identified). But rollback fails: 'we can always roll back to the monolith' is not a plan. When you extract 5 services, the monolith's database schema has diverged. The message queue has new topics. The third-party integrations now call service endpoints, not monolith routes. Define: (1) what specific metric triggers rollback, (2) how you re-converge diverged schemas, (3) how you handle data written to new service databases, (4) rollback timeline. Also: DATA_INTEGRITY_UNPROVEN — 'validate after' means finding corruption in production instead of preventing it.
```

## Capabilities

### Map Dependencies and Risks
It forces an inventory of every service that reads from or writes to the database, calculating the full potential blast radius.

### Define Rollback Procedure
The tool requires specific trigger metrics (error rate, latency) and a step-by-step procedure for data reconciliation when rolling back.

### Verify Data Integrity
It validates the migration by demanding row count checks, checksum verification, foreign key integrity tests, and handling of edge cases like timezones or Unicode.

### Plan Cutover Strategy
The system requires choosing a phased cutover method (Canary, Blue-Green) and defining monitoring gates at each traffic percentage increase.

### Align Stakeholders
It generates a mandatory runbook listing every affected team, the escalation path, and the official go/no-go decision maker.

## Use Cases

### Moving from PostgreSQL 12 to 16
A team needs to upgrade their core database. Instead of running the migration and hoping for the best, they run `validate_migration_strategy`. The tool immediately flags that the new version requires specific handling for pg_cron jobs and connection string updates, saving them days of debugging.

### Refactoring a Monolith to Microservices
The team is splitting a monolith database into five smaller services. They feed the plan into `validate_migration_strategy`. The tool forces them to define how data written during the transition period (the 'divergence window') will be reconciled, solving a major architectural blind spot.

### Handling Large Customer Data Migrations
Migrating 142 million user records across continents. The team uses `validate_migration_strategy` to ensure that the data integrity checks cover timezone discrepancies, Unicode characters (like Turkish İ), and microsecond timestamp precision.

### Urgent Schema Change Deployment
A critical schema change must go live. Running `validate_migration_strategy` forces them to select a Canary deployment approach instead of a weekend cutover, routing 1% of traffic first and monitoring error rates before scaling up.

## Benefits

- Stop relying on 'just restore from backup.' The tool forces you to define specific, measurable rollback triggers and procedures for mutated data.
- Never assume consistency. Running `validate_migration_strategy` mandates checksum verification and foreign key checks across millions of records, catching silent corruption.
- Avoid the 'big bang' failure. It forces selection between Canary or Blue-Green cutover patterns, ensuring traffic shifts are gradual and monitored at each percentage step.
- Know your blast radius. The tool maps all dependent services (readers/writers) so you know exactly which parts of the business break when the DB goes down.
- Stop relying on tribal knowledge. It forces stakeholder alignment by demanding a shared runbook, an escalation path, and a named go/no-go decision maker.

## How It Works

The bottom line is that it forces your migration plan to meet production-grade standards across five technical domains, or it refuses to validate.

1. You feed the tool all known details about the planned migration: source DB version, target DB schema changes, list of dependent services, and estimated data volume.
2. The tool runs a structured reflection against five critical pillars (risk, rollback, data, cutover, stakeholders), identifying specific gaps or contradictions in your plan.
3. It returns a verdict matrix. If any pillar fails validation, it outputs the exact failure point and coaches you on what information is missing before execution.

## Frequently Asked Questions

**How does the Migration Strategy Prover MCP Server handle data validation?**
It goes far beyond simple row counts. The tool mandates checksum verification (e.g., MD5 on ID+updated_at) and checks for foreign key relationships, catching bit-level corruption or broken links.

**What if my migration is 'straightforward'? Will validate_migration_strategy reject it?**
Yes. The tool has no concept of 'simple.' It demands that even the simplest change must pass all five pivots: risk assessment, rollback definition, data integrity proof, cutover plan, and stakeholder alignment.

**Can I use validate_migration_strategy for a schema-only change?**
If the schema change is isolated and requires no service coordination or data transformation, you might not need it. But if *any* service depends on that column, run `validate_migration_strategy` to map the dependency.

**Does validate_migration_strategy help with Blue-Green deployments?**
Yes. It forces you to plan for it by requiring a phased cutover strategy and defining go/no-go criteria at each traffic percentage milestone, which is key to Blue-Green success.

**How does validate_migration_strategy model compatibility issues during a dual-write period?**
It mandates defining data reconciliation rules for both old and new schemas. You must specify how writes to the new system interact with the old, preventing schema divergence or data loss.

**Can validate_migration_strategy account for intermittent service degradation or gradual failures?**
Yes, it forces you to define trigger metrics like error rate thresholds (>X%) and latency increases (p99 >Yms). This moves beyond simple success/fail checks into real-time monitoring planning.

**Does validate_migration_strategy address security concerns like IAM or least privilege access?**
It requires mapping all services that write to the database, helping assess the blast radius based on required permissions. You must list every service and its specific access needs.

**How does validate_migration_strategy plan for performance constraints or rate limits?**
It mandates planning for incremental traffic shifting percentages across multiple stages. This ensures the new system handles peak production throughput before you attempt a full cutover.

**Does it execute migrations?**
No. It validates that your migration plan covers risk assessment, rollback, data integrity, cutover, and stakeholder alignment. It does not run scripts or move data — it forces you to prove the plan survives a production failure.

**What is the Strangler Fig pattern?**
A migration strategy where you incrementally replace parts of the old system with the new one, routing traffic gradually until the old system handles zero requests. Named after strangler fig trees that grow around a host tree until it dies. It reduces blast radius because you migrate one capability at a time, with rollback at each stage.

**Can it validate cloud-to-cloud migrations?**
Yes. The 5 pivots apply to any migration: database, cloud provider, monolith-to-microservices, CMS, or data warehouse. Risk assessment maps provider-specific dependencies (IAM, networking, managed services). Data integrity addresses format differences. Cutover handles DNS, CDN, and traffic shifting. The framework is migration-type agnostic.