impress/docs/database-consolidation-deployment.md
Emi Matchu f311c92dbb docs: add deployment checklist for database consolidation
IMPORTANT: This migration is BLOCKED until Impress 2020 is retired.

Created comprehensive deployment guide documenting:
- Why this migration is blocked (Impress 2020 uses openneo_id directly)
- Two paths forward: retire Impress 2020 (recommended) or coordinated update
- Complete step-by-step deployment checklist for when ready
- Rollback procedures
- Risk assessment and mitigations
- Success criteria and timeline estimates

This ensures we don't accidentally deploy this change before addressing
the Impress 2020 dependency.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-02 07:07:57 +00:00


Database Consolidation Deployment Guide

This document outlines the plan and checklist for consolidating the openneo_id database into the main openneo_impress database.

Current Status: BLOCKED

This migration cannot be deployed until Impress 2020 is retired.

The Problem

While the main DTI Rails app is ready to move to a single-database architecture, Impress 2020 still directly accesses both databases:

  • openneo_impress - For reading item, pet, and outfit data
  • openneo_id - For user authentication via GraphQL

If we consolidate the databases now, Impress 2020's authentication will break immediately, causing login failures for users accessing DTI through the Impress 2020 GraphQL API.

Path Forward

There are two options to unblock this migration:

Option A: Retire Impress 2020 (Recommended)

  1. Complete the migration of remaining Impress 2020 dependencies back to the main Rails app
    • See docs/impress-2020-dependencies.md for current status
    • Primary remaining dependencies: GraphQL API for outfit data, image generation service
  2. Spin down the Impress 2020 service entirely
  3. Execute the database consolidation (steps below)

Option B: Coordinated Update (Complex)

  1. Update Impress 2020 to point to openneo_impress.auth_users instead of openneo_id.users
  2. Deploy both applications simultaneously during a maintenance window
  3. Execute the database consolidation

Recommendation: Option A is simpler and aligns with our long-term goal of fully consolidating back into the Rails monolith.


Deployment Checklist (When Ready)

⚠️ DO NOT EXECUTE UNTIL IMPRESS 2020 IS RETIRED

Prerequisites

  • Impress 2020 service is spun down and no longer accessing databases
  • All Impress 2020 dependencies have been migrated to main Rails app
  • Database backups are current and tested
  • Maintenance window scheduled (estimate: 30-60 minutes)
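
The first prerequisite can be checked rather than assumed. The following is a minimal sketch, not part of the existing tooling: `count_db_connections` is a hypothetical helper that counts open connections by default database, assuming the process list is queried with the `mysql` client in batch mode (`-N -B`), which prints one database name per line.

```shell
# Hypothetical helper: count open connections whose default database is $1.
# Reads one database name per line on stdin, as produced by e.g.
#   mysql -N -B -h [host] -u [user] -p \
#     -e "SELECT db FROM information_schema.processlist"
count_db_connections() {
  # -c prints the number of matching lines, -x requires a whole-line match,
  # -F treats the name as a literal string. grep exits 1 when the count is 0,
  # so `|| true` keeps the function's exit status at success.
  grep -c -x -F -- "$1" || true
}
```

Before Phase 3 the main Rails app presumably holds its own openneo_id connections, so the count would need filtering by user or host (both are columns of the process list) to isolate Impress 2020.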

Phase 1: Deploy Write Lock

Branch: feature/consolidate-auth-database @ commit 604a8667

Purpose: Prevent writes to AuthUser table while keeping login/logout functional.

Steps:

  1. Deploy Phase 1 to production
  2. Verify:
    • Existing users can log in
    • Existing users can log out
    • Registration shows maintenance message
    • Settings updates show maintenance message
    • NeoPass connection shows maintenance message

Expected Downtime: None (read-only mode for account changes only)

Phase 2: Copy Data

Purpose: Copy auth data from openneo_id to openneo_impress while the table is stable.

Steps:

  1. Backup openneo_id database:

    mysqldump -h [host] -u [user] -p openneo_id > openneo_id_backup_$(date +%Y%m%d_%H%M%S).sql
    
  2. Verify backup:

    # Check file size is reasonable
    ls -lh openneo_id_backup_*.sql
    
    # Spot-check contents
    head -n 50 openneo_id_backup_*.sql
    
  3. Run the migration:

    cd /var/www/impress
    bundle exec rails db:migrate
    
  4. Verify data copy:

    # From the shell: open a MySQL session
    mysql -h [host] -u [user] -p
    
    -- At the MySQL prompt: check row counts match
    SELECT COUNT(*) AS openneo_id_count FROM openneo_id.users;
    SELECT COUNT(*) AS auth_users_count FROM openneo_impress.auth_users;
    
    -- Spot-check the same rows in both tables
    -- (ORDER BY makes the LIMIT deterministic, so both queries return the same IDs)
    SELECT id, name, email FROM openneo_id.users ORDER BY id LIMIT 5;
    SELECT id, name, email FROM openneo_impress.auth_users ORDER BY id LIMIT 5;
    
    -- Verify indexes were created
    SHOW INDEX FROM openneo_impress.auth_users;
    
  5. Verify results:

    • Row counts match exactly
    • Sample records match (IDs, names, emails)
    • All 4 indexes created (email, provider+uid, reset_password_token, unlock_token)

Expected Downtime: None (still in write-lock mode)
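
The row-count check in step 4 is easy to misread by eye, so it may be worth scripting. A minimal sketch, assuming the two `COUNT(*)` values are printed one per line by the `mysql` client in batch mode; `counts_match` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if stdin holds at least two lines and
# the first field of every line is identical.
# Intended input: the two COUNT(*) values, e.g. from
#   mysql -N -B -h [host] -u [user] -p -e "
#     SELECT COUNT(*) FROM openneo_id.users;
#     SELECT COUNT(*) FROM openneo_impress.auth_users;"
counts_match() {
  awk '
    NR == 1      { first = $1 }   # remember the first count
    $1 != first  { bad = 1 }      # flag any later value that differs
    END          { if (NR < 2 || bad) exit 1 }
  '
}
```

Used as `... | counts_match && echo "row counts match"`, this fails on a mismatch and also when only one count was returned, which would usually mean one of the queries errored.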

Phase 3: Switch to New Table

Branch: feature/consolidate-auth-database @ commit 2c21269a

Purpose: Point AuthUser at consolidated table, restore full functionality.

Steps:

  1. Deploy commit 2c21269a to production

  2. Immediately test critical paths:

    • Login with existing account
    • Logout
    • Register new account
    • Update account settings (email, password)
    • Connect NeoPass (if available)
    • Disconnect NeoPass (if available)
  3. Monitor error logs:

    tail -f /var/www/impress/log/production.log | grep -i error
    
  4. Verify database queries are using auth_users:

    # Check recent queries in logs
    grep "auth_users" /var/www/impress/log/production.log | tail -n 20
    
    # Should see SELECT/INSERT/UPDATE on auth_users, NOT openneo_id.users
    

Expected Downtime: Brief (< 1 minute for deployment)

Rollback Plan: If critical issues are found, revert to the Phase 1 commit (604a8667) and restore openneo_id from backup.
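
The log check in step 4 can also be run in the negative: instead of looking for auth_users, look for any remaining reference to the old database. This is a sketch under two assumptions: that SQL statements appear in the production log at the configured log level, and that legacy queries would contain the literal string `openneo_id`. `no_legacy_queries` is a hypothetical helper name.

```shell
# Hypothetical helper: fail if the tail of a log file still mentions the
# legacy database.
#   $1: log path, e.g. /var/www/impress/log/production.log
#   $2: how many trailing lines to inspect (default 1000), so that entries
#       written before the deploy are not counted
no_legacy_queries() {
  [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
  # grep -q exits 0 on a match, so a match here means failure.
  if tail -n "${2:-1000}" "$1" | grep -q -F 'openneo_id'; then
    echo "legacy openneo_id reference found in $1" >&2
    return 1
  fi
}
```

The line window is a rough stand-in for "since the deploy"; on a busy server it would need adjusting.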

Phase 4: Documentation Update

Branch: feature/consolidate-auth-database @ commit 9ba94f9f

Purpose: Update documentation to reflect single-database architecture.

Steps:

  1. Deploy commit 9ba94f9f to production
  2. Verify no errors

Expected Downtime: None

Phase 5: Database Teardown

Purpose: Remove the now-unused openneo_id database.

Steps:

  1. Wait 7 days to confirm that no issues surface in production

  2. Final backup:

    mysqldump -h [host] -u [user] -p openneo_id > openneo_id_final_backup_$(date +%Y%m%d_%H%M%S).sql
    
  3. Store backup offsite:

    • Upload to secure backup storage
    • Keep for at least 90 days
  4. Drop the database:

    DROP DATABASE openneo_id;
    
  5. Remove environment variable:

    • Delete DATABASE_URL_OPENNEO_ID from production environment config
    • Restart app to ensure it doesn't try to connect
  6. Update MySQL users:

    -- Remove openneo_id privileges from users
    -- (Already done in deploy/setup.yml for new deployments)
    

Expected Downtime: None
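
Because step 4 is irreversible, it is worth confirming that the final backup was written in full before dropping the database. A minimal sketch, relying on the fact that mysqldump by default ends its output with a `-- Dump completed` comment; `dump_looks_complete` is a hypothetical helper name, and the check does not apply if comments were disabled (for example with `--skip-comments`).

```shell
# Hypothetical helper: basic sanity check on a mysqldump file ($1).
dump_looks_complete() {
  # -s: the file exists and is non-empty.
  [ -s "$1" ] || return 1
  # A dump that ran to the end finishes with a "-- Dump completed" comment;
  # a truncated or interrupted dump does not.
  tail -n 1 "$1" | grep -q '^-- Dump completed'
}
```

This only shows that mysqldump reached the end of its output. It is not a substitute for a test restore.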


Rollback Procedures

If Issues Found After Phase 3

  1. Immediate rollback:

    # Revert to Phase 1 commit
    git checkout 604a8667
    bundle exec rails db:migrate:down VERSION=20251102064247
    # Deploy
    
  2. Restore openneo_id (if needed):

    mysql -h [host] -u [user] -p openneo_id < openneo_id_backup_[timestamp].sql
    
  3. Investigate issues before reattempting

If Data Corruption Detected

  1. Immediately restore from backup:

    # Drop corrupted auth_users table
    mysql -h [host] -u [user] -p -e "DROP TABLE openneo_impress.auth_users;"
    
    # Restore openneo_id if needed
    mysql -h [host] -u [user] -p openneo_id < openneo_id_backup_[timestamp].sql
    
  2. Revert to pre-migration code

  3. Review migration SQL before reattempting


Key Risks & Mitigations

| Risk | Impact | Mitigation | Status |
|------|--------|------------|--------|
| Impress 2020 auth breaks | HIGH - Users can't log in via I2020 | Block deployment until I2020 retired | ⚠️ BLOCKING |
| Data copy fails mid-migration | HIGH - Incomplete auth data | Wrapped in transaction, can rollback | Mitigated |
| Production traffic during copy | MEDIUM - Stale data | Write lock prevents changes | Mitigated |
| Schema mismatch between DBs | MEDIUM - Migration fails | Migration matches exact schema | Mitigated |
| Indexes not created | MEDIUM - Slow queries | Verification step checks indexes | Mitigated |
| Login tracking data loss | LOW - Missing login stats | Acceptable trade-off | Accepted |

Success Criteria

  • All existing users can log in
  • New user registration works
  • Settings updates work
  • NeoPass connection/disconnection works
  • No errors in production logs
  • Query performance unchanged
  • Database row counts match
  • All auth_users indexes present

Timeline Estimate

Total time: 30-60 minutes (after Impress 2020 is retired)

  • Phase 1 deployment: 5 min
  • Phase 2 data copy: 5-10 min (depending on user count)
  • Phase 3 deployment + testing: 15-30 min
  • Phase 4 deployment: 5 min
  • Phase 5 teardown: 7+ days later, 10 min

Questions Before Proceeding

  1. Is Impress 2020 fully retired? If not, STOP.
  2. Do we have recent database backups? (< 24 hours old)
  3. Do we have a maintenance window scheduled?
  4. Have we announced the maintenance to users?
  5. Do we have rollback access ready?
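
Question 2 can be answered mechanically for file-based backups. A minimal sketch, assuming the backup is a local file such as the mysqldump output from Phase 2; `backup_is_fresh` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if file $1 was modified within the last 24 hours.
backup_is_fresh() {
  [ -f "$1" ] || return 1
  # find prints the path only if its mtime is less than 1440 minutes old.
  # -mmin is supported by GNU and BSD find but is not part of POSIX.
  [ -n "$(find "$1" -mmin -1440)" ]
}
```

For example, `backup_is_fresh openneo_id_backup_[timestamp].sql || echo "backup is stale or missing"`.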

Last Updated: November 2025
Status: Blocked on Impress 2020 retirement
Branch: feature/consolidate-auth-database