Files
paliad/internal/db/migrations/090_backfill_deadline_rule_id.down.sql
mAi 09615ec48e feat(t-paliad-190): mig 090 — one-time fuzzy-match backfill
Phase 3 Slice 10 Step I (design §3.I + m's Q10 ruling). Binds legacy
paliad.deadlines.rule_id to deadline_rules.id via priority-ordered
fuzzy matching; ambiguous + no-match rows log to the orphan staging
table (mig 089).

Matching strategies (highest priority first; first unique hit wins):

  1. rule_code_and_tail — title's leading citation token AND its
     post-separator name fragment match a rule. Handles
     "RoP.023 — Klageerwiderung" where the bare code matches 2 rules
     (DE Klageerwiderung + EN Statement of Defence); the tail picks
     the right one.

  2. rule_code only — bare rule_code from the title prefix. Handles
     "RoP.029.a — Replik" where RoP.029.a maps to a single rule
     regardless of suffix (the title's "Replik" doesn't match the
     rule's actual name but the code is unique).

  3. name_exact — full title equals rule.name or rule.name_en
     (LOWER). Catches "Antrag auf Schadensbemessung" (1 unique
     rule); ambiguous for shared names like Klageerwiderung (8
     candidates).

  4. concept_alias — title appears in deadline_concepts.aliases.
     Thin coverage today; Slice 12 orphan-seed will populate it.

Per-deadline aggregation:
  - Strategy with n_candidates = 1 wins. Priority chain rule_code_and_tail
    > rule_code > name_exact > concept_alias.
  - Ambiguous (≥2 across all strategies) → orphan reason='ambiguous'
    with the full candidate_rule_ids list.
  - 0 candidates → orphan reason='no_match'.

Predicted production outcome (verified via supabase MCP pre-write):
  - 3 of 25 deadlines (12%) get a unique match:
      "RoP.023 — Klageerwiderung"   via rule_code_and_tail
      "RoP.029.a — Replik"          via rule_code
      "Antrag auf Schadensbemessung" via name_exact
  - 15 of 25 deadlines (60%) → orphan reason='ambiguous' (common
    titles like Klageerwiderung × 4, Duplik × 4, Replik × 4 across
    multiple proceedings).
  - 7 of 25 deadlines (28%) → orphan reason='no_match' (free-text
    titles like "Call me", "Schutzschrift", "Validierungsfrist EP→DE",
    "Schriftsatz nach R.262 (Klageerwiderung)").

The 60% target the design § hinted at is unachievable on today's
corpus because all 11 projects have proceeding_type_id IS NULL post-
Slice-5 (the fristenrechner-side rebinding hasn't happened on
production data yet) — proceeding-narrowing would cut the
Klageerwiderung / Duplik / Replik ambiguity, but the column isn't
populated. The orphan-review UI in Slice 11 is the real path to
binding the long tail.

Defensive backup: paliad.deadlines_pre_089 snapshot taken before any
UPDATE. Down-migration restores rule_id from the snapshot + drops
unresolved orphan rows (resolved rows survive a rollback — those are
legal-review work that shouldn't disappear on a code revert).

Idempotency: WHERE rule_id IS NULL on the UPDATE; orphan INSERT
skips rows that already have an unresolved orphan entry. Re-running
on the same corpus produces no new rows.

Hard assertion: every NULL-rule_id deadline (with project) is either
resolved post-mig OR has an unresolved orphan row. RAISE EXCEPTION on
any unaccounted row — fails the migration loudly rather than
silently leaving a deadline un-matched + un-orphaned.

Audit-reason wrapper set; the mig 079 deadline_rules audit trigger
doesn't fire here (UPDATEs touch paliad.deadlines, not deadline_rules),
but the wrapper is the standard pattern.
2026-05-15 01:37:57 +02:00

31 lines
1.2 KiB
SQL

-- t-paliad-190 down — reverses 090_backfill_deadline_rule_id.up.sql.
--
-- Restores rule_id values from the pre-mig snapshot (every deadline
-- that mig 090 touched had rule_id IS NULL originally, so restoring
-- means setting rule_id back to NULL on every row that survived the
-- backfill). Drops the orphan rows mig 090 wrote (resolved rows stay
-- — those represent legal-review work that shouldn't disappear on
-- a code rollback) and drops the backup table.
--
-- This is a defensive rollback path; the migration itself is one-time
-- + idempotent, so re-running 090 after a down + up is safe.
SELECT set_config(
'paliad.audit_reason',
'rollback 090: NULL rule_id on deadlines mig 090 touched + drop pre-089 backup',
true);
-- Restore rule_id = NULL on every deadline mig 090 may have written.
-- We use the backup table as the authoritative "before" snapshot.
UPDATE paliad.deadlines d
SET rule_id = b.rule_id
FROM paliad.deadlines_pre_089 b
WHERE d.id = b.id;
-- Drop the unresolved orphan rows mig 090 wrote. Resolved rows stay —
-- a legal-review hand-link is real work that survives a code rollback.
DELETE FROM paliad.deadline_rule_backfill_orphans
WHERE resolved_at IS NULL;
DROP TABLE IF EXISTS paliad.deadlines_pre_089;