Deduplicate procedural_events with null.* synthetic codes — many true name-duplicates clutter /admin/procedural-events #144

Open
opened 2026-05-26 18:46:31 +00:00 by mAi · 1 comment
Collaborator

Repro

https://paliad.de/admin/procedural-events shows ~20 procedural_events whose names appear 2-6 times each, all with different null.<8hex> codes minted in mig 136 (B.1) per Q5/Q11 ratification. m: "this needs to be deduped".

Worst offenders (sample, from SELECT name, COUNT(*) FROM procedural_events WHERE code LIKE 'null.%' GROUP BY name HAVING COUNT(*) > 1 ORDER BY COUNT DESC):

name dup_count
Mängelbeseitigung / Zahlung 6
Antrag auf Patentänderung 2
Beginn des Hauptsacheverfahrens 2
Berufungsbegründung gegen eine in Regel 220.1(a) und (b) genannte Entscheidung 2
Berufungsschrift gegen eine in Regel 220.1(a) und (b) genannte Entscheidung 2
... (more) 2

Root cause

The original paliad.deadline_rules had 78 rows with NULL submission_code — cross-cutting orphan concepts (Wiedereinsetzung, Schriftsatznachreichung, etc.) that legitimately spanned multiple proceeding types or weren't bound to one. Mig 136's backfill minted a synthetic null.<uuid8> code per row, preserving 1:1 mapping. But the same legal concept (e.g. "Mängelbeseitigung / Zahlung") was represented by multiple deadline_rules rows because each rule's anchor / proceeding context differed, even though the user-visible name was identical.

Not all null.* codes are duplicates — only those whose name collides. The bulk (~50/78) are unique names referring to genuinely distinct events.

Fix

Scope

Migration 151 (or next free at write-time):

  1. Audit-first DO block — for every name whose null.* rows are >1, list the rows (id, name, sequencing_rules-pointing-here count, primary_party_default, concept_id) as a RAISE NOTICE. m can copy that output for the audit trail.

  2. Pick canonical per name — the row with the lowest created_at (or id UUID alphabetic) wins as canonical. Stable, deterministic.

  3. Reparent sequencing_rulesUPDATE sequencing_rules SET procedural_event_id = <canonical_id> WHERE procedural_event_id IN (<duplicate_ids>). Requires set_config('paliad.audit_reason', 'mig 151: consolidate duplicate null.* procedural_events per m/paliad#144', true) if sequencing_rules carries the deadline_rules audit trigger (verify before mig).

  4. Archive duplicatesUPDATE procedural_events SET is_active = false, lifecycle_state = 'archived', updated_at = now() WHERE id IN (<duplicate_ids>). Keeps the row + audit trail; just removes from UI.

  5. Post-checkRAISE EXCEPTION if any name whose null.* rows still > 1 AND is_active = true AND lifecycle_state = 'published' survives.

Hard requirements

  • Audit live schema FIRST via mcp__supabase__execute_sql: confirm procedural_events.updated_at exists (it does per the column list), confirm lifecycle_state allows 'archived' (it should), confirm there's no audit trigger on procedural_events that needs set_config.
  • sequencing_rules audit trigger: verify whether mig 140's drop of deadline_rules carried over the audit trigger to sequencing_rules. If yes, add set_config('paliad.audit_reason', ...) at the top.
  • Bulk UPDATE on sequencing_rules — make sure the INSTEAD OF triggers on the deadline_rules_unified view aren't fired by direct sequencing_rules writes (they only fire on view-targeted INSERT/UPDATE).
  • Pure data migration, no schema change.

Acceptance

  1. After mig 151 applies, every name in procedural_events has exactly one is_active=true AND lifecycle_state='published' row.
  2. SELECT name, COUNT(*) FROM procedural_events WHERE code LIKE 'null.%' AND is_active = true AND lifecycle_state = 'published' GROUP BY name HAVING COUNT(*) > 1 returns 0 rows.
  3. /admin/procedural-events page renders the consolidated list — visible row count drops from ~280+ to ~210 (estimated based on the ~20 collision groups × ~2-3 duplicates per group).
  4. No sequencing_rules row is orphaned — every rule's procedural_event_id still points at a non-archived procedural_event.
  5. Migration applies cleanly via the in-process boot path; container restarts cleanly post-deploy.

Anti-patterns

  • DON'T delete the duplicate procedural_events rows — archive them (is_active=false). Audit-trail safety.
  • DON'T use CASCADE — manual reparenting first, then archive.
  • DON'T touch the ~50 unique-name null.* rows — they're legitimate orphan concepts, not duplicates.
## Repro https://paliad.de/admin/procedural-events shows ~20 procedural_events whose names appear 2-6 times each, all with different `null.<8hex>` codes minted in mig 136 (B.1) per Q5/Q11 ratification. m: "this needs to be deduped". Worst offenders (sample, from `SELECT name, COUNT(*) FROM procedural_events WHERE code LIKE 'null.%' GROUP BY name HAVING COUNT(*) > 1 ORDER BY COUNT DESC`): | name | dup_count | |---|---| | Mängelbeseitigung / Zahlung | 6 | | Antrag auf Patentänderung | 2 | | Beginn des Hauptsacheverfahrens | 2 | | Berufungsbegründung gegen eine in Regel 220.1(a) und (b) genannte Entscheidung | 2 | | Berufungsschrift gegen eine in Regel 220.1(a) und (b) genannte Entscheidung | 2 | | ... (more) | 2 | ## Root cause The original `paliad.deadline_rules` had 78 rows with NULL `submission_code` — cross-cutting orphan concepts (Wiedereinsetzung, Schriftsatznachreichung, etc.) that legitimately spanned multiple proceeding types or weren't bound to one. Mig 136's backfill minted a synthetic `null.<uuid8>` code per row, preserving 1:1 mapping. But the same legal concept (e.g. "Mängelbeseitigung / Zahlung") was represented by multiple deadline_rules rows because each rule's anchor / proceeding context differed, even though the user-visible name was identical. Not all `null.*` codes are duplicates — only those whose `name` collides. The bulk (~50/78) are unique names referring to genuinely distinct events. ## Fix ### Scope **Migration 151** (or next free at write-time): 1. **Audit-first DO block** — for every `name` whose `null.*` rows are >1, list the rows (id, name, sequencing_rules-pointing-here count, primary_party_default, concept_id) as a `RAISE NOTICE`. m can copy that output for the audit trail. 2. **Pick canonical per name** — the row with the lowest `created_at` (or `id` UUID alphabetic) wins as canonical. Stable, deterministic. 3. **Reparent sequencing_rules** — `UPDATE sequencing_rules SET procedural_event_id = <canonical_id> WHERE procedural_event_id IN (<duplicate_ids>)`. Requires `set_config('paliad.audit_reason', 'mig 151: consolidate duplicate null.* procedural_events per m/paliad#144', true)` if sequencing_rules carries the deadline_rules audit trigger (verify before mig). 4. **Archive duplicates** — `UPDATE procedural_events SET is_active = false, lifecycle_state = 'archived', updated_at = now() WHERE id IN (<duplicate_ids>)`. Keeps the row + audit trail; just removes from UI. 5. **Post-check** — `RAISE EXCEPTION` if any `name` whose `null.*` rows still > 1 AND `is_active = true` AND `lifecycle_state = 'published'` survives. ### Hard requirements - **Audit live schema FIRST** via `mcp__supabase__execute_sql`: confirm `procedural_events.updated_at` exists (it does per the column list), confirm `lifecycle_state` allows `'archived'` (it should), confirm there's no audit trigger on procedural_events that needs `set_config`. - **`sequencing_rules` audit trigger**: verify whether mig 140's drop of `deadline_rules` carried over the audit trigger to `sequencing_rules`. If yes, add `set_config('paliad.audit_reason', ...)` at the top. - **Bulk UPDATE on sequencing_rules** — make sure the INSTEAD OF triggers on the `deadline_rules_unified` view aren't fired by direct sequencing_rules writes (they only fire on view-targeted INSERT/UPDATE). - Pure data migration, no schema change. ## Acceptance 1. After mig 151 applies, every `name` in `procedural_events` has exactly one `is_active=true AND lifecycle_state='published'` row. 2. `SELECT name, COUNT(*) FROM procedural_events WHERE code LIKE 'null.%' AND is_active = true AND lifecycle_state = 'published' GROUP BY name HAVING COUNT(*) > 1` returns 0 rows. 3. `/admin/procedural-events` page renders the consolidated list — visible row count drops from ~280+ to ~210 (estimated based on the ~20 collision groups × ~2-3 duplicates per group). 4. No `sequencing_rules` row is orphaned — every rule's `procedural_event_id` still points at a non-archived procedural_event. 5. Migration applies cleanly via the in-process boot path; container restarts cleanly post-deploy. ## Anti-patterns - DON'T delete the duplicate `procedural_events` rows — archive them (`is_active=false`). Audit-trail safety. - DON'T use CASCADE — manual reparenting first, then archive. - DON'T touch the ~50 unique-name `null.*` rows — they're legitimate orphan concepts, not duplicates.
mAi self-assigned this 2026-05-26 18:46:31 +00:00
mAi added the
done
label 2026-05-26 18:54:54 +00:00
Author
Collaborator

Mig 152 + Proceeding column ready for merge (t-paliad-321)

Branch: mai/curie/coder-mig152-clone-dedupe

feat(db): mig 152 — dedupe identical sequencing_rule clones (4cd28bc)

Follow-up to mig 151. The 6 "Mängelbeseitigung / Zahlung" sequencing_rules left active after mig 151 are byte-for-byte clones (proceeding_type_id=NULL, rule_code=NULL, duration 14d, primary_party=NULL). Mig 152 archives 5, keeps the lexicographically-lowest UUID as canonical.

  • Audit-first PRE block surfaces every clone-group with canonical id + duplicate ids as a RAISE NOTICE so deploy logs show what's about to be touched (m can spot-check).
  • Snapshot paliad.sequencing_rules_pre_152 in same TX (precedent pre_091/093/095/098/140/151).
  • Reparent paliad.deadlines.sequencing_rule_id duplicate → canonical BEFORE archive (live data: 0 affected; defensive).
  • set_config('paliad.audit_reason', …) defensively — sequencing_rules has no audit trigger today (mig 151 §scope), but a future trigger inherits it.
  • POST assertions: expected archive count, zero clone groups remaining active+published, zero live deadlines pointing at archived rules.

Other name-duplicates (Antrag auf Patentänderung×4, Beginn des Hauptsacheverfahrens×2, Berufungsbegründung-R.220.1×2, Berufungsschrift-R.220.1×2) do NOT collapse under the full-signature query — they have distinct (proceeding_type_id, rule_code, duration, primary_party) signatures and are legitimately different rules per proceeding. Mig 152 leaves them alone.

feat(admin): add proceeding-type column to /admin/procedural-events list (6acb116)

Surfaces the 3-segment proceeding code (e.g. upc.inf.cfi) at position 2 of the rule-editor list so the 4 legitimately-distinct same-named groups are visually disambiguated without opening each row.

  • New LoadProceedingTypeCodes(ctx, rows) on RuleEditorService — single batch SELECT from paliad.proceeding_types, returns id → code map. No N+1.
  • adminRuleResponse gains proceeding_type_code field; wrapRuleListResponse populates it from the map.
  • Frontend: new column in position 2; replaces the legacy Verfahrenstyp column (showed code · name, which duplicated the new content).
  • i18n key admin.procedural_events.col.proceeding (DE: "Verfahren", EN: "Proceeding") + i18n-keys.ts union entry.

Expected mig 152 outcome on apply: 5 archived. If the deploy log NOTICE shows additional groups, surface to m before merging.

Build + vet clean. TestMigrations_NoDuplicateSlot passes.

## Mig 152 + Proceeding column ready for merge (t-paliad-321) **Branch:** [`mai/curie/coder-mig152-clone-dedupe`](https://mgit.msbls.de/m/paliad/src/branch/mai/curie/coder-mig152-clone-dedupe) ### `feat(db): mig 152 — dedupe identical sequencing_rule clones` ([`4cd28bc`](https://mgit.msbls.de/m/paliad/commit/4cd28bc)) Follow-up to mig 151. The 6 "Mängelbeseitigung / Zahlung" sequencing_rules left active after mig 151 are byte-for-byte clones (proceeding_type_id=NULL, rule_code=NULL, duration 14d, primary_party=NULL). Mig 152 archives 5, keeps the lexicographically-lowest UUID as canonical. - Audit-first PRE block surfaces every clone-group with canonical id + duplicate ids as a `RAISE NOTICE` so deploy logs show what's about to be touched (m can spot-check). - Snapshot `paliad.sequencing_rules_pre_152` in same TX (precedent pre_091/093/095/098/140/151). - Reparent `paliad.deadlines.sequencing_rule_id` duplicate → canonical BEFORE archive (live data: 0 affected; defensive). - `set_config('paliad.audit_reason', …)` defensively — sequencing_rules has no audit trigger today (mig 151 §scope), but a future trigger inherits it. - POST assertions: expected archive count, zero clone groups remaining active+published, zero live deadlines pointing at archived rules. Other name-duplicates (Antrag auf Patentänderung×4, Beginn des Hauptsacheverfahrens×2, Berufungsbegründung-R.220.1×2, Berufungsschrift-R.220.1×2) do NOT collapse under the full-signature query — they have distinct (proceeding_type_id, rule_code, duration, primary_party) signatures and are legitimately different rules per proceeding. Mig 152 leaves them alone. ### `feat(admin): add proceeding-type column to /admin/procedural-events list` ([`6acb116`](https://mgit.msbls.de/m/paliad/commit/6acb116)) Surfaces the 3-segment proceeding code (e.g. `upc.inf.cfi`) at position 2 of the rule-editor list so the 4 legitimately-distinct same-named groups are visually disambiguated without opening each row. - New `LoadProceedingTypeCodes(ctx, rows)` on RuleEditorService — single batch SELECT from `paliad.proceeding_types`, returns id → code map. No N+1. - adminRuleResponse gains `proceeding_type_code` field; `wrapRuleListResponse` populates it from the map. - Frontend: new column in position 2; replaces the legacy `Verfahrenstyp` column (showed `code · name`, which duplicated the new content). - i18n key `admin.procedural_events.col.proceeding` (DE: "Verfahren", EN: "Proceeding") + i18n-keys.ts union entry. --- **Expected mig 152 outcome on apply: 5 archived.** If the deploy log NOTICE shows additional groups, surface to m before merging. Build + vet clean. `TestMigrations_NoDuplicateSlot` passes.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: m/paliad#144
No description provided.