9b8a865c5feed76c5ffb209862ffe5b9d97bffdc
3 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
| c901293c9c |
feat(cicd): Slice A — pre-deploy gate + role-split migration smoke
Adds .gitea/workflows/test.yaml that gates every push on `go build`, `bun run build`, `go vet`, the migration coordination check, and the role-split end-to-end migration smoke. On push to main + green, calls Dokploy's compose.deploy API and polls /health/ready until 200. t-paliad-282 / m/paliad#114. Design: docs/design-cicd-pre-deploy-gate-2026-05-25.md (inventor shift on mai/cronus/inventor-ci-cd-pre). Catches all three of today's outage classes: brunel (~13:20) slot collision -> TestMigrations_NoDuplicateSlot hermes (~16:05) dropped-col refs -> TestBootSmoke mig 129 (~14:56) 42501 ownership -> TestMigrations_EndToEndAsAppRole Snapshot approach. internal/db/testdata/prod-snapshot.sql is a pg_dump of youpc-supabase paliad schema + applied_migrations rows. CI restores this into a fresh `supabase/postgres:15.8.1.060` (same image, same role topology as prod) and runs ApplyMigrations as the `postgres` role (which is NOT a superuser on supabase/postgres, matching prod). Existing migrations are skipped (already in applied_migrations); only NEW migs from the PR run end-to-end. This sidesteps the fresh-DB idempotence debt in some historical migrations (mig 037 missing pg_trgm, mig 051 inner COMMIT) — those are tracked separately and don't block the gate. Sub-changes: - internal/handlers/handlers.go — new /health/ready endpoint distinct from /healthz. /healthz stays liveness (process alive, no DB); /ready is readiness (DB pool pings within 2 s). Returns 503 when svc or pool is nil (DB-less deploys are intentionally not-ready). svc.Pool added to handlers.Services, wired in cmd/server/main.go. - internal/db/migrate_test.go — TestMigrations_NoDuplicateSlot (pure unit, catches brunel) and TestMigrations_EndToEndAsAppRole (snapshot- gated, catches the 42501 class). - cmd/server/main_smoke_test.go — TestBootSmoke now also asserts /health/ready returns 503 with a nil svc. New TestHealthReady_Live asserts 200 against a live pool. - internal/db/migrations/024_rename_department_columns.up.sql and 027_rename_to_partner_units.up.sql — ALTER INDEX / ALTER POLICY exception handlers now catch undefined_object OR undefined_table OR duplicate_object. Old handler only caught undefined_object; Postgres raises undefined_table when source object never existed, and duplicate_object when destination already exists. The expanded handlers make these migrations truly idempotent across all plausible starting states. - Makefile — verify-mig-app, test-frontend, refresh-snapshot targets. refresh-snapshot pg_dumps youpc-supabase prod (needs PALIAD_PROD_DATABASE_URL), strips pg16 \restrict commands for pg15 restore compat, and filters applied_migrations rows to this branch's max on-disk version. - internal/db/testdata/README.md — explains the snapshot's purpose, refresh procedure, and how to verify locally. - docs/cicd-runner-setup-2026-05-25.md — one-time admin steps for registering a Gitea Actions runner on mriver and wiring DOKPLOY_TOKEN as a repo secret. Documents soft-launch plan per m's Q11.4 (keep Dokploy's autoDeploy=true webhook alive for one week, disable after the workflow has gated 5 successful deploys). Build clean. Full go test ./internal/... ./cmd/... green without TEST_DATABASE_URL. With TEST_DATABASE_URL + TEST_APP_DATABASE_URL set to a supabase/postgres scratch + snapshot restored: TestMigrations_NoDuplicateSlot, TestMigrations_EndToEndAsAppRole, TestBootSmoke, TestHealthReady_Live all pass. Live-DB service tests in internal/services/* fail under supabase/postgres 15.8 with a 42P08 parameter-binding error (unrelated to Slice A — tracked as a follow-up). |
|||
| 7a359989a9 |
feat(db): t-paliad-218 — gap-tolerant migration runner with applied-set tracker
Replaces the golang-migrate single-counter tracker with a hand-rolled runner over embed.FS that tracks applied state as a set in paliad.applied_migrations (version PK, name, applied_at, checksum). Closes the parallel-merge skip-hole the 2026-05-20 mig-103 incident exposed (m/paliad#44): a migration whose version is missing from applied_migrations runs on the next deploy regardless of which higher versions are already applied. Gaps are first-class. Slice 1 of the design at docs/design-migration-runner-applied-set-2026-05-20.md. All eight design decisions m-picked = inventor recommendation. Runner contract: - Ensure paliad schema → pg_advisory_lock(hash('paliad.applied_migrations')) → CREATE TABLE IF NOT EXISTS applied_migrations. - bootstrapFromLegacyTracker: if applied_migrations is empty and the legacy paliad.paliad_schema_migrations row is present and clean, INSERT rows 1..N for every on-disk version with checksum=NULL via ON CONFLICT DO NOTHING. Hard-fail if legacy tracker is dirty (operator must recover). - scanEmbeddedMigrations: hard-fail on two .up.sql files sharing a version prefix — the failure mode the post-mortem exposed. - checkNameAgreement: hard-fail on rename-after-apply mismatch (disk name for an already-applied version != DB name). - applyOne: SQL body + INSERT(version, name, now(), sha256(file_bytes)) in one transaction. All-or-nothing per migration. Checksums populated on apply for future drift detection; rows backfilled from the legacy tracker carry NULL (we can't fabricate a hash for what golang-migrate applied historically). Verify-on-deploy intentionally deferred to a focused follow-up — single if-block flip when m wants it. Up-only runner. .down.sql files stay in embed.FS as reference; manual roll-back path is psql + DELETE FROM paliad.applied_migrations WHERE version=N. Zero call sites for migrate.Down in the codebase today. Drops github.com/golang-migrate/migrate/v4 from go.mod (no other importers; verified via grep). Tests: - internal/db/migrate_test.go: TestMigrations_DryRun walks pending = on_disk \\ applied (read from paliad.applied_migrations, missing-table → empty set), runs each in BEGIN/ROLLBACK against the scratch DB. - cmd/server/main_smoke_test.go: TestBootSmoke asserts the applied set equals the on-disk set exactly (not just max-version-match) — catches the skip class the post-mortem documented. Dirty-flag check removed (rows are committed or absent, not 'dirty'). - All 45 service-test call sites of db.ApplyMigrations work unchanged (same signature, same fresh-DB behavior). Follow-up: mig 108_drop_legacy_trackers (DROP paliad.paliad_schema_migrations and public.paliad_schema_migrations) after one or two deploys of burn-in on this slice. |
|||
| 586ba29b86 |
feat(test): migration dry-run gate + boot smoke (Slice 1)
Slice 1 of docs/design-paliad-test-strategy-2026-05-19.md — the test infrastructure that would have caught mig 098 (digit-regex) and mig 099 (missing audit_reason) before the deploy hit prod. Three new files + one route addition: - Makefile: `make verify-migrations` (alias `verify-mig`) runs the per-migration dry-run + boot smoke against TEST_DATABASE_URL. Fails fast with a clear error if TEST_DATABASE_URL is unset so CI can't silently pass a missing env var. `make test` and `make test-go` cover the rest of the short / full Go suites. - internal/db/migrate_test.go (TestMigrations_DryRun): walks every pending *.up.sql in numeric order, applies each inside its own BEGIN..ROLLBACK transaction, fails on the first SQL error with the file name + Postgres error. "Pending" = greater than the scratch DB's current tracker version, so fresh-DB CI runs verify everything while developer scratch DBs only re-verify the new pending migration. Always non-destructive — the rollback runs even on success. - cmd/server/main_smoke_test.go (TestBootSmoke): boots the apply path end-to-end, asserts (a) db.ApplyMigrations returns nil, (b) the tracker advanced to the highest *.up.sql version on disk with dirty=false, (c) GET /healthz on the registered mux returns 200. The dry-run catches per-migration syntax errors; this catches the apply+bind path the container actually runs. - internal/handlers/handlers.go: adds a GET /healthz public route — a no-auth, no-DB liveness probe. Used by the boot smoke; also safe for any future orchestrator or uptime check. Both live-DB tests gate on TEST_DATABASE_URL and skip cleanly without it, matching the rest of paliad's live-DB test pattern. Verification: go build ./... clean, go vet ./... clean, go test -short ./internal/... ./cmd/... clean (all packages pass, live-DB tests skip), bun run build clean (2436 i18n keys unchanged). Per CLAUDE.md inventor → coder gate, NOT self-merged. |