Workflow shifted to hand-written numbered migrations; the audit-row SQL
export tool no longer has any consumers. Pure deletion — /admin/rules
and /admin/rules/{id}/edit stay; only the export-to-SQL flow goes.
Deleted:
- frontend/src/admin-rules-export.tsx
- frontend/src/client/admin-rules-export.ts
Removed:
- routes GET /admin/rules/export and GET /admin/api/rules/export-migrations
- handleAdminExportRuleMigrations + handleAdminRulesExportPage
- RuleEditorService.ExportMigrationsSince + ExportResult + sqlEscape helper
- build.ts entries (import, client bundle, dist HTML write)
- Sidebar "Regel-Migrations" nav item + "Migrations exportieren" button on /admin/rules
- all admin.rules.export.* + nav.admin.rules_export + admin.rules.list.export i18n keys (DE+EN)
- .admin-rules-export-* CSS rules (dead after page deletion)
Doc references in design-fristen-phase2-2026-05-15.md and
design-paliad-data-export-2026-05-19.md updated to mark the endpoint as
removed (acceptance #2 requires grep to return zero hits).
47 KiB
Paliad data export — Excel-first, scoped (org / project-subtree / personal)
Design: archimedes (inventor), 2026-05-19.
Task: t-paliad-214.
Branch: mai/archimedes/inventor-excel-data.
Status: READY FOR REVIEW — no code yet, awaiting m go/no-go on §11 open questions.
0. Premise check (live state, 2026-05-19)
Verified directly against the youpc Postgres paliad schema rather than against memory or older design docs.
Migration tracker. Latest applied is 100_ccr_visible_rule; next is 101.
Row counts (org-wide today):
| table | rows |
|---|---|
| users | 47 |
| projects | 11 |
| deadlines | 26 |
| appointments | 5 |
| parties | 0 |
| notes | 4 |
| documents | 0 |
| project_events (audit) | 93 |
| project_teams | 3 |
| approval_requests | 8 |
| approval_policies | 160 |
| checklist_instances | 4 |
| deadline_rules | 254 |
| user_views | 2 |
| partner_units | 11 |
A full org export today is < 600 rows of user content plus reference data — synchronous streamed download is plausible for every scope. We design for an order-of-magnitude head-room.
Auth. Passwords live in Supabase Auth (separate auth schema, not paliad). The paliad.users table has no password_hash column — so the "don't export credentials" rule from the brief is enforced by absence, not by a column-deny list. Good.
Visibility. Row-level via paliad.can_see_project(project_id) (subtree-aware through ltree path). Already used as the predicate that gates every list endpoint. We reuse it for the personal and project scopes; the org scope bypasses it under global_admin.
Documents. Table exists, 0 rows. Phase H (AI Frist-Extraktion) is deferred per m's 2026-04-16 call. No ANTHROPIC_API_KEY on Dokploy. Therefore this design does not concern itself with binary attachments — only with the metadata row when documents start landing.
Audit trail. Lives in paliad.project_events (93 rows). One row per lifecycle event with event_type, metadata jsonb, event_date, created_by. The auditing union (AuditService.ListEntries) joins 5 sources (project_events, partner_unit_events, deadline_rule_audit, policy_audit_log, reminder_log). For the export we treat project_events as primary; the four auxiliary logs are scope-specific.
Existing export precedent. (Originally pointed at the admin rule-migration export. That tool was deleted in m/paliad#129 / t-paliad-297. The gating pattern — adminGate(users, …) on a download endpoint that streams a generated artifact — still lives on other admin handlers, e.g. handleAdminDownloadBackup for /api/admin/backups/{id}/file.) Re-use the gating helper.
No Go xlsx library on go.mod today. This design picks github.com/xuri/excelize/v2 in §3.
1. Why this exists
Two motivations, both load-bearing:
-
Safety / backup. A workbook on disk is a portable artifact independent of the running app. If paliad.de is down, a partner needs the matter file. If the Dokploy compose corrupts, IT needs a recent dump. If a deadline gets accidentally deleted, we want a recoverable snapshot.
-
No lock-in. A team or an entire org choosing to leave paliad must be able to walk away with their entire dataset in a format anyone can open. We promise this in writing as a trust signal — exactly because the alternative (silently locking customers in) is what we built paliad to not be.
The export is therefore not a "nice analytics feature" — it is a contractual guarantee that the data is yours. That framing shapes the design: completeness > convenience, portability > polish, every export auditable.
2. Scope definitions (precise)
Three scopes. The boundary is what the caller is allowed to see, joined with what makes the artifact interpretable standalone.
2.1 org scope
Caller: global_role='global_admin' only. There is no firm-admin role distinct from global_admin in paliad today (see §4).
Content: literally everything in the paliad schema that is user content or reference data the workbook needs to be readable. Specifically:
| sheet | source table(s) | notes |
|---|---|---|
projects |
paliad.projects (all rows) |
Full project tree including soft-deleted (status='deleted' / 'closed' if any). |
project_teams |
paliad.project_teams |
profession + responsibility (post-t-148). |
project_partner_units |
paliad.project_partner_units |
Derivation grants. |
deadlines |
paliad.deadlines |
Including completed, cancelled. |
appointments |
paliad.appointments |
Including completed. |
parties |
paliad.parties |
All client / opposing-party data. |
notes |
paliad.notes |
All four polymorphic targets resolved into the target_kind/target_id columns. |
documents |
paliad.documents metadata (file_path, file_size, mime_type, ai_extracted) |
Binaries excluded (open Q1). |
audit_events |
paliad.project_events |
Full audit trail per project. |
approval_requests |
paliad.approval_requests |
Including completed / rejected, with requester_kind + agent_turn_id. |
approval_policies |
paliad.approval_policies |
Both project-scoped and partner-unit-defaults. |
policy_audit_log |
paliad.policy_audit_log |
Source #5 of the audit union. |
partner_units |
paliad.partner_units |
Org chart. |
partner_unit_members |
paliad.partner_unit_members |
Including unit_role. |
partner_unit_events |
paliad.partner_unit_events |
Org-chart audit. |
checklist_instances |
paliad.checklist_instances |
Per-project completion state. |
invitations |
paliad.invitations (status, role, expires_at) |
Without raw tokens (open Q7). |
users |
paliad.users (id, email, display_name, office, profession, …) |
Excludes email_preferences jsonb only if it carries channel secrets — none do today, but checked at export time. |
user_views |
paliad.user_views |
Saved filters / custom layouts. |
user_card_layouts |
paliad.user_card_layouts |
Project-card layouts. |
user_pinned_projects |
paliad.user_pinned_projects |
Per-user pins. |
user_caldav_config |
paliad.user_caldav_config without the ciphertext column |
URL + calendar IDs + last_sync; passwords NEVER exported. |
reminder_log |
paliad.reminder_log |
Outbound digest history. |
caldav_sync_log |
paliad.caldav_sync_log |
Per-user sync runs. |
paliadin_turns |
paliad.paliadin_turns |
Excluded by default in org export (privacy — see §6) — admins opt in per Q5. |
email_broadcasts |
paliad.email_broadcasts |
Outbound broadcast history. |
email_templates + _versions |
both | Custom firm templates. |
| reference (read-only): | proceeding_types, event_types, event_categories, deadline_rules, deadline_concepts, deadline_concept_event_types, deadline_event_types, event_category_concepts, trigger_events, holidays, courts, countries |
One sheet per table, prefixed ref__. Embedded so the workbook is interpretable without paliad context. |
| deferred audit (admin opt-in): | deadline_rule_audit, policy_audit_log, partner_unit_events, caldav_sync_log, paliadin_turns |
Behaviour per Q5/Q6. |
Excluded unconditionally:
auth.*(Supabase Auth schema — not ours; the user can request their auth record from Supabase directly).paliad_schema_migrations(operational, no business meaning).*_pre_NNNshadow / pre-migration backup tables (rows are duplicates; the live table is canonical).- Any future
*_secret/*_tokencolumns (see §6 deny-list mechanism).
Edge cases:
- Soft-deleted rows: paliad currently has no soft-delete columns (
deleted_atetc.). When that lands, the org export includes them by default with adeleted_atcolumn populated. Until then, this is a no-op. - Archived projects:
projects.statuscan be'closed'or future'archived'— export includes them (the whole point of backup is recoverability of closed matters). - Counterclaims:
projects.counterclaim_ofis a self-FK. Export carries the column as-is; the relationship is reconstructable via theidcolumn.
2.2 project scope
Caller: any team member of the project who passes the §4 profession-tier gate.
Content: one project + all descendants along the ltree path. The descendant walk is WHERE path <@ root.path (subtree-inclusive of root). Every entity gets filtered through WHERE project_id IN (subtree_ids).
Per-sheet inclusion:
projects(root + descendants, one row each)project_teams(membership for those projects)project_partner_units(derivation attachments)deadlines,appointments,parties,notes,documents(metadata),audit_events,approval_requests,checklist_instances— all scoped to subtree- users sheet — restricted columns: only
id, email, display_name, office, professionfor users referenced by any FK in the export (created_by, assigned, etc.). Don't dump all 47 users when you only need 4. (Avoids accidental org-chart leak in a project-scope export shared externally.) - reference data:
ref__proceeding_types,ref__event_types,ref__deadline_rules,ref__deadline_concepts,ref__courts,ref__countries,ref__holidays. Same as org but a smaller universe is acceptable too — the v1 ships the full reference tables for simplicity (every row count is ≤ 300; size is moot). - Cross-project references (e.g., a party referenced by a project outside the subtree): out of scope by the predicate. The export carries the foreign UUID so a re-import or merge could re-link, but the foreign row itself is not in the workbook. Edge case is rare —
counterclaim_ofis the only known cross-project pointer today.
Edge cases:
- Partner-unit data:
partner_unitsis org-wide; project export carries only the unit ids attached viaproject_partner_units. The unit name + membership are loaded into the workbook onpartner_unitsandpartner_unit_memberssheets (filtered to the attached units only). - Policies:
approval_policiesrows include both project-scoped (the project + ancestors) and partner-unit-defaults attached to this project. Same MAX-of-sources logic as runtime. - Audit:
project_eventsfor the subtree + (admin opt-in only)deadline_rule_auditrows whose rule was used by any deadline in the subtree. Default off — these are firm-wide curation logs and don't belong in a per-project handoff.
2.3 personal scope
Semantics: "everything I can see right now in paliad, framed as my data."
That definition resolves the ambiguity in the brief: personal scope is not "rows where I am created_by" — that misses everything I see by being on a team. It is the RLS-visible projection of the schema for caller=me, plus a handful of explicitly-personal sidecars (caldav config, my pins, my views).
Per-sheet inclusion:
| sheet | rows |
|---|---|
projects |
WHERE paliad.can_see_project(id) for the caller. |
project_teams |
Rows where user_id = me OR the row's project is in my visible set. |
deadlines |
Same project-visibility filter. |
appointments |
Same. |
parties, notes, documents metadata, audit_events, checklist_instances |
Same. |
approval_requests |
Rows where requested_by = me OR decided_by = me OR project ∈ visible set. |
me (single-row sheet) |
Caller's users row (id, email, display_name, office, profession, reminder_*, lang, escalation_contact_id). |
my_caldav_config |
The caller's user_caldav_config row without the encrypted password column — sync URL, calendar IDs, last_sync_at. |
my_views |
Caller's user_views rows. |
my_pinned_projects |
Caller's user_pinned_projects rows. |
my_card_layouts |
Caller's user_card_layouts rows. |
my_paliadin_turns |
Caller's paliadin_turns rows (currently restricted to PaliadinOwnerEmail = m, so this sheet is empty for everyone else). Sensitive: AI prompts + responses. Default on for personal scope — it's literally the caller's data. |
users_referenced |
Restricted: id + display_name + email for users referenced as FKs in the export. |
| reference tables | Same set as project scope. |
Edge cases:
- Caller leaves a team: the export reflects the moment-in-time visibility. A
generated_attimestamp in the workbook header (__metasheet) anchors this. - Caller is a global_admin: their personal export is the entire org (because their visible set = all projects). This is by design — but we surface a banner ("Sie sehen alles, weil Sie global_admin sind. Ein org-scope-Export wäre identisch.") so they don't get confused thinking the personal-scope endpoint is broken.
- Caller has no team memberships: export contains the empty workbook + the
merow + their caldav config + views/pins. Still useful — they can save their preferences.
2.4 Common columns across all scopes
Every export workbook contains a __meta sheet:
schema_version: 1
firm_name: HLC # from internal/branding.Name
scope: org | project | personal
scope_root_id: uuid or NULL # the project id for project-scope, NULL otherwise
generated_at: 2026-05-19T14:23:00Z
generated_by_user: <uuid> <email> # the caller
generated_by_label: archimedes / m / ... # display_name
row_counts: JSON {"projects": 11, ...}
paliad_version: <git sha at server build>
notes: free-form, e.g., "documents binaries excluded by design"
This pins provenance + reproducibility + diffability.
3. Format choices
3.1 xlsx as the primary format
Library: github.com/xuri/excelize/v2. De-facto Go xlsx library, pure-Go (no cgo, no external libreoffice), MIT, streaming writer for large workbooks, broad format-feature support (number formats, freeze panes, hyperlinks, sheet hide). The streaming writer (NewStreamWriter) is what we use — it writes rows one at a time without holding the whole sheet in memory. At 11-projects scale this is unnecessary; at 11k-projects scale it's essential, so we set the pattern now.
Why not the alternatives:
tealeg/xlsx— older, unmaintained, no streaming.qax-os/excelize— same project as xuri/excelize (the github org renamed); xuri is the upstream.360EntSecGroup-Skylar/excelize— defunct fork.
Workbook structure: one sheet per entity type, never a mixed-type sheet with conditional columns. Reasons:
- Excel users sort + filter by column; a column that means "deadline due_date" on row 4 and "appointment start_at" on row 12 is unusable.
- The "self-describing" promise (no-lock-in) is satisfied by a workbook where every sheet is a flat table with stable column headers, not by a polymorphic blob.
- Cross-sheet relationships are represented by UUIDs in foreign-key columns + a
__lookupsheet pairing UUID → display label (project title, user email) for the workbook's lifetime. This makes the workbook self-joining in Power Query / pivot tables.
Sheet conventions:
- Sheet names use
snake_casematching SQL table names (deadlines, notFristen). Reference tables prefixedref__. Personal sidecars prefixedmy_. Meta sheet__meta. The__lookupsheet sits last. - Row 1 = column headers; frozen.
- Column 1 of every entity sheet is
id(uuid). - Dates: ISO 8601 UTC for timestamptz;
YYYY-MM-DDfordate. Always as Excel strings (not Excel date types) — Excel-date interpretation differs by locale (DE:Tag.Monat.Jahr, EN:Month/Day/Year) and silently corrupts on round-trip. A pinned ISO string is unambiguous and re-importable. Open Q4 covers whether to also mirror to native Excel dates for human convenience. - Booleans: literal
TRUE/FALSEstrings, same reason. jsonbcolumns: serialised as compact JSON one-liners in the cell. Cell type = string. Power Query canJson.Documentthem.- Arrays (e.g.,
additional_offices text[]): semicolon-joined string. Excel's CSV-array convention is the comma but our office codes use commas; semicolon avoids the collision. text[uuid[]]paths (the projects.path ltree): exported as the canonical dotted-uuid string.
Encoding: UTF-8 always. Excelize handles the xlsx packaging which is unicode-native. Umlaute round-trip correctly (verified pattern with tesla's CSV export in t-paliad-177).
3.2 CSV + JSON siblings
Per the no-lock-in promise, xlsx is not enough on its own — Excel is a proprietary format owned by Microsoft, and a workbook is opaque without a tool that understands it. For genuine portability we also produce:
- CSV: one file per entity sheet (no reference sheets — those go as JSON), UTF-8 with BOM (
\xEF\xBB\xBF) for Excel-DE compat, RFC 4180 quoting, headers row 1. Identical column shape to the xlsx sheet. - JSON: a single
paliad-export.jsonper scope, top-level{"meta": {...}, "tables": {"projects": [...], "deadlines": [...], ...}}. Easiest for programmatic re-ingest. Reference tables included.
Delivery shape: all three formats live inside one .zip per export:
paliad-export-<scope>-<timestamp>.zip
├── README.txt # human-readable: what this is, how to read it
├── paliad-export.xlsx # canonical workbook
├── paliad-export.json # JSON twin (machine-readable)
├── csv/
│ ├── projects.csv
│ ├── deadlines.csv
│ ├── ...
│ └── ref/
│ ├── proceeding_types.csv
│ └── ...
└── __meta.json # standalone meta (same content as __meta sheet)
The .zip is the artifact users download. Default content is "all three" — there's no UI knob to pick (open Q1: should there be? Inventor pick = no, zip-only).
Filename convention:
paliad-export-{scope}-{timestamp}.zip
scope = org | project-<root-short> | personal
timestamp = YYYY-MM-DDTHHMMZ # UTC, no colons (Windows-safe)
Examples: paliad-export-org-2026-05-19T1423Z.zip, paliad-export-project-Siemens-AG-2026-05-19T1423Z.zip, paliad-export-personal-2026-05-19T1423Z.zip. The project-short is slugify(root.title) capped 40 chars.
Determinism (Q6 question). Two exports of the same scope at the same row state must produce byte-identical workbooks. xlsx is internally a zip of XML — file order in the zip is significant; excelize's default zip writer is non-deterministic. We can make this deterministic by sorting the file list before writing. JSON: keys sorted alphabetically. CSV: rows ordered by id ASC (stable). The only inherently non-deterministic field is generated_at; we externalise it to the filename and the __meta sheet, but the rest of the workbook is byte-stable. Inventor pick: yes, deterministic. Lets users diff exports and prove "nothing changed between Tuesday and Thursday."
3.3 Future-proofing — schema_version
__meta.schema_version = 1. When we add columns (e.g., projects.archived_at lands), we bump to 2 and note the additions in a docs/export-schema-changelog.md. Importers (us in the future, or a re-importer at a different firm) read schema_version first.
4. Authorization model
Tightly mirrored to existing paliad role surfaces. No new roles introduced.
| Scope | Required auth |
|---|---|
org |
paliad.users.global_role = 'global_admin'. Same gate as /admin/* pages (auth.RequireAdminFunc in handlers.go:417). |
project |
Caller must (a) pass can_see_project(root_id), AND (b) have effective project profession ≥ associate on the root. The associate floor mirrors the conservative seed in approval_policies (t-154); paralegals + PA can see data but not extract it. m-tunable per Q2. |
personal |
Any authenticated user. No additional gate. |
Profession ladder check for project scope uses the existing DerivationService.EffectiveProjectRole (t-139 phase 2) — direct membership > ancestor > derived via partner-unit. Same surface that gates approvals; same surface gates extracts.
Audit row written on every export run. A new event_type into paliad.project_events for project-scope (so it appears on the project's Verlauf), partner_unit_events for org-scope (so it appears on the partner-unit audit log of the firm-admin's home unit), and policy_audit_log is too narrow — we likely want a new audit table for org-wide actions, OR we widen project_events to allow project_id = NULL org-wide rows. Inventor pick: new table paliad.system_audit_log — clean separation, integrates into the existing 5-source AuditService union as source #6. Migration 101 adds it.
system_audit_log columns:
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
event_type text NOT NULL, -- 'data_export'
actor_id uuid REFERENCES paliad.users(id),
actor_email text NOT NULL, -- captured at write time, survives user deletion
scope text NOT NULL, -- 'org' | 'project' | 'personal'
scope_root uuid, -- project_id for project scope, NULL otherwise
metadata jsonb NOT NULL DEFAULT '{}'::jsonb, -- {"formats":["xlsx","json","csv"], "row_counts":{...}, "file_size_bytes":12345, "filename":"..."}
created_at timestamptz NOT NULL DEFAULT now()
The audit row is written before the export runs (so failed exports are still recorded) and updated with file_size_bytes + final row_counts on success. Failure case: separate event_type='data_export_failed' row with the error string in metadata. The audit chain is the trust signal — m sees who exfiltrated what, when, and how much.
Headers on the response:
Content-Disposition: attachment; filename="paliad-export-<scope>-<ts>.zip"X-Paliad-Export-Audit-Id: <system_audit_log.id>— so an automated client can reference the audit row.
5. Trigger model
Three trigger surfaces:
5.1 On-demand button
- Personal:
/settings→ "Daten exportieren" card → button. POST/api/me/export→ 200 withContent-Type: application/zip. Synchronous. - Project:
/projects/{id}→ settings/cog menu → "Daten dieses Projekts exportieren". POST/api/projects/{id}/export→ 200 zip. Synchronous. Includes a "Inkl. Unterprojekte" toggle hint (it's always subtree-inclusive — the toggle is purely informational, no off switch). - Org:
/admin/data-export(new page, card on/admin) → "Org-Export erstellen" button. POST/api/admin/export/org→ async by default (see §6.1). Returns 202 +job_id. UI polls/api/admin/export/org/jobs/{id}for status.
Why org is async even at today's scale: the principle isn't "is it slow now" — it's "the trigger model should not change as the firm grows." If the partner with the firm-wide button gets a different UX from the associate with the project button, we'd retrofit later. Sync at 600 rows works fine; the wrapping is goroutine + channel + Server-Sent Events for live progress, no new infra needed. See §6.1.
5.2 Scheduled exports
Inventor pick — defer to slice 4. Out of v1 scope. The reasoning: scheduling sits on storage + delivery + retention, all of which are also deferred to slice 3+. Building the scheduler before we know how + where the artifact lives is premature.
When it lands (slice 4), the model is:
- A new
paliad.scheduled_exportstable:(id, scope, scope_root_id, owner_user_id, cadence, last_run_at, next_run_at, delivery)wheredeliveryis{kind: 'email-link' | 'caldav-style-webdav', config: jsonb}. - A daily cron (mai cron or a
time.Tickergoroutine) checksnext_run_at < now(), runs the export, posts the link via the configured delivery channel. - Cadence: weekly + monthly + on-status-change (e.g., "export when project closes" — a webhook from
projects.statustriggers).
For now (slice 1-2), users can right-click the on-demand button and bookmark the URL — that's the only scheduled-export-y thing we offer, and it's intentional: get the manual flow rock-solid before adding cadence.
5.3 API endpoint
Same endpoints as §5.1, callable directly with the standard cookie / bearer auth. We don't add a separate "API key" surface in v1 — paliad doesn't have personal access tokens today. If a user wants to script their personal export weekly, they can use cookie auth from m/paliad automation; that's enough until power-user volume justifies a real PAT surface.
For machine ergonomics: the /api/...export endpoints accept ?format=zip (default), ?format=xlsx, ?format=json, ?format=csv-zip query params. Only zip is documented; the others are internal but reachable for automation.
6. Storage + delivery
6.1 Synchronous vs async — per-scope picks
Personal, project: Synchronous, streamed. The handler holds the HTTP connection open, writes the zip directly to http.ResponseWriter. For 1MB-class exports (today's reality at every scale up to thousands of rows per entity) this is the right call — no persistence, nothing to garbage-collect, nothing leaking onto disk. Excelize's NewStreamWriter flushes rows as they're written so RAM stays bounded.
Org: Asynchronous, in-process queue, on-disk artifact.
- Submit (
POST /api/admin/export/org) writes asystem_audit_logrow with statuspendingand dispatches a goroutine. - The goroutine writes the zip to
/var/lib/paliad/exports/{audit_id}.zip(configurable viaPALIAD_EXPORT_DIR; on Dokploy this is a mounted volume). - The goroutine updates the audit row's metadata with progress, then status
donewithfile_size_byteson success. - The user polls
GET /api/admin/export/org/jobs/{audit_id}(SSE or simple JSON) — when ready, a download linkGET /api/admin/export/org/jobs/{audit_id}/downloadserves the file. - Download deletes the file by default (one-shot link), or keeps it per Q3.
Why not S3-style bucket? Paliad already has a documents table that will need a binary store, eventually. Coupling export storage to that future store is right — but the future store doesn't exist yet, and we don't want to provision MinIO on mlake purely for exports. Inventor pick: local disk in PALIAD_EXPORT_DIR until/unless we provision a real object store; at that point the export storage moves there transparently.
6.2 Retention (Q3)
Inventor pick: 7 days, then auto-delete. Justifications:
- Exports contain sensitive client data — minimising the retention window minimises blast radius if the Dokploy host is compromised.
- 7 days covers a holiday-week round-trip ("I exported Friday, want to look at it Monday next week, missed the day-1 link").
- The audit row in
system_audit_logpersists forever — you can always tell that an export happened, even after the artifact is deleted.
A cleanup goroutine runs daily, lists system_audit_log rows older than 7 days with non-NULL file_path, deletes the file, sets metadata.deleted_at. Audit row stays.
The PALIAD_EXPORT_RETENTION_DAYS env var is the knob (default 7). m-tunable per firm.
6.3 PII / GDPR
This is where the design gets serious.
At-rest encryption. Files in PALIAD_EXPORT_DIR are plaintext on the Dokploy volume. The volume itself is encrypted at the host layer (Hostinger VPS disk encryption). We do not layer additional file-level encryption on the artifact — that would require a per-user key, key escrow, key rotation, all of which is over-engineered for a 7-day-retention exfil where the link is single-use behind cookie auth. The disk encryption + 7-day TTL + audit log is the trust boundary.
In-transit encryption. TLS via Dokploy + Traefik — paliad.de is Let's Encrypt-served. No raw HTTP path.
Download authentication. The download link /api/admin/export/org/jobs/{audit_id}/download requires the same cookie auth as the submit. No public signed URLs in v1 (deferred per Q8). When we add scheduled exports + email delivery (slice 4), we'll need expiring signed URLs — that design is captured then, not now.
Data-subject requests. A user invoking /api/me/export is, in effect, performing a self-serve GDPR Art. 15 data-portability request. Audit row records the request. If the firm receives a third-party DSR ("export the data my client Mr. Müller asked for"), a global_admin can run a project-scope export filtered to projects involving that client; this is a manual workflow we don't automate in v1 (open Q9).
Right-to-erasure. Out of scope. Erasure is a write path; export is read-only. They share no code.
External sharing of export files. A user who downloads an export and emails it to an external party has done so on their own authority and outside paliad's protection. We don't watermark the file (debated and rejected: watermarking introduces non-determinism, breaks diffability, and gives false security — anyone reading the zip can strip metadata). What we do document in the embedded README.txt:
Diese Datei enthält möglicherweise vertrauliche Mandantsdaten. Sie wurde erzeugt am {generated_at} durch {actor_email} aus Paliad ({firm_name}). Die Weitergabe an Dritte erfolgt in eigener Verantwortung des Empfängers.
A simple "you broke the seal" notice is what we offer. It's a contract, not a control.
PII column deny-list. Hard-coded in internal/services/export_service.go:
paliad.users.password_hash— doesn't exist, but the deny-list is the safety net if it ever does.paliad.user_caldav_config.encrypted_password— explicit drop.- Any column whose name matches
(?i)secret|token|password|api[_-]?key|private[_-]?key— caught at column-discovery time, errors loudly intosystem_audit_log.metadata.warnings. paliadin_turns.assistant_response— present in personal export of caller's own data; off in org export by default (m's call per Q5).
6.4 GDPR-completeness note
The export of one user's personal scope is a partial Art. 15 disclosure — it contains what's in paliad's control. Other systems (Supabase Auth row, mlake logs, CalDAV provider) are out of paliad's scope and not in the export. The embedded README states this explicitly so the user knows the workbook is the paliad-side answer, not a complete personal-data dump from "the firm."
7. Slice plan
Tracer-bullet shipping. Each slice is independently shippable and reviewable. The first slice closes the no-lock-in promise for the smallest, lowest-risk scope; later slices widen.
Slice 1 — personal export, synchronous, xlsx + JSON
- Adds
excelize/v2togo.mod. - New
internal/services/export_service.gowith the column-discovery + writer plumbing for xlsx + JSON. - New
internal/handlers/export.gowithPOST /api/me/export. - New
/settingsUI: "Daten exportieren" card + button. - Migration 101:
paliad.system_audit_log+AuditService.ListEntries6th union branch. - i18n keys (
settings.export.*,__meta.*). - Tests:
export_service_test.gocovers xlsx structure (one row each kind), JSON shape, PII deny-list.
Ships the no-lock-in promise for every user immediately. ~600-800 LoC + ~25 i18n keys.
Slice 2 — project export, synchronous, xlsx + JSON + CSV-zip
- Generalises the export_service to scope-aware queries (the visibility predicate gets injected per scope).
- New
POST /api/projects/{id}/export, gated by §4. - Adds CSV writer alongside xlsx + JSON; bundles all three into
.zip. - Project-detail UI gets the export menu entry.
- README.txt template embedded.
- Tests + e2e (Playwright) on the project page button.
~800-1000 LoC. The CSV path generalises the xlsx column-discovery so the marginal cost is low. After this slice, two of three scopes are shipped and synchronous serves both.
Slice 3 — org export, async with job tracking
- Adds the goroutine + on-disk artifact path +
PALIAD_EXPORT_DIRenv. POST /api/admin/export/org+ job status + download endpoints.- New
/admin/data-exportpage (card on/admin/). - Cleanup goroutine (daily, deletes artifacts >
PALIAD_EXPORT_RETENTION_DAYS). - Refactor: extract the now-common "writeExportToWriter" core from the synchronous path so async re-uses it.
~600-800 LoC. After this slice, all three scopes ship + audit trail is complete.
Slice 4 — scheduled exports (deferred, not v1)
Designed in §5.2; building deferred until at least 2 firms ask. The contract surface is the scheduled_exports table + cadence + delivery channel.
Slice 5 — API ergonomics (deferred)
Personal Access Tokens (the "I want to cron my own export" surface). Until there's a customer, we don't build the PAT issuer + revocation + audit.
Slice 6 — GDPR DSR helpers (deferred)
A /admin/data-subject-request workflow to assemble a per-natural-person export across projects. Built on Slice 1-3 primitives; not blocked by them.
Slice 7 — document binary inclusion (deferred until documents have rows)
When the documents table starts holding real files, the export adds a documents/ subdir in the zip with the actual files, keyed by filename = {document_id}.{ext}. The metadata sheet links by id. Adds ~150 LoC + an env var for the file backend.
Critical-path slices for v1: 1 + 2 + 3. Everything after is layered, optional, m-prioritised when there's a real customer pull.
8. Trade-offs flagged
-
xlsx-first means we own the
excelizedependency forever. Mitigation: excelize is the canonical Go xlsx — replacing it would be a multi-thousand-LoC migration, but the upstream is healthy (MIT, 17k+ stars, monthly releases). Acceptable lock-in. -
Determinism (sorted file order, sorted JSON keys, row-id-ordered CSV) is implementation discipline, not a library default. Test that breaks if any future change introduces non-determinism is essential (helps reviewers + prevents regressions).
-
Synchronous personal + project means a runaway export can block a request goroutine for seconds. At today's data shape this is sub-second. Watchdog: a 30s context deadline on synchronous exports; over that, return 503 with "export too large — contact admin for async." Triggers slice 3 → slice 4 of the user's mental model.
-
Per-scope endpoints triplicate similar code paths. Mitigated by the shared
ExportSpecstruct + scope-aware predicate injection. Read carefully in code review — this is the place subtle scope leaks creep in. -
JSON twin is genuinely redundant for human users. It's there for the no-lock-in promise (a Python script can re-ingest without Excel). The cost is one extra file in the zip + one extra serialisation pass. Acceptable.
-
No diff tooling — yet. Determinism enables
diff -rbetween two extracted zips, but no in-app surface. Slice 4+ may layer "show me what changed between Monday's and Friday's export" once exports are scheduled and stored. -
paliadin_turnsprivacy default. Currently restricted toPaliadinOwnerEmailso the table is empty for every other user. Personal export carries them by default ("your AI history"); org export by default does NOT (admin opt-in via?include=paliadin_turns). When Paliadin opens past owner-only (post-API cutover), revisit. -
Reference-data inclusion bloats every export. 254 deadline_rules + 102 trigger_events + 56 concepts + … = ~1000 reference rows in every workbook regardless of scope. At zip-compressed sizes this is < 100KB and worth the standalone-interpretability. If the workbook gets too large later, ship reference data as a separate "paliad-reference-snapshot.zip" once + reference it from each export's README.
-
Org export volume at firm-scale. A 10k-project firm has ~50k deadlines and ~200k audit events. Even at 200 bytes/row average that's < 100MB — comfortable for the async path with 4GB Dokploy RAM. Threshold concerns kick in at 1M+ rows, which is firm-class-of-100-attorneys territory. Designed for, not blocked on.
-
Audit-log explosion. A nightly cron + 47 users self-exporting = 50 audit rows / day. At a year that's 18k rows. Still trivial. No retention on the audit chain (the artifact retention does NOT touch audit-log retention — the audit chain is the trust signal, see §4).
9. Recommended implementer
Single PR, layered slices 1 → 2 → 3 as separate commits. No DB-heavy migrations; the only schema add is system_audit_log (one table, one trigger if any). The hard work is in the writer abstraction.
- Slice 1: pattern-fluent Sonnet coder. ~600-800 LoC, mostly bookkeeping. Tests pin the shape.
- Slice 2: same hands as slice 1 (continuity matters here — the writer abstraction is set in slice 1 and the project scope generalises it).
- Slice 3: same hands again. The async path is its own subsystem but the writer is unchanged.
NOT cronus per memory directive 2026-05-06 (retired from paliad). NOT m — this is a coder task end-to-end.
10. Inventor → coder transition (GATED per project CLAUDE.md)
Per .claude/CLAUDE.md: design phase ends here. No code touches the tree from inventor. Head's mai-head skill gates the coder shift after m's go on §11 open questions.
When approved, the coder shift opens on mai/<coder-name>/data-export-slice-1 (fresh branch off main, NOT off the design branch — design doc commit is the only artifact this branch carries forward via cherry-pick).
11. Open questions for m
The brief lists 8 candidate questions. After live-state verification I've collapsed + sharpened to 9, each with an inventor pick + reasoning. Will be asked sequentially via AskUserQuestion (paliad dogma — no ## §X.Y markdown dump on m, per t-paliad-154 lesson).
Q1 — Bundle xlsx + CSV + JSON in one zip, or let user pick format?
Inventor pick: bundle all three in one zip, no UI knob.
Reasoning: the no-lock-in promise requires the JSON twin (Excel-independent re-ingest); the xlsx is the human-readable default; CSV is the universal lingua franca. Picking only one breaks the promise for some user. Bundle size at today's scale is < 1MB; even at firm-scale it's well under the email-attachment limit. The cost of a checkbox UI is more than the cost of three extra files.
Alternative: offer ?format=xlsx-only|json-only|csv-only query params for the API surface, default to bundle. Documented in README only. We do this in v1 anyway since multi-format is what generates the zip in the first place.
Q2 — Project-scope profession floor: associate (inventor pick) or member?
Inventor pick: associate floor.
A project export carries party names, addresses, decision-history, draft strategy notes. That's "I can write a paper for the partner" data, not "I can see the deadline calendar" data. Member is the bare-visibility tier (you got added to the team). Export is exfiltration — needs the next tier up.
Alternative: gate by responsibility ∈ {lead, member} (no profession floor, only the project-team responsibility check). Cleaner architecturally — separates the "can see" axis from the "can extract" axis using the same fields. Less restrictive in practice.
Worth choosing now because the gate text in the audit row mentions the tier.
Q3 — Org-export artifact retention: 7 days (pick) or 30 / 90?
Inventor pick: 7 days.
Default conservative. m-tunable per firm via env var.
Q4 — Excel dates: ISO strings only (pick) or also a mirrored native-Excel-date column?
Inventor pick: ISO strings only.
Native Excel dates are locale-poisoned (DE vs EN epoch interpretation flips, round-trip corruption when re-saved). ISO is the universal answer. Power users who want a sortable native-date column can derive it once in their workbook — but the canonical export stays unambiguous.
Q5 — paliadin_turns in org export: opt-in only (pick), or include by default?
Inventor pick: opt-in via ?include=paliadin_turns query.
Today it's m-only data (PaliadinOwnerEmail gate), so the privacy stakes are low — but the moment Paliadin opens beyond owner-only, the AI conversation history per user is the most sensitive personal data we carry. Setting the off-by-default precedent now means we don't accidentally start dumping it later.
Q6 — Deterministic byte-for-byte exports: yes (pick) or accept timestamp drift in zip metadata?
Inventor pick: yes, deterministic.
Lets users diff exports across time. Cost: ~50 lines of sort.Strings + a custom zip writer with stable ordering. Worth it.
Q7 — Invitation tokens in org export: drop them entirely (pick) or include as hash?
Inventor pick: drop entirely.
Tokens grant signup access. Including them in a backup creates a vulnerability surface — an exfiltrated backup could be used to sign up as someone-else with their pending invite. Hashing doesn't help because the hash is what the URL contains. The invitation row (recipient, role, expiry, sent_at) is in the export; the token is not. If you need to re-issue, you do so from paliad's invite UI.
Q8 — Public signed-URL downloads (for scheduled/email delivery): yes / not in v1 (pick)?
Inventor pick: not in v1.
Defer to slice 4. v1's download is cookie-authenticated only. Signed URLs are useful when the recipient is asynchronously notified (email link), which is the scheduled-export model — and that whole subsystem ships later.
Q9 — GDPR Art. 15 DSR helper UI: not in v1 (pick)?
Inventor pick: not in v1.
A global_admin can already assemble a DSR manually using project-scope exports filtered by client. v1 ships the primitives; v2 ships the workflow.
Closing question for m: implementer
Recommend pattern-fluent Sonnet for all three slices, same hands across (continuity matters for the writer abstraction). Specific name = your call.
12. m's decisions (addendum, 2026-05-19)
m walked the §11 questions live via AskUserQuestion. Results below — these supersede the inventor picks where they differ.
- Q1 — Bundle format: Bundle xlsx + JSON + CSV in one
.zipper export. ✓ matches pick. - Q2 — Project-scope floor: Any team member (
responsibility ∈ {lead, member}). ⚠ Deviation from associate-floor pick — m chose the looser axis-split gate. Implementation update for §4: project-scope auth becomes(a) can_see_project(root_id) AND (b) caller is on project_teams for the root with responsibility ∈ {lead, member}. The DerivationService profession check is dropped from the export gate; observers + externals + derived-only members still cannot extract.system_audit_log.metadatarecords the responsibility value the caller held at export time. - Q3 — Org-export retention: 90 days. ⚠ Deviation from 7-day pick. Implementation update for §6.2:
PALIAD_EXPORT_RETENTION_DAYSdefault flips from7to90. The cleanup goroutine still runs daily; the threshold is just longer. Audit row unaffected (still persists forever). - Q4 — Date format: ISO 8601 strings only. ✓ matches pick.
- Q5 — paliadin_turns in org export: Never include in org export. ⚠ Tighter than opt-in pick. Implementation update for §2.1 + §6.3: the
paliadin_turnsrow drops from the org-scope sheet table entirely — no?include=paliadin_turnsquery param. Personal scope still carries the caller's own paliadin_turns (it's literally their data). The hard exclusion is enforced inexport_service.go's scope-aware sheet registry, not just in column-discovery, so a future schema addition can't accidentally re-include it. - Q6 — Deterministic exports: Yes. ✓ matches pick. (m answered freeform "1" alongside the batching request — first option = deterministic.)
- Q7 — Invitation tokens: Drop entirely. ✓ matches pick.
- Q8 — Signed URLs in v1: Not in v1. ✓ matches pick.
- Q9 — GDPR DSR helper UI in v1: Not in v1. ✓ matches pick.
Net effect on slice plan: unchanged shape, three modifications:
- Slice 2 gate logic uses
project_teams.responsibilityonly (no profession lookup). - Slice 3 default retention is 90 days (one env-var value change).
- Slice 1 + 3 sheet registry omits
paliadin_turnsfrom org scope entirely.
No other slice deltas. v1 still ships slices 1+2+3.
Coder shift gating: head still gates the implementation handoff; m's decisions here close §11 but don't auto-trigger coder work.
13. Adjacent / out-of-scope
- Import path — explicitly out per brief. A round-trip "export then re-import" is appealing but is its own design (rebinding UUIDs, conflict resolution, schema_version migrations). Don't conflate.
- Postgres replacement — the Excel workbook is a backup + portability artifact, not a data-model alternative. Postgres stays canonical.
- t-paliad-212 (leibniz, CalDAV multi-calendar): personal export already carries the caller's caldav config (minus ciphertext). When leibniz designs multi-calendar, the personal export's
my_caldav_configsheet becomes a list rather than a single row — handled by column-discovery automatically. No design conflict; flagged for confirmation when leibniz's design lands. - t-paliad-213 (mendel, test strategy): export service warrants pure-function tests for column discovery, deny-list, scope predicate, plus one e2e (Playwright) per scope endpoint. Slice tests pin the contract; mendel's overall strategy decides framework choice.
14. References
docs/design-data-model-v2.md— projects + mandanten + ltree path + can_see_project predicate.docs/design-approval-policy-ui-2026-05-07.md— 5-source audit union (this design adds the 6th source).docs/design-profession-vs-project-role-2026-05-07.md— profession ladder for the §4 project gate.internal/handlers/backups.go—handleAdminDownloadBackup(precedent for admin-gated artifact download; the older rule-migration export precedent was removed in t-paliad-297).internal/services/project_service.go:15— visibility predicate.internal/services/derivation_service.go—EffectiveProjectRolefor the project gate.github.com/xuri/excelize/v2— chosen xlsx library.
END OF DESIGN. Status: READY FOR REVIEW.
Inventor parks until m's go/no-go on §11. No code touches the tree from this branch.