PRD: docforge — modular doc-generator engine for paliad + upc-commentary #157

Open
opened 2026-05-29 11:56:03 +00:00 by mAi · 1 comment
Collaborator

Direction (m, 2026-05-29 12:32–13:55)

I think we should modularize our "doc generator" to improve it. There are different aspects to it and we may want to use them for other projects as well.

The result is clear: I want to be able to create and modify word documents, using variables inside the documents, "editing them live" and preview the results, export in the end. We should have all that modular to keep it clean. The editor is something else than the importing, exporting, variable exchange, data fetching etc.

upc-commentary / upc-kommentar would be one later use case. There we also need to work with changing templates and input text from our different format.

Go module is good enough for now.
Doc/docx is the most pressing — we should support export to all in the long run.

Currently I can't upload the base document to insert variables into to create a template — and then later I want to fill the template using data, modifying it manually where necessary, then exporting.

I want to keep the workflow clear and I feel that doing it with modules makes sense to fix single parts and not have different codebases.

Locked constraints (confirmed)

  • Packaging: one Go module (e.g. pkg/docforge — name bikeshed during PRD) with clean sub-packages. Same model as pkg/litigationplanner today.
  • Two consumers in scope: paliad (current submission generator) + upc-commentary (future, lives at UPCommentary/upc-kommentar). Design the resolver / importer abstractions with both in mind from day one.
  • Two distinct user surfaces (m's 2026-05-29 13:55 pick):
    • Authoring: upload a base .docx → place variable slots in it → save as a template. (This is the gap that doesn't work today.)
    • Generation: pick template → bind variables to data → manual edit where needed (live editor with preview) → export.
  • Format priority: .docx first (most pressing). Engine designed to be format-pluggable for .pdf / .html / .md later.
  • Editor stays per-consumer: shared engine + variable wire shape, different editor UI per surface.
  • Variable resolver shape: TBD (inventor judgment — m: "whatever works best"). Lean toward Go interface per namespace (type VariableResolver interface { Resolve(key string) (string, bool) }) for testability + clean separation.
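
A minimal sketch of the "Go interface per namespace" lean, offered as one possible shape rather than the decided design. Only the `VariableResolver` interface comes from the constraint above; `MapResolver`, `Registry`, the dotted `namespace.key` format, and the sample values are hypothetical names invented for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// VariableResolver is the interface shape quoted in the constraint above.
type VariableResolver interface {
	Resolve(key string) (string, bool)
}

// MapResolver is a hypothetical map-backed resolver, handy as a test double.
type MapResolver map[string]string

func (m MapResolver) Resolve(key string) (string, bool) {
	v, ok := m[key]
	return v, ok
}

// Registry routes "namespace.key" lookups to one resolver per namespace.
// The dotted key format is an assumption made for this sketch.
type Registry map[string]VariableResolver

func (r Registry) Resolve(qualified string) (string, bool) {
	ns, key, found := strings.Cut(qualified, ".")
	if !found {
		return "", false
	}
	res, ok := r[ns]
	if !ok {
		return "", false
	}
	return res.Resolve(key)
}

func main() {
	// Sample data is invented; real resolvers would query project/user stores.
	reg := Registry{
		"project": MapResolver{"case_number": "C-1"},
		"user":    MapResolver{"name": "m"},
	}
	for _, k := range []string{"project.case_number", "user.name", "user.missing"} {
		v, ok := reg.Resolve(k)
		fmt.Printf("%s -> %q (found=%t)\n", k, v, ok)
	}
}
```

Because `Registry` itself satisfies `VariableResolver`, a consumer could hand the engine one value while still testing each namespace in isolation.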

Today's state — what to audit

  • internal/services/submission_*.go (~12 files) — current home for vars / merge / md / render / draft / base / section / building_block.
  • internal/handlers/submission_*.go (~5 files) — HTTP surface for the editor.
  • frontend/src/client/submission-draft.ts + submission-draft.tsx — editor UI.
  • Recent wins to preserve through the refactor: tonight's submission-md placeholder-underscore fix (commit b78a984 → main 1b4b2e4), the placeholderRegex contract (submission_merge.go:95), the live-preview-with-click-to-jump (data-var contract), the building-block + section model.
  • Known pain: no upload-and-annotate template authoring flow exists; markdown→OOXML walker has had subtle bugs (last night's underscore-strip); variable definition is hardcoded per resolver function (addProjectVars/addUserVars etc) rather than a clean interface.

Inventor task

Grill in prose FIRST: 3-5 core-metaphor questions before any AskUserQuestion batch. atlas's #152 + edison's #153 debriefs both flagged that structured chips built on a wrongly locked core metaphor cost a doc rewrite.

Prose-grill candidates:

  • Module boundary lines: where exactly does engine end and variables begin? Do importers belong INSIDE engine (since they parse OOXML) or as a sibling? Test of this seam: can a new format adapter (markdown importer, say) be added without touching engine internals?
  • Editor abstraction: is there ANY shared editor surface (a TS lib both consumers import), or are they fully separate UIs talking only to the engine via JSON? Cost of "share" vs "don't share" is real — paliad's section editor isn't upc-commentary's commentary editor.
  • Authoring surface scope: does the upload-and-annotate authoring UI use the same live editor as generation (just with placeholder-insertion mode toggled), or is it a separate annotation tool? Reuse vs purpose-built.
  • Template versioning: when m updates a template, what happens to existing drafts that use it? Snapshot-at-draft-create vs always-latest. Has migration implications.
  • upc-commentary's input format: what's the "different format" m mentioned? Markdown? Structured form? Pre-existing legal-commentary entry shape? The importer's pluggability hinges on this.
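
To make the module-boundary question concrete, here is a hedged sketch of what "importer as a sibling" could look like. Every name in it (`Document`, `Importer`, `Register`, `ImportAs`, `markdownImporter`) is hypothetical, and the paragraph-splitting logic is a toy, not a proposal for the real markdown walker.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Document is a stand-in for a neutral model (hypothetical shape).
type Document struct {
	Paragraphs []string
}

// Importer turns raw input in one format into a Document.
type Importer interface {
	Format() string
	Import(raw []byte) (Document, error)
}

// importers is the registry the engine consults; adapters register themselves.
var importers = map[string]Importer{}

// Register adds an adapter without any change to engine code.
func Register(i Importer) { importers[i.Format()] = i }

// ImportAs is the only entry point the engine needs; it knows no concrete format.
func ImportAs(format string, raw []byte) (Document, error) {
	imp, ok := importers[format]
	if !ok {
		return Document{}, errors.New("no importer registered for " + format)
	}
	return imp.Import(raw)
}

// markdownImporter is a toy adapter: one paragraph per blank-line-separated block.
type markdownImporter struct{}

func (markdownImporter) Format() string { return "markdown" }

func (markdownImporter) Import(raw []byte) (Document, error) {
	var doc Document
	for _, block := range strings.Split(string(raw), "\n\n") {
		if p := strings.TrimSpace(block); p != "" {
			doc.Paragraphs = append(doc.Paragraphs, p)
		}
	}
	return doc, nil
}

func main() {
	Register(markdownImporter{})
	doc, err := ImportAs("markdown", []byte("First paragraph.\n\nSecond paragraph."))
	fmt.Println(len(doc.Paragraphs), err)
}
```

Under this shape the seam test passes if adding an adapter only ever requires a new type plus a `Register` call; if a new format forces edits to `ImportAs` or `Document`, the boundary is in the wrong place.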

Then structured AskUserQuestion batches (4-Q max per batch, ~12-16 total) on hard UX/data/migration decisions.

Draft docs/plans/prd-docforge-2026-05-29.md:

  • §0 Premises (current state, locked constraints, what's in/out)
  • §1 Goals
  • §2 User journeys — Authoring (upload base → annotate → save template) + Generation (pick template → bind data → edit live → preview → export). Include the upc-commentary parallel journey as a §2.x to validate the abstractions.
  • §3 Module shape — package tree, the seams between engine/variables/importer/exporter, the wire contract between Go core and TS editor.
  • §4 Hard decisions table (m's picks + rationale, divergences from inventor recommendations).
  • §5 Data model deltas (templates table? variable_definitions? draft_session?).
  • §6 Migration plan from current internal/services/submission_*.go to the new package. Critical: this is a big-ish refactor of working code — migration plan must protect the in-flight Submission generator AND the recent fixes (last night's underscore fix, building-block model, click-to-jump data-var contract).
  • §7 Slice train (probably 5-8 slices: extract engine to pkg → extract vars to pkg → resolver interface → authoring page → generation refactor → upc-commentary adapter → format plugins).
  • §8 Out of scope (other formats deferred, multi-user editing on same draft, etc).

Report DESIGN READY FOR REVIEW and park persistent. Head gates the coder shift.

Out of scope for the PRD: implementation, mig SQL drafting, code. PRD only.

NOT atlas / cronus / edison (parked with framing bias on procedures). Fresh Opus inventor.

mAi self-assigned this 2026-05-29 11:56:03 +00:00
Author
Collaborator

PRD ready for review — docforge

Design committed: docs/plans/prd-docforge-2026-05-29.md (0918049) on branch mai/leibniz/inventor-prd-docforge.

Status: DESIGN READY FOR REVIEW — awaiting head/m go-no-go on the coder shift.

Premise correction (verified against live repos)

  • UPCommentary/upc-kommentar is Bun/SvelteKit/TS, zero Go — it cannot import a Go pkg/. Per m: out of scope as a live consumer for now; docforge is a pure Go pkg (in-process like pkg/litigationplanner), interfaces sized so an HTTP veneer is addable later.
  • There is no submission-draft.tsx (single .ts bundle); lukasjarosch/go-docx is comment-only — the renderer is pure archive/zip OOXML, no third-party docx dep.
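
For readers unfamiliar with the "pure archive/zip OOXML" approach: a .docx file is a zip archive whose main body lives in `word/document.xml`, so the standard library is enough to reach it. The sketch below is illustrative only and is not the renderer in the repo; `buildDocx` and `readDocumentXML` are hypothetical helpers, and the archive it builds omits the other parts a real .docx needs.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// buildDocx writes a minimal zip containing only word/document.xml.
// A real .docx carries more parts ([Content_Types].xml, relationships, styles).
func buildDocx(documentXML string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.Create("word/document.xml")
	if err != nil {
		return nil, err
	}
	if _, err := io.WriteString(w, documentXML); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readDocumentXML returns the main document part from a .docx archive.
func readDocumentXML(docx []byte) (string, error) {
	zr, err := zip.NewReader(bytes.NewReader(docx), int64(len(docx)))
	if err != nil {
		return "", err
	}
	f, err := zr.Open("word/document.xml")
	if err != nil {
		return "", err
	}
	defer f.Close()
	b, err := io.ReadAll(f)
	return string(b), err
}

func main() {
	docx, err := buildDocx(`<w:document><w:body/></w:document>`)
	if err != nil {
		panic(err)
	}
	xml, err := readDocumentXML(docx)
	fmt.Println(xml, err)
}
```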

m's 13 decisions (5 prose-grill + 8 structured) — all captured in §4

  • Intermediate model yes, but lossless for our .docx → opaque carrier + neutral Document.
  • Authoring = WYSIWYG click-to-insert slots. Input: Markdown primary, foreign docx import later (lossy, flagged).
  • Storage Postgres bytea behind a TemplateStore interface (Supabase Storage = 1-impl swap). Snapshot at draft-create. Extract-in-place migration.
  • Name docforge. New generic tables (docforge owns none — litigationplanner pattern). Extract UI pkg now. Exporter interface now, docx-only impl.

8-slice train (§7)

extract docx engine → neutral model+binding → VariableResolver interface → template store+schema → UI pkg extraction → authoring page → generation on uploaded templates → markdown importer + exporter finalisation. Slices 1-3 are behavior-preserving refactors front-loaded under golden-export + preview-string checks that protect the b78a984 underscore fix, the placeholderRegex + data-var contracts, and the building-block/section model.
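
A rough sketch of what a golden-style check guarding placeholder behavior might look like. Everything here is assumed for illustration: the `{{key}}` syntax and `placeholderRe` are NOT the repo's `placeholderRegex` contract, and `merge` and `checkGolden` are hypothetical helpers. The point is only the shape of the check: merge, then compare against a stored expectation, with underscore-bearing keys in the fixture.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholderRe is a hypothetical pattern for this sketch, not the repo's
// placeholderRegex; it accepts underscores and dots inside keys.
var placeholderRe = regexp.MustCompile(`\{\{([A-Za-z0-9_.]+)\}\}`)

// merge replaces every known placeholder and leaves unknown ones untouched.
func merge(tmpl string, vars map[string]string) string {
	return placeholderRe.ReplaceAllStringFunc(tmpl, func(m string) string {
		key := placeholderRe.FindStringSubmatch(m)[1]
		if v, ok := vars[key]; ok {
			return v
		}
		return m
	})
}

// checkGolden compares output with a stored expectation, as a golden test would.
func checkGolden(got, want string) error {
	if got != want {
		return fmt.Errorf("golden mismatch:\n got: %q\nwant: %q", got, want)
	}
	return nil
}

func main() {
	got := merge("Re: {{project.case_number}} for {{user_name}}, {{unknown_key}}",
		map[string]string{"project.case_number": "C-1", "user_name": "m"})
	want := "Re: C-1 for m, {{unknown_key}}"
	fmt.Println(checkGolden(got, want))
}
```

In a real suite the expected strings would more likely live in fixture files captured from the current generator before slice 1, so that each behavior-preserving refactor is checked against pre-refactor output.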

Coder gate is the head's to call.

Reference: m/paliad#157