Extends the Composer's MD → OOXML walker, per the design at
docs/design-submission-generator-v2-2026-05-26.md §12 Slice D, from
Slice B's baseline (paragraphs plus bold/italic) to the full rich-prose
feature set: headings 1-3, bullet and numbered lists, blockquotes, and
inline hyperlinks.

MD walker (internal/services/submission_md.go, +320 / -75 LoC):
- RenderMarkdownToOOXMLWithStyles is the new Slice-D entry point;
  RenderMarkdownToOOXML stays as a thin back-compat wrapper.
- splitMarkdownBlocks classifies every line into one of:
  paragraph, heading_1/2/3, list_bullet, list_numbered, blockquote.
  CommonMark-style 3-space indent tolerance; "N. " and "N) " for
  numbered. Blank-line spacing semantics preserved from Slice B.
- renderBlockParagraph applies stylemap[blk.styleKey] (falling back
  to stylemap["paragraph"]). List blocks emit visible "• " / "N. "
  prefix runs so the structure surfaces even if Word isn't configured
  with auto-list-numbering; the lawyer can apply a real Word list
  style post-export. Numbered-list ordinals reset on every non-list
  block (so "1. A\nplain\n1. C" renders 1./1., not 1./2.).
- parseInlineRuns adds `[label](url)` recognition. Each link is
  routed through the optional HyperlinkAllocator; the walker emits
  `<w:hyperlink r:id="{rId}">…runs…</w:hyperlink>` with the
  "Hyperlink" character style on each child run. A nil allocator
  falls back to the plain-text label (URL drops, label survives).
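
The line classification described for the MD walker can be sketched
roughly as below. This is a minimal illustration, not the code in
submission_md.go: detectMarkerSketch is a hypothetical name, and
requiring a space after ">" is an assumption that the commit only
shows for the "> quote" form.

```go
package main

import (
	"fmt"
	"strings"
)

// detectMarkerSketch is a hypothetical stand-in for the walker's block
// classifier. It returns the stylemap key, the text after the marker,
// and whether a marker was recognised at all.
func detectMarkerSketch(line string) (kind, payload string, ok bool) {
	// Tolerate up to three leading spaces; four or more is not a marker.
	trimmed := strings.TrimLeft(line, " ")
	if len(line)-len(trimmed) > 3 {
		return "", "", false
	}
	// Headings 1-3: the hashes must be followed by a space.
	for level := 1; level <= 3; level++ {
		prefix := strings.Repeat("#", level) + " "
		if strings.HasPrefix(trimmed, prefix) {
			return fmt.Sprintf("heading_%d", level), strings.TrimSpace(trimmed[len(prefix):]), true
		}
	}
	// Bullets: "- " or "* ".
	if strings.HasPrefix(trimmed, "- ") || strings.HasPrefix(trimmed, "* ") {
		return "list_bullet", strings.TrimSpace(trimmed[2:]), true
	}
	// Blockquote: "> " (assumption: the space is required).
	if strings.HasPrefix(trimmed, "> ") {
		return "blockquote", strings.TrimSpace(trimmed[2:]), true
	}
	// Numbered: one or more digits, then "." or ")", then a space.
	i := 0
	for i < len(trimmed) && trimmed[i] >= '0' && trimmed[i] <= '9' {
		i++
	}
	if i > 0 && i+1 < len(trimmed) &&
		(trimmed[i] == '.' || trimmed[i] == ')') && trimmed[i+1] == ' ' {
		return "list_numbered", strings.TrimSpace(trimmed[i+2:]), true
	}
	return "", "", false
}

func main() {
	for _, line := range []string{"## B", "   # indented", "    # too-deep", "1) paren", "1.no-space", "plain"} {
		kind, payload, ok := detectMarkerSketch(line)
		fmt.Printf("%q -> kind=%q payload=%q ok=%v\n", line, kind, payload, ok)
	}
}
```

Lines that match no marker fall through with ok=false, which a caller
would presumably treat as a plain paragraph.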

Composer (internal/services/submission_compose.go, +130 / -10 LoC):
- composerLinkAllocator hands the walker fresh rIds (rIdComposer1,
  rIdComposer2, …) outside the base's existing namespace; the same
  URL shared across multiple sections dedupes to one rId.
- patchDocumentXMLRels appends matching <Relationship Type="…/hyperlink"
  Target="URL" TargetMode="External"/> entries to
  word/_rels/document.xml.rels. Idempotent on rIds already present;
  synthesizes a fresh rels part when missing (defensive for stripped
  bases). Returns the patched parts slice; the caller must assign it
  back because append may reallocate the backing array (fixed in
  this slice).
- Compose now passes the full stylemap (paragraph + heading_1/2/3 +
  list_bullet + list_numbered + blockquote) into the walker, not
  just the paragraph-style entry.
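
The dedupe-by-URL behaviour of the link allocator can be sketched as
follows. This is an illustration under assumptions, not the code in
submission_compose.go: linkAllocatorSketch and its fields are
hypothetical names, and only the rIdComposer{N} naming scheme is taken
from the description above.

```go
package main

import "fmt"

// linkAllocatorSketch is a hypothetical allocator that hands out one
// relationship id per distinct URL.
type linkAllocatorSketch struct {
	byURL map[string]string // URL -> rId already handed out
	order []string          // URLs in first-seen order, one Relationship row each
}

func newLinkAllocatorSketch() *linkAllocatorSketch {
	return &linkAllocatorSketch{byURL: map[string]string{}}
}

// Allocate returns the existing rId for a known URL, or mints the next
// rIdComposer{N} for a new one.
func (a *linkAllocatorSketch) Allocate(url string) string {
	if rid, ok := a.byURL[url]; ok {
		return rid
	}
	rid := fmt.Sprintf("rIdComposer%d", len(a.order)+1)
	a.byURL[url] = rid
	a.order = append(a.order, url)
	return rid
}

func main() {
	a := newLinkAllocatorSketch()
	fmt.Println(a.Allocate("https://bgh.bund.de"))
	fmt.Println(a.Allocate("https://example.org"))
	fmt.Println(a.Allocate("https://bgh.bund.de")) // same rId as the first call
	fmt.Println(len(a.order), "relationship rows needed")
}
```

Keeping the first-seen order alongside the map means the rels patch
step would have a stable list to iterate when appending Relationship
rows.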

Frontend (frontend/src/client/submission-draft.ts):
- Toolbar adds H1/H2/H3 buttons (formatBlock h1/h2/h3), bullet
  list, numbered list, blockquote, and a link button that prompts
  for a URL and wraps the selection via execCommand("createLink").
- domToMarkdown serializer extends to <h1>/<h2>/<h3>, <ul>/<ol>
  with a per-item ordinal counter for numbered lists, <blockquote>,
  and <a href="…"> → `[label](url)`. Nested <li> handling sits in
  the ul/ol branch.

Tests (internal/services/submission_md_test.go, internal/services/
submission_compose_test.go):
- TestRenderMarkdownToOOXML_Heading1 / _Heading2And3: stylemap
  applied.
- _BulletList / _NumberedList / _NumberedListResetsOnNonList:
  prefixes + ordinal counter.
- _Blockquote: stylemap applied.
- _Hyperlink: allocator called, w:hyperlink rId wired, Hyperlink
  character style on label runs.
- _HyperlinkNilAllocatorFallsBackToPlain: label survives, no
  hyperlink tag emitted.
- TestDetectBlockMarker: 14 marker / non-marker cases.
- TestComposer_HeadingsAndLists: end-to-end through Compose with
  a multi-construct draft; verifies stylemap presence + content +
  ordinal prefixes.
- TestComposer_HyperlinkWiresRels: body has the right
  <w:hyperlink r:id="rIdComposer{N}">, document.xml.rels has the
  matching <Relationship> rows with External target mode.
- TestComposer_HyperlinkDedupesByURL: two `[label](url)` references
  to the same URL share one rId; the second allocation adds no new
  Relationship row.

Build hygiene: go build, go vet, and go test -short are clean (all
packages); bun run build is clean (2906 i18n keys).

NOT in scope (Slice D's brief was rich-prose + toolbar):
- Numbering.xml audit on bases: the current approach emits visible
  "• " / "N. " prefix runs without depending on numbering.xml. A
  future slice can swap to `<w:numPr>` if firm-style auto-numbering
  becomes a hard requirement.
- DOM-from-Markdown on initial editor paint: the editor still uses
  textContent=md, so toolbar-applied formatting reverts to literal
  Markdown text after autosave + repaint. Acceptable trade-off for
  Slice D; a future polish could parse MD into the DOM on paint.
- Tables, images, footnotes (still design §13 out of scope).
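
The visible-prefix approach with a resetting ordinal counter, as
described above, can be sketched like this. The type and function names
are hypothetical, and leaving the counter untouched across bullet items
is an assumption; the commit only states that non-list blocks reset it.

```go
package main

import "fmt"

// blockSketch is a hypothetical classified line: a stylemap key plus text.
type blockSketch struct {
	kind string // e.g. "paragraph", "list_bullet", "list_numbered"
	text string
}

// withVisiblePrefixes renders each block's text with a literal "• " or
// "N. " prefix, so no numbering.xml definition is needed.
func withVisiblePrefixes(blocks []blockSketch) []string {
	out := make([]string, 0, len(blocks))
	ordinal := 0
	for _, b := range blocks {
		switch b.kind {
		case "list_numbered":
			// The counter, not the number typed in the source, drives the prefix.
			ordinal++
			out = append(out, fmt.Sprintf("%d. %s", ordinal, b.text))
		case "list_bullet":
			// Assumption: bullets are list blocks and do not reset the counter.
			out = append(out, "• "+b.text)
		default:
			// Any non-list block ends the current numbered run.
			ordinal = 0
			out = append(out, b.text)
		}
	}
	return out
}

func main() {
	blocks := []blockSketch{
		{"list_numbered", "A"},
		{"list_numbered", "B"},
		{"paragraph", "plain"},
		{"list_numbered", "C"},
	}
	for _, line := range withVisiblePrefixes(blocks) {
		fmt.Println(line)
	}
}
```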

Hard rules honoured:
- NO new migrations (Slice D is pure code).
- NO behavior change for pre-Composer drafts (gate on draft.BaseID
  unchanged).
- {{rule.X}} aliases preserved (placeholders pass through the walker
  verbatim and are substituted by the v1 SubmissionRenderer pass).
- Q2 ratification preserved (no building_block_id lineage).
- Q9 ratification preserved (4-tier BB visibility from Slice C).

t-paliad-316 Slice D
300 lines
10 KiB
Go
package services

// Unit tests for the Composer's Markdown → OOXML walker (t-paliad-313
// Slice B). Pure function; no DB dependency.

import (
	"strings"
	"testing"
)

func TestRenderMarkdownToOOXML_EmptyInput(t *testing.T) {
	out := RenderMarkdownToOOXML("", "Normal")
	if !strings.Contains(out, `<w:p>`) {
		t.Errorf("empty input must still emit one <w:p>; got %q", out)
	}
	if !strings.Contains(out, `<w:pStyle w:val="Normal"/>`) {
		t.Errorf("empty input must carry the paragraph style; got %q", out)
	}
}

func TestRenderMarkdownToOOXML_SingleParagraph(t *testing.T) {
	out := RenderMarkdownToOOXML("Hello world", "HLpat-Body-B0")
	if !strings.Contains(out, `<w:pStyle w:val="HLpat-Body-B0"/>`) {
		t.Errorf("paragraph missing stylemap entry: %q", out)
	}
	if !strings.Contains(out, "Hello world") {
		t.Errorf("paragraph text missing: %q", out)
	}
	// Exactly one <w:p>.
	if got := strings.Count(out, "<w:p>"); got != 1 {
		t.Errorf("expected 1 <w:p>; got %d", got)
	}
}

func TestRenderMarkdownToOOXML_TwoParagraphs(t *testing.T) {
	out := RenderMarkdownToOOXML("first\n\nsecond", "Normal")
	if got := strings.Count(out, "<w:p>"); got != 2 {
		t.Errorf("expected 2 <w:p>; got %d, out=%q", got, out)
	}
	if !strings.Contains(out, "first") || !strings.Contains(out, "second") {
		t.Errorf("paragraph text missing: %q", out)
	}
}

func TestRenderMarkdownToOOXML_BoldInline(t *testing.T) {
	out := RenderMarkdownToOOXML("hello **bold** world", "")
	if !strings.Contains(out, `<w:rPr><w:b/></w:rPr>`) {
		t.Errorf("bold rPr missing: %q", out)
	}
	if !strings.Contains(out, ">bold<") {
		t.Errorf("bold text payload missing: %q", out)
	}
	// The surrounding "hello " and " world" pieces are separate runs;
	// the bold rPr should appear exactly once in this output.
	if got := strings.Count(out, "<w:b/>"); got != 1 {
		t.Errorf("expected exactly one <w:b/> tag; got %d in %q", got, out)
	}
}

func TestRenderMarkdownToOOXML_ItalicInline(t *testing.T) {
	out := RenderMarkdownToOOXML("see *italic* here", "")
	if !strings.Contains(out, `<w:rPr><w:i/></w:rPr>`) {
		t.Errorf("italic rPr missing: %q", out)
	}
	if !strings.Contains(out, ">italic<") {
		t.Errorf("italic text payload missing: %q", out)
	}
}

func TestRenderMarkdownToOOXML_BoldItalicCombo(t *testing.T) {
	// Nested: ***both*** → entering both flags. The walker toggles each
	// delimiter independently, so the resulting run carries both <w:b/>
	// and <w:i/>.
	out := RenderMarkdownToOOXML("***both***", "")
	if !strings.Contains(out, `<w:b/>`) || !strings.Contains(out, `<w:i/>`) {
		t.Errorf("expected both <w:b/> and <w:i/>; got %q", out)
	}
}

func TestRenderMarkdownToOOXML_PlaceholdersPassThrough(t *testing.T) {
	// Placeholders are sacred: the walker must preserve them verbatim
	// so the v1 placeholder pass can substitute them later.
	out := RenderMarkdownToOOXML("Sehr geehrter {{parties.claimant.0.name}}", "Normal")
	if !strings.Contains(out, "{{parties.claimant.0.name}}") {
		t.Errorf("placeholder corrupted: %q", out)
	}
}

func TestRenderMarkdownToOOXML_XMLEscape(t *testing.T) {
	out := RenderMarkdownToOOXML("a & b < c > d", "")
	// A bare " & " in the output means the ampersand was not escaped.
	if strings.Contains(out, " & ") {
		t.Errorf("unescaped & survived: %q", out)
	}
	// All three special characters must appear as XML entities.
	if !strings.Contains(out, "&amp;") || !strings.Contains(out, "&lt;") || !strings.Contains(out, "&gt;") {
		t.Errorf("expected escaped entities; got %q", out)
	}
}

func TestRenderMarkdownToOOXML_BlankLinesPreserveSpacing(t *testing.T) {
	// Two blank lines between paragraphs → one empty paragraph in
	// between, preserving the lawyer's intentional whitespace.
	out := RenderMarkdownToOOXML("first\n\n\nsecond", "Normal")
	if got := strings.Count(out, "<w:p>"); got != 3 {
		t.Errorf("expected 3 <w:p> (first + blank + second); got %d in %q", got, out)
	}
}

func TestRenderMarkdownToOOXML_CRLFNormalisation(t *testing.T) {
	out := RenderMarkdownToOOXML("first\r\n\r\nsecond", "")
	if got := strings.Count(out, "<w:p>"); got != 2 {
		t.Errorf("CRLF input should produce 2 paragraphs; got %d in %q", got, out)
	}
}

func TestParseInlineSpans_Plain(t *testing.T) {
	spans := parseInlineSpans("hello world")
	if len(spans) != 1 || spans[0].Bold || spans[0].Italic || spans[0].Text != "hello world" {
		t.Errorf("expected single plain span; got %+v", spans)
	}
}

func TestParseInlineSpans_UnderscoreItalic(t *testing.T) {
	spans := parseInlineSpans("_emph_")
	var italicHits int
	for _, s := range spans {
		if s.Italic && s.Text == "emph" {
			italicHits++
		}
	}
	if italicHits != 1 {
		t.Errorf("expected one italic 'emph' span; got %+v", spans)
	}
}

func TestParseInlineSpans_UnderscoreBold(t *testing.T) {
	spans := parseInlineSpans("__strong__")
	var boldHits int
	for _, s := range spans {
		if s.Bold && s.Text == "strong" {
			boldHits++
		}
	}
	if boldHits != 1 {
		t.Errorf("expected one bold 'strong' span; got %+v", spans)
	}
}

// ─────────────────────────────────────────────────────────────────────
// Slice D: rich-prose constructs
// ─────────────────────────────────────────────────────────────────────

func slicedStylemap() map[string]string {
	return map[string]string{
		"paragraph":     "Body",
		"heading_1":     "H1",
		"heading_2":     "H2",
		"heading_3":     "H3",
		"list_bullet":   "ListBullet",
		"list_numbered": "ListNumber",
		"blockquote":    "Quote",
	}
}

func TestRenderMarkdownToOOXML_Heading1(t *testing.T) {
	out := RenderMarkdownToOOXMLWithStyles("# A heading", slicedStylemap(), nil)
	if !strings.Contains(out, `<w:pStyle w:val="H1"/>`) {
		t.Errorf("heading_1 missing H1 style: %q", out)
	}
	if !strings.Contains(out, "A heading") {
		t.Errorf("heading text missing: %q", out)
	}
}

func TestRenderMarkdownToOOXML_Heading2And3(t *testing.T) {
	out := RenderMarkdownToOOXMLWithStyles("## H2 line\n### H3 line", slicedStylemap(), nil)
	if !strings.Contains(out, `<w:pStyle w:val="H2"/>`) || !strings.Contains(out, "H2 line") {
		t.Errorf("h2 not rendered: %q", out)
	}
	if !strings.Contains(out, `<w:pStyle w:val="H3"/>`) || !strings.Contains(out, "H3 line") {
		t.Errorf("h3 not rendered: %q", out)
	}
}

func TestRenderMarkdownToOOXML_BulletList(t *testing.T) {
	out := RenderMarkdownToOOXMLWithStyles("- first\n- second\n* third", slicedStylemap(), nil)
	if !strings.Contains(out, `<w:pStyle w:val="ListBullet"/>`) {
		t.Errorf("bullet stylemap not applied: %q", out)
	}
	if strings.Count(out, "• ") != 3 {
		t.Errorf("expected 3 bullet prefixes; got %d in %q", strings.Count(out, "• "), out)
	}
}

func TestRenderMarkdownToOOXML_NumberedList(t *testing.T) {
	out := RenderMarkdownToOOXMLWithStyles("1. first\n2. second\n3. third", slicedStylemap(), nil)
	if !strings.Contains(out, `<w:pStyle w:val="ListNumber"/>`) {
		t.Errorf("numbered stylemap not applied: %q", out)
	}
	for _, want := range []string{"1. ", "2. ", "3. "} {
		if !strings.Contains(out, want) {
			t.Errorf("missing ordinal prefix %q in %q", want, out)
		}
	}
}

func TestRenderMarkdownToOOXML_NumberedListResetsOnNonList(t *testing.T) {
	// "1. A\n2. B\nplain\n1. C" → 1. A, 2. B, plain para, 1. C
	out := RenderMarkdownToOOXMLWithStyles("1. A\n2. B\nplain\n1. C", slicedStylemap(), nil)
	// The plain "plain" line breaks the list, so the next numbered
	// item restarts at 1.
	idxA := strings.Index(out, "1. ")
	if idxA < 0 {
		t.Fatalf("first 1. missing: %q", out)
	}
	idxB := strings.Index(out, "2. ")
	if idxB < 0 || idxB <= idxA {
		t.Fatalf("2. not after 1.: idxA=%d idxB=%d", idxA, idxB)
	}
	rest := out[idxB+1:]
	idxC := strings.Index(rest, "1. ")
	if idxC < 0 {
		t.Errorf("numbered counter didn't reset on non-list block: %q", out)
	}
}

func TestRenderMarkdownToOOXML_Blockquote(t *testing.T) {
	out := RenderMarkdownToOOXMLWithStyles("> the quoted text", slicedStylemap(), nil)
	if !strings.Contains(out, `<w:pStyle w:val="Quote"/>`) {
		t.Errorf("blockquote stylemap not applied: %q", out)
	}
	if !strings.Contains(out, "the quoted text") {
		t.Errorf("blockquote text missing: %q", out)
	}
}

func TestRenderMarkdownToOOXML_Hyperlink(t *testing.T) {
	allocated := map[string]string{}
	alloc := func(url string) string {
		rid := "rIdComposer" + url
		allocated[url] = rid
		return rid
	}
	out := RenderMarkdownToOOXMLWithStyles("See [Bundesgerichtshof](https://bgh.bund.de) for details.", slicedStylemap(), alloc)
	if _, ok := allocated["https://bgh.bund.de"]; !ok {
		t.Errorf("allocator never called for URL: %q", out)
	}
	if !strings.Contains(out, `<w:hyperlink r:id="rIdComposerhttps://bgh.bund.de">`) {
		t.Errorf("hyperlink tag missing or wrong rid: %q", out)
	}
	if !strings.Contains(out, "Bundesgerichtshof") {
		t.Errorf("link label missing: %q", out)
	}
	if !strings.Contains(out, `<w:rStyle w:val="Hyperlink"/>`) {
		t.Errorf("hyperlink character style missing: %q", out)
	}
}

func TestRenderMarkdownToOOXML_HyperlinkNilAllocatorFallsBackToPlain(t *testing.T) {
	out := RenderMarkdownToOOXMLWithStyles("See [BGH](https://bgh.bund.de) here.", slicedStylemap(), nil)
	// Without an allocator, the label still renders as plain text.
	if !strings.Contains(out, "BGH") {
		t.Errorf("label dropped: %q", out)
	}
	if strings.Contains(out, "<w:hyperlink") {
		t.Errorf("hyperlink emitted without allocator: %q", out)
	}
}

func TestDetectBlockMarker(t *testing.T) {
	cases := []struct {
		in   string
		kind string
		want string
		ok   bool
	}{
		{"# A", "heading_1", "A", true},
		{"## B", "heading_2", "B", true},
		{"### C", "heading_3", "C", true},
		{"   # indented", "heading_1", "indented", true}, // up to 3 spaces tolerated
		{"    # too-deep", "", "", false},                // 4 spaces → not a heading
		{"- bullet", "list_bullet", "bullet", true},
		{"* star", "list_bullet", "star", true},
		{"1. one", "list_numbered", "one", true},
		{"42. forty-two", "list_numbered", "forty-two", true},
		{"1) paren", "list_numbered", "paren", true},
		{"1.no-space", "", "", false}, // ordinal needs trailing space
		{"> quote", "blockquote", "quote", true},
		{"plain", "", "", false},
		{"#nospace", "", "", false}, // heading needs space after hash
	}
	for _, tc := range cases {
		t.Run(tc.in, func(t *testing.T) {
			kind, payload, ok := detectBlockMarker(tc.in)
			if ok != tc.ok || kind != tc.kind || payload != tc.want {
				t.Errorf("detectBlockMarker(%q) = (%q,%q,%v); want (%q,%q,%v)", tc.in, kind, payload, ok, tc.kind, tc.want, tc.ok)
			}
		})
	}
}