design(t-paliad-146): Paliadin — in-app AI buddy

Inventor design pass for the Paliadin: a Claude-backed conversational
assistant grounded in the user's own paliad data + paliad's static
reference (courts, glossary, deadline rules, Fristenrechner concept
tree). Long-lived in-process Go service that calls Anthropic's
Messages API directly with tool use; every tool is a thin shim over
an existing service (Dashboard / Project / Deadline / Appointment /
Court / Glossary / DeadlineRule). RLS / visibility inherited from
those services — Paliadin literally cannot see what the caller cannot.

Five coordinated sub-designs answer the issue's 20 open questions:
  A. LLM architecture + tool-use + prompts (§2)
  B. Data access + RLS + PII (§3)
  C. UX (§4)
  D. Token budget + cost + audit (§5)
  E. Phasing (§7)

Phase 1 v1: /paliadin full page + sidebar entry, SSE stream of
Anthropic, 7 read-only tools, session-only history, 30/hour user cap
+ 1000/hour global cap, audit row per turn (metadata only — no
transcript), 4k input + 2k output token caps, no avatar/mascot, no
proactive onboarding. Migration 057 introduces paliadin_turns +
paliadin_rate_limit. Single PR, ~3500-4500 LoC.

mlex / /lex-* reuse: shape (system-prompt voice, tool-catalog idea,
citation style) — NOT code. mLex is a workspace, not a Go/TS repo;
the /lex-* skills drive Claude against youpc's MCP and cannot be
embedded in a paliad service.

Premise verifications surfaced one CLAUDE.md doc-bug (the
ANTHROPIC_API_KEY "Reserved for Phase H — do not set" row needs to
flip in the implementation PR — Paliadin un-defers it).

12 open questions for m in §8.5 — Anthropic key choice (personal vs
HLC enterprise), default model (Sonnet vs Haiku), surface
(/paliadin page vs drawer), mascot phase, 2-PA sanity check before
locking scope, etc. Same adoption-risk concern that just parked
t-paliad-145 — Paliadin's edge over open-Claude-in-another-tab is
data grounding, which only works if v1 makes it visible (citation
chips + tool-call evidence + tagline).

STOP after design. Awaiting m go/no-go before coder shift.
m
2026-05-07 20:45:31 +02:00
parent 99f08e3863
commit dc7c807725


@@ -0,0 +1,773 @@
# Design: Paliadin — in-app AI buddy / pet (t-paliad-146)
**Status:** READY FOR REVIEW
**Author:** noether (inventor)
**Issue:** [m/paliad#9](https://mgit.msbls.de/m/paliad/issues/9)
**Date:** 2026-05-07
**Branch:** `mai/noether/inventor-paliadin-in-app`
---
## §0 TL;DR
A new conversational surface inside paliad: **Paliadin**, a Claude-backed assistant that answers questions grounded in the user's own paliad data and paliad's domain knowledge. The Paliadin is a long-lived in-process Go service, not a per-session worker spawn: it talks to the Anthropic Messages API directly with **tool use**, where every tool is a thin shim over an existing paliad service (DashboardService, ProjectService, DeadlineService, CourtService, GlossaryService, DeadlineRuleService, AgendaService). RLS / visibility is enforced at the service layer, exactly as it is for the rest of the app, so Paliadin literally cannot see what the caller cannot see.
Phase 1 surface: **dedicated `/paliadin` page + a sidebar entry under "Übersicht"**, server-side SSE stream of Anthropic's response (same shape paliad's parked t-145 chat design specced), session-only conversation (no DB persistence in v1), 7 read-only tools, ~30 turns/hour rate limit per user, hard token caps (4 k input + 2 k output per turn), per-request audit row (no full transcript in v1; store a redacted hash + token counts + tool-call list).
**No avatar, no mascot SVG, no proactive onboarding popup in v1.** Just a clean chat panel with the name "Paliadin" in the header. Mascot, drawer mode, persistent threads, write-tools, and youpc.org case-law lookup are all deferred to Phase 2/3.
**mlex / `/lex-*` reuse: pattern, not code.** mLex turns out to be a *workspace* (`extractions/`, `analysis/`, `docs/`); there is no Go/TS code to fork. The `/lex-*` skills are Claude Code instruction docs that drive *Claude itself* against youpc's MCP tools; they cannot be embedded in a paliad Go service. What carries over is the **shape**: tool catalog (search → fetch → cite), system-prompt voice (precise, citation-backed, flag uncertainty honestly), and the "every legal claim needs a citation" guardrail. §2.4 maps the carry-over precisely.
**Trade-off flagged up-front (read §9.1 before approving):** the same adoption-risk concern that just parked the local-chat design (t-paliad-145, today 17:03) applies here. Paliadin's edge over "open ChatGPT in another tab" is *only* that it sees the user's own data, and that edge collapses if v1 doesn't make the data-grounding visible (citation chips, tool-call evidence) and explicit ("Paliadin sees only YOUR projects"). Without those, Paliadin is just a worse Claude. With them, it's the only Claude that can answer "welche Frist ist als nächstes auf dem Müller-Verfahren?".
---
## §1 Premises verified live (2026-05-07)
Before designing on top, I checked each load-bearing claim against the running system rather than CLAUDE.md / memory.
| Claim | Source | Verification |
|---|---|---|
| **mLex is a workspace, not a code repo** | issue framing "mlex project we could partially reuse" | `~/dev/mLex/` contains only `extractions/`, `analysis/`, `docs/`, plus `CLAUDE.md` + `AGENTS.md`. No `*.go`, no `package.json`, no tools that aren't Claude skills. The "code" is the `/lex-*` skill family in `~/.claude/skills/`, which is instruction docs driving Claude against `mcp__youpc__*` MCP tools. **Carry-over is shape (system prompt, tool catalog, citation style), not adapters.** |
| `/lex-*` skill family | brief reference | `~/.claude/skills/{lex-research,lex-extract,lex-classify,lex-classify-patent,mai-lexy}/SKILL.md`. All five inventoried in §2.4. |
| Paliad has no anthropic / claude code | CLAUDE.md `ANTHROPIC_API_KEY` "do not set" row | `grep -ri anthropic ~/dev/paliad/internal ~/dev/paliad/cmd` → only `internal/branding/firm.go` comment unrelated to AI. `go.mod` has no `anthropic-sdk-go` dep. **This task un-defers the env var; CLAUDE.md row needs updating in the same PR.** |
| Paliad has no SSE pattern shipped | substrate scan | `grep -rn 'http.Flusher\|text/event-stream' internal/` returns only references inside the parked t-145 chat design doc; there is no live code. We bring our own. |
| Paliad and youpc share the same physical Postgres | infra | Both run on `100.99.98.201:11833` (port 11833 = ydb). Paliad's schema is `paliad`; youpc's is `data`. **A future "search UPC case law" tool would be a same-DB cross-schema SELECT, not an HTTP hop**, but Phase 1 still excludes case-law lookup (see §3). |
| Visibility is enforced at service layer (not via SET LOCAL auth.uid) | code | `internal/services/visibility.go` defines `visibilityPredicate(alias)` + `visibilityPredicatePositional(alias, idx)`; every project-scoped query inlines it. Paliadin's tools call existing services, inheriting the predicate. |
| `paliad.can_see_project()` is the canonical visibility function in DB (RLS, t-139) | t-139 migration 055 | `internal/db/migrations/055_hierarchy_aggregation.up.sql:144` `CREATE OR REPLACE FUNCTION paliad.can_see_project(_project_id uuid)`. Same predicate echoed in `services/visibility.go`. |
| Migration tracker is at 56 (`056_user_views`) | t-144 A1 | `paliad_schema_migrations` row. Next migration is **057**. (t-145 was parked before its `057_chat` shipped, so 057 is open.) |
| t-paliad-145 (local chat) was parked today 2026-05-07 17:03 | memory + commit log | Commit `99f08e3` "Merge: t-paliad-145 design doc only — local chat feature PARKED per m's call". The chat SSE substrate that would have been shared is **not** built, so Paliadin builds its own minimal stream. |
| Sidebar bell pattern (`sidebar-inbox-badge`) is reusable for a chat-style entry | t-138 | `frontend/src/components/Sidebar.tsx`: `navItem(href, icon, i18nKey, label, currentPath, badgeID?)` already takes an optional badge id. The same plumbing fits a Paliadin entry. |
| Sidebar `ICON_SPARKLE` already exists | UI scan | `frontend/src/components/Sidebar.tsx` defines `ICON_SPARKLE` (a star/sparkle SVG). Free icon for the Paliadin nav item. |
| `auth.UserIDFromContext(r.Context())` is the standard handler-side user lookup | code | `internal/handlers/dashboard.go:31` is the canonical pattern. Paliadin handlers will use it. |
| `branding.Name` (default "HLC") is the firm-name source | t-paliad-065 | `internal/branding/firm.go` reads `FIRM_NAME` once at boot. Paliadin's system prompt + greeting must use `branding.Name`, never hardcode "HLC". |
| Single web replica on Dokploy today | `docker-compose.yml` | One `web` service. SSE state in-process is fine for v1; multi-replica migration deferred along with chat. |
**Doc-vs-live conflicts encountered (must be fixed in the implementation PR):**
1. **CLAUDE.md** still says `ANTHROPIC_API_KEY` is "Reserved for Phase H (AI Frist-Extraktion) which is deferred per m's 2026-04-16 decision. Do not set." Paliadin un-defers it. The CLAUDE.md row needs to flip to "Required for Paliadin (read-only Claude assistant), set on Dokploy."
2. The earlier "do not want anthropic API" decision (memory `b6a11b55…`, 2026-04-16) was specifically about *Frist extraction from documents*. Paliadin is a different surface (interactive read-only Q&A over already-structured data). It does not silently revive the parked extraction feature; t-paliad-011 stays blocked unless m explicitly un-parks it too.
---
## §2 Sub-design A — LLM architecture, prompt, tool use, mlex/lex reuse
Answers Q1, Q2, Q3, Q4, Q17, Q18.
### 2.1 LLM provider (Q1)
**Recommendation: Anthropic Claude, single provider, accessed directly via the Messages API. Lock to Claude in v1; abstract behind a one-function interface so future portability is cheap.**
| Provider | v1? | Why |
|---|---|---|
| Anthropic Claude (Messages API + tool use) | ✅ | Matches m's "wire into my claude" framing. Tool-use shape is mature. Streaming via SSE is native. Paliad already has `ANTHROPIC_API_KEY` reserved. |
| Mixed (Claude reasoning + smaller routing model) | ❌ | Premature optimisation; for ~30 turns/hour/user we don't need the routing layer. Single-model latency is fine. |
| OpenAI / open weight | ❌ | No HLC compliance review for those vendors; m's Anthropic key is on file. |
**Model selection within Anthropic:** default to **Claude Sonnet 4.6** (fast, tool-use-capable, cheap enough for chat use). Allow override via `PALIADIN_MODEL` env var so we can drop down to Haiku for cost or up to Opus for tricky onboarding sessions without redeploying.
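A minimal sketch of the `PALIADIN_MODEL` override, assuming the value is read through an injectable lookup so it can be tested without touching the process environment. `modelFromEnv` and `defaultModel` are hypothetical names, and the default string is a placeholder rather than a real Anthropic model identifier.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// defaultModel is a placeholder, NOT a real Anthropic model ID; the
// implementation PR would pin the actual Sonnet identifier here.
const defaultModel = "sonnet-placeholder-id"

// modelFromEnv returns the PALIADIN_MODEL override when it is set and
// non-blank, otherwise the compiled-in default. The getenv parameter is
// an assumption made for testability; production would pass os.Getenv.
func modelFromEnv(getenv func(string) string) string {
	if m := strings.TrimSpace(getenv("PALIADIN_MODEL")); m != "" {
		return m
	}
	return defaultModel
}

func main() {
	fmt.Println("model:", modelFromEnv(os.Getenv))
}
```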
**Wire shape:** one Go HTTP client (`internal/services/paliadin/anthropic.go`) that POSTs `/v1/messages` with `stream: true`. We do not adopt `github.com/anthropics/anthropic-sdk-go` in v1: the API surface we use (one streaming POST + tool-use loop) is small enough that a hand-rolled client is shorter than wiring the SDK and safer than depending on a Go SDK that has historically broken on minor version bumps in mAi's experience. Keep the option open for Phase 2 if the token-accounting / structured tool-use helpers in the SDK become attractive.
```go
// internal/services/paliadin/anthropic.go
type AnthropicClient interface {
	Stream(ctx context.Context, req MessagesRequest, w StreamWriter) (Usage, error)
}
```
The interface is the only swap-point. Switching providers later means a new implementation, not a rewrite.
### 2.2 System prompt + message shape (Q2)
**Recommendation: single `system` prompt with paliad context + tool definitions; one persistent prompt across pages (no per-route system prompts in v1).**
#### 2.2.1 System prompt (locked, v1)
The system prompt is rendered from `branding.Name`, the user's locale (DE/EN), the user's `display_name`, the current date, and the visible-project count (a single count, not the project list, which keeps the prompt small). It is computed *per request*, not once per process, because the date, locale, and user vary; only its template is a constant.
```
You are Paliadin, an AI assistant inside {{firm}}'s patent practice
platform "Paliad". You help {{display_name}} ({{office}}) answer
questions about their own work in Paliad and about UPC / EPO / DPMA
patent practice.

Today is {{today}}. The user's display language is {{language}}; reply
in {{language}} unless the user switches mid-conversation.

You have read-only access to the following tools:
- whats_on_my_plate — the user's dashboard (deadline / appointment / matter buckets)
- list_my_projects — every project the user can see
- get_project_detail — full detail of one project (deadlines, appointments, parties, partner units)
- search_my_deadlines — filter the user's deadlines by status / date / project
- list_my_appointments — the user's upcoming appointments (next 30 days by default)
- lookup_court — Paliad's catalog of patent courts (UPC LDs, German LGs/OLGs/BGH, EPO, DPMA, ...)
- lookup_glossary_term — Paliad's bilingual patent glossary
- lookup_deadline_rule — Paliad's Fristenrechner concept tree (named deadline rules + their triggers)

Hard rules:
1. Never invent facts. If a tool returns nothing, say so. Do not guess
   case numbers, deadline dates, court names, or party names.
2. Every concrete factual claim about the user's work MUST come from a
   tool call in the current conversation. Cite using "[#deadline-XXXX]",
   "[#projekt-XXXX]", "[court: Munich LD]", "[glossary: Klageerwiderung]"
   so the UI can render citation chips.
3. You cannot mutate any data. If the user asks you to change something,
   explain that v1 is read-only and point them to the right page in
   Paliad.
4. Visibility is enforced before tools return — if your tool call comes
   back empty, the data either doesn't exist OR the user can't see it.
   Never disclose the latter; just answer "I couldn't find anything
   matching that".
5. You cannot answer questions about other users' projects, even if the
   user names them.
6. Respect the user's role. If the user has global_role=standard, do not
   speculate about admin-only functions.

Style:
- Direct, professional, slightly warm. Lawyer-adjacent.
- Reply in Markdown. Use lists, code blocks, blockquotes.
- Cite specifically (case numbers, dates, court names) — never "around
  the 14th".
- When uncertain, flag it. ("I don't see a deadline matching that
  description on the projects you can access.")
- No emojis unless the user uses one first.

You are NOT:
- A code-writing assistant
- A replacement for legal advice
- A web search
```
This is ~250 input tokens — well under the budget.
#### 2.2.2 Per-message envelope
The browser POSTs to `/api/paliadin/turn` with `{ session_id, user_message, history }`, where `history` is the prior turns *in the current session only* (session = browser tab; localStorage backs it). The server prepends the system prompt and runs the tool-use loop.
#### 2.2.3 Tool use vs RAG-only (Q2 secondary)
**Tool use, not RAG.** RAG (vector search over chunks of paliad content) is the wrong shape for this surface — paliad data is highly structured, the most useful answers come from filtered SQL queries (e.g. "all deadlines on my projects with `status='pending'` and `due_date<=now()+7d`"), and a vector store would just paraphrase what an SQL query returns more accurately. Tools give the model the same query power the user has, with hard visibility gates. Phase 2 may add RAG over a small static corpus (HL Patents Style guide, Paliadin docs) if onboarding queries don't get good answers from glossary lookups alone.
### 2.3 Long-lived service vs lexy-style worker spawn (Q4)
**Recommendation: long-lived Go service (in-process), *not* a per-session Claude Code worker.**
| Option | Latency to first token | Cost / turn | Operational shape |
|---|---|---|---|
| In-process Go service calling Anthropic API directly | < 1 s (just network + queueing) | Pay only for the model tokens we use | Single binary, single Postgres conn, scales with paliad |
| `mai hire paliadin` per session (Claude Code worker) | 5–15 s | Worker startup overhead × N concurrent sessions × Claude Code's own context overhead | Operational footprint of running a worker per active user: dozens of tmux panes, tasks, reports |
The lexy / cassandra worker pattern works because it's *batch*: classify N judgments, emit JSON, exit. A chat surface needs sub-second response times across dozens of HLC users in parallel. A Claude-Code-per-session pattern would give each user their own Claude in the loop, with all the tooling and message-bus scaffolding that implies: the wrong scale of abstraction.
**That said, two things from the worker pattern do carry over:**
1. **System-prompt voice.** The lexy / mai-lexy SKILL.md persona ("Sharp, analytical, direct. Cites provisions and case law naturally. Flags uncertainty honestly.") is the right voice for Paliadin. We borrow it; see §2.2.1.
2. **Tool catalog shape.** The lex-research SKILL.md tool list (search → fetch full text → enrich → analyse → cite) maps cleanly onto Paliadin's read tools; see §3.
### 2.4 mlex / `/lex-*` carry-over map (Q3, Q18)
**Inventory result, with the shape-vs-code split called out for each:**
| Skill / asset | What it does | Carry-over to Paliadin |
|---|---|---|
| `~/dev/mLex/` (workspace) | `extractions/` (per-case JSON), `analysis/` (markdown reports), `docs/` (legal references), `extractions/queue.json` | **None as code.** Workspace artifacts are the *output* of the skills; they don't give us anything embeddable. |
| `lex-research` skill | UPC case law search → analysis → report. Tool catalog: `mcp__supabase__execute_sql`, `mcp__youpc__*`, `mcp__youpc-memory__*`. Output format: structured markdown with citation tables. | **Voice + tool-catalog shape.** "Search → enrich → analyse → cite" is the Paliadin flow. The skill's output-format conventions (case number on first mention, division comparison tables) seed the system prompt's style guidance. |
| `lex-extract` skill | Read full judgment text → structured holdings / principles / interpretations JSON. | **Not v1.** Phase 2 candidate iff Paliadin gets an `extract_judgment(node_id)` write tool; orthogonal to read-only v1. |
| `lex-classify` skill | Classify judgments against a 47-leaf taxonomy. | **Not v1.** Same as above: write-surface, batch-shaped, irrelevant to interactive Q&A. |
| `lex-classify-patent` skill | Classify patents into IPC technology sectors via Anthropic. | **Pattern reference only.** It's already an Anthropic-backed pipeline, so its prompt structure is a working example we can crib from for the system-prompt template, but the actual classification target is paliad-irrelevant. |
| `mai-lexy` skill | Lawyer persona that orchestrates the above. "Citation-backed, flags uncertainty." | **Voice template.** The persona text is the closest thing to a working Paliadin system prompt; §2.2.1 borrows directly from it. |
| `claude-api` skill | Anthropic SDK / Messages API patterns + prompt caching guidance. | **Implementation reference for the Go client + caching strategy.** §6.4 picks up its prompt caching guidance. |
**Anti-reuse:** the `mcp__youpc__*` MCP tools that `lex-research` uses are designed for an interactive Claude Code session. Paliadin's tools must instead be Go service calls: same data shape, different transport. Don't try to embed an MCP client in a paliad Go process; rebuild the same SQL queries against the same Postgres directly.
### 2.5 Tool catalog v1 (Q17)
Seven read-only tools. Each is a thin Go shim around an existing service; each enforces visibility through that service's existing `visibilityPredicate`.
| Tool name | Backing service / method | Inputs | Output (truncated to fit budget) |
|---|---|---|---|
| `whats_on_my_plate` | `DashboardService.Get(userID)` | none | `{deadline_summary, appointment_summary, matter_summary, upcoming_deadlines[≤10], upcoming_appointments[≤10], recent_activity[≤10]}` |
| `list_my_projects` | `ProjectService.ListVisible(userID, filter)` | optional `{status, kind}` | `[{id, kind, label, status, parent_id, path}]`, paged 25 |
| `get_project_detail` | `ProjectService.Get(userID, id) + DeadlineService.ListByProject + AppointmentService.ListByProject + PartyService.ListByProject + DerivationService.AttachedUnits` | `{project_id}` | `{project, deadlines[≤25], appointments[≤25], parties[≤10], partner_units[≤5]}`; 404 if user can't see it (LLM gets a clean "not found", same response as truly missing) |
| `search_my_deadlines` | new helper on `DeadlineService` (reuses `visibilityPredicate`) | `{q?, status?, project_id?, due_after?, due_before?, limit≤25}` | `[{id, title, due_date, status, project_label, court}]` |
| `list_my_appointments` | new helper on `AppointmentService` | `{from, to, project_id?}` | `[{id, title, start_at, end_at, location, project_label}]` |
| `lookup_court` | `CourtService.Search(q)` (firm-wide; no visibility filter, since courts are reference data) | `{q}` | `[{slug, name, country, kind, address, vacation_periods[≤4]}]`, truncated to 10 |
| `lookup_glossary_term` | static JSON loader (`internal/handlers/glossary.go` data) | `{q, lang?}` | `[{de, en, definition, category}]`, top 5 |
| `lookup_deadline_rule` | `DeadlineRuleService.SearchConcept(q)` | `{q}` | `[{rule_code, concept_label, trigger_event, deadline_text, legal_source}]`, top 5 |
**Bumped out of v1 (Phase 2 candidates):**
- `list_my_pending_approvals` (the inbox bell payload): useful but adds RLS surface; let v1 stabilise first.
- `search_youpc_case_law`: m's framing example, but cross-schema means a bigger blast radius. Phase 2 once Paliadin proves its weight on paliad-internal data.
- `search_my_audit_log`: high signal but PII heavy.
- `compute_frist`: would invoke the existing `DeadlineCalculator`. Useful but the user can already do this on `/tools/fristenrechner`; defer until we see queries that actually want it.
- All write tools (`create_deadline`, `attach_partner_unit`, etc.): Phase 3 minimum, with hard confirmation gate (see §6).
### 2.6 The tool-use loop (Q2 tertiary)
Standard Anthropic tool-use loop:
```
1. Build messages = [system, ...history, user_message]
2. POST /v1/messages with tools=[...catalog]
3. Stream assistant reply chunks → relay to client SSE
4. If stop_reason == "tool_use":
     for each tool_use block:
       execute tool(input) on the matching Go service
       emit tool_result block back into messages
     goto 2 (with the same stream/SSE connection)
5. If stop_reason == "end_turn": close stream
```
**Hard cap on the loop:** 5 tool-call rounds per turn. After 5 rounds without `end_turn`, force-close with "Sorry, I got stuck, try rephrasing." Hitting the cap is a UI red flag we want to see in audit (see §6.3).
---
## §3 Sub-design B — Data access, RLS, PII
Answers Q5, Q6, Q7.
### 3.1 Knowledge sources for v1 (Q5)
**Recommendation: paliad-internal data + paliad's static reference data ONLY. youpc.org case law deferred to Phase 2.**
| Source | v1 | Reason |
|---|---|---|
| **Per-user paliad data** (deadlines, appointments, projects, parties, partner units, attached units) | ✅ | The whole point of Paliadin. Visibility enforced via `visibilityPredicate` (every backing service already does this; tool inherits it). |
| **Static reference data** in paliad (court catalog t-122, glossary, deadline rules, Fristenrechner concept tree) | ✅ | Firm-wide, no per-user gating, low blast radius. |
| **UPC case law** (youpc Postgres `data.judgments`, `data.judgment_markdown_content`) | Phase 2 | Cross-schema SELECT is technically trivial (same Postgres) but: (a) inflates the v1 surface; (b) brings in 1700+ judgments, which raises a scaling RAG/full-text question; (c) m's framing called out research as a *use case*, not a v1 must-have. Ship paliad-internal Q&A first; layer case-law on once the substrate is proven. |
| **HL Patents Style guide / Paliad onboarding docs** | Phase 2 | No internal corpus exists yet; would need docs-authoring + indexing. The `lookup_glossary_term` tool already covers the most common onboarding question shape ("was bedeutet X?"). |
| **External web search** | ❌ | Out of scope; Paliadin is a *grounded* assistant, not a web surfer. m can use the regular Claude for that. |
**Ranking inside the v1 set (when Paliadin has to choose):**
1. User-data tools first when the question references "my", "the case", "the deadline", or names a project / case number that resolves.
2. Static reference next when the question is conceptual ("what's a Klageerwiderung?", "which court is the Munich LD?").
3. Combine when both apply ("when is my Klageerwiderung due?" → `lookup_deadline_rule` for the rule + `search_my_deadlines` for the user's instance).
The system prompt names tools in this priority order; the model's tool-selection follows.
### 3.2 Auth / visibility boundary (Q6)
**The gate:** every backing service already runs `visibilityPredicate(alias)` against the caller's UUID. The Paliadin tool shim is a 5-line wrapper that calls the service with `userID` derived from `auth.UserIDFromContext(r.Context())` at the SSE handler boundary. There is no service-role escape: the shim simply has no other UUID to pass in.
**Belt-and-braces:** every tool result is inspected for `project_id` columns; for each distinct `project_id`, the shim asserts `paliad.can_see_project(_project_id)` returns `true`. (Defence-in-depth: catches any future service-layer regression where someone forgets the predicate. The cost is one extra cheap function call per distinct `project_id` in a tool turn.)
**The "tell, don't disclose" rule (§2.2.1 hard-rule 4):** if the user names a project they cannot see, the tool returns `{error: "not found"}`, the same response as for a project that doesn't exist. The system prompt instructs the model to say "I couldn't find anything matching that" without distinguishing the two cases. This is the same rule the t-144 ViewService already applies.
**Cross-user PII in tool outputs:** tool outputs may legitimately contain other users' display names (e.g. project teams, deadline assignees). These are visible to the caller through the regular UI already, so disclosing them through Paliadin is no worse. We do NOT redact them.
**Approval / partner-unit derivation:** `get_project_detail` returns the derived team (per t-139 `DerivationService.AttachedUnits`). Same predicate as the rest of the app.
### 3.3 PII handling, retention, encryption (Q7)
**v1 stance: minimum viable persistence, maximum auditability of the access pattern.**
| Data | Stored where | Retention | Encryption | Notes |
|---|---|---|---|---|
| Conversation history (the actual messages) | **Browser localStorage only.** Cleared on browser data wipe / reload-with-fresh-session. | Session only | n/a | Phase 2: opt-in DB persistence with retention controls. |
| Per-request audit row | New `paliad.paliadin_turns` table | Forever (matches audit-log pattern; soft-delete only) | At-rest by Postgres / Supabase volume encryption | Stores: `turn_id, user_id, started_at, finished_at, model, input_tokens, output_tokens, tool_calls (jsonb of tool names + arg hashes, NOT arg values), prompt_hash (sha256 of redacted user message), error_code`. **No prompt body, no completion body.** |
| Tool-call inputs (e.g. project_id arguments) | Hashed (sha256) into the audit row's `tool_calls` jsonb | Forever | n/a | The hash is enough to detect "this user kept asking about project X" patterns without storing the readable id. |
| Anthropic API request/response bodies | **Not stored.** Streamed through the Go service straight to the SSE writer. | n/a | TLS in flight | Anthropic's own retention is governed by the org's API contract; pulling Paliad onto an existing HLC enterprise key would inherit that. |
**Why this shape:**
- **Compliance-lite v1.** HLC's compliance team has not yet weighed in on AI-mediated PII (memory says the Phase H decision was "we don't want anthropic API for a while"). Storing the full transcript opens a retention/disclosure question we don't need to answer to ship Paliadin's MVP. The audit-metadata row is enough to demonstrate: (a) who used it, (b) how often, (c) what tools they triggered, (d) cost.
- **Phase 2 transcript persistence** would add a `paliadin_messages` table (turn_id FK, role, content, redact_marks jsonb) and a per-user setting "keep my history". Default off.
- **Why no PII redaction in the user prompt?** v1 is opt-in (the user typed the prompt). Redacting client names / case numbers in the audit hash would defeat the point; we redact by *not storing the prompt*, only its hash.
**The Anthropic side:** if HLC's enterprise contract forbids vendor-side retention, the Go client must set `metadata: {user_id: "<hash>"}` and ensure the API call is on an org with zero-retention guarantees. **Open question for m: which Anthropic key are we using, m's personal key (existing `ANTHROPIC_API_KEY` precedent in mAi/youpcms) or a new HLC enterprise key?** This is the single biggest compliance question; see §9.2.
---
## §4 Sub-design C — UX
Answers Q8, Q9, Q10, Q11, Q12.
### 4.1 Surface placement (Q8)
**Recommendation (counter to brief): start with a dedicated `/paliadin` full-page route + a sidebar entry under the "Übersicht" group. Defer the right-drawer to Phase 2.**
| Option | v1? | Why |
|---|---|---|
| **`/paliadin` full page** + sidebar entry | ✅ | Lowest CSS risk; mobile-responsive for free (paliad's existing breakpoints work); easy to test via Playwright; matches paliad's "every feature is a top-level page" pattern; no z-index / overlay debugging. |
| Right-drawer slide-out from any page | Phase 2 | Pretty, matches m's "panel docked into UI" framing, but adds: drawer toggle wiring on all 30 pages, scroll-lock interaction, focus management, mobile small-screen fallback. Not worth the v1 surface area. Phase 2 wraps the same `/paliadin` UI in a slide-out container. |
| Floating bottom-right bubble | ❌ | Clippy comparison is *visual*, not *positional*. A floating overlay on every page collides with the BottomNav on mobile (already 5/5 slots) and the inbox bell on desktop. |
| Page-embedded panel on `/paliadin` only | ✅ | This *is* the v1 recommendation, just framed differently. |
**Sidebar entry:**
```
Übersicht
  Start
  Agenda
  Inbox 🛎
  Paliadin ✨ ← new, ICON_SPARKLE
```
Group placement under Übersicht (not under Tools or Wissen) because Paliadin is conversation about *the user's work*, not a knowledge tool.
**Mobile:** Paliadin is reachable via the sidebar drawer (existing mobile pattern). No BottomNav slot: those are full, and the ranking (Start / Projekte / + / Agenda / Menü) is more important than a chat shortcut for v1.
### 4.2 Avatar / personality (Q9)
**Recommendation: no avatar SVG in v1. Just a chat panel with the name "Paliadin" in the header. Mascot is Phase 2.**
Why:
- Mascot design is a real design exercise (3–4 iterations to get something that doesn't read as kitsch in a law firm). Not inventor's call to bash one out in a v1 ship.
- The brand cue (lime-green `#c6f41c` accent) is enough to make Paliadin feel like part of paliad without a character.
- Paliadin's *personality* lives in the system prompt (§2.2.1), not in pixels. Voice carries the buddy framing; mascot makes it visual but isn't load-bearing.
What we ship in v1 instead:
- Header: "✨ Paliadin" (sparkle icon + name) above the chat panel.
- Empty-state prompt: "Was kann ich für dich tun?" (DE) / "How can I help?" (EN).
- One-line tagline under the header: "Ich kenne deine Akten und Paliads Wissensbasis." (DE) / "I know your matters and Paliad's knowledge base." (EN). This is the *only* v1 affordance that explicitly tells the user "I see your data", which makes it load-bearing for the differentiation argument in §9.1.
**Phase 2 mascot brief (for when m green-lights it):** small SVG, friendly, lime-green primary, no eyes-darting / animated-on-idle (creepy), modular pose set so it can react to "thinking" / "found it" / "stuck" without being an MMORPG pet.
### 4.3 Onboarding hint (Q10)
**Recommendation: silent-until-invoked. No proactive popup, no first-run modal, no toast.**
Why:
- Paliad already has a polished onboarding flow (t-paliad-034). Adding a Paliadin popup on top would be the kind of "surprise the user" affordance that erodes trust the first time it misfires.
- The empty-state inside `/paliadin` itself is the right onboarding surface: 3 starter-prompt buttons rendered when the chat is empty.
**Three starter prompts (DE primary):**
1. "Was steht heute an?" → triggers `whats_on_my_plate`
2. "Welche Fristen sind diese Woche fällig?" → triggers `search_my_deadlines` with `due_before=now()+7d`
3. "Erkläre mir Klageerwiderung." → triggers `lookup_glossary_term` + `lookup_deadline_rule`
EN equivalents: "What's on my plate?" / "Which deadlines are due this week?" / "Explain Klageerwiderung."
Picking one from the row sends it as if the user typed it. Keeps the surface zero-weight when ignored.
**Phase 2 candidate:** post-onboarding email / inbox card "Paliadin ist live, frag ihn was deine Daten dir sagen." Driven by the existing reminder/email substrate. Out of v1 scope.
### 4.4 Action chips in responses (Q11)
**Recommendation: action chips parsed from a simple inline syntax in the model's reply, rendered client-side, NOT a tool the model invokes.**
Why simple syntax over a tool: tool invocations cost a round-trip; we want the model to "suggest" an action without paying for an extra tool turn. The model emits a structured marker in its prose; the frontend client parses it and renders a chip below the bubble.
**Marker format:**
```
[#deadline-OPEN:c47bd2]
[#projekt-OPEN:slug-x]
[#frist-OPEN:c47bd2]
[#termin-OPEN:abc123]
[chip:nav:/projects/abc-123] (for arbitrary navigation)
[chip:filter:status=pending&due=this_week] (for parameterised inbox links)
```
The system prompt teaches the model to emit chips when navigation or filtering would help the user act on the answer. Each marker resolves to one chip, rendered as:
```
┌──────────────────────────────────────┐
│ Frist 16.05.2026 fällt morgen. │
│ [Frist öffnen] [Akte ansehen] │
└──────────────────────────────────────┘
```
**Client parser** (`frontend/src/client/paliadin.ts`): regex over the streamed text, replaces marker with a button. Buttons are real `<a>` elements (Cmd-click works, keyboard works), styled like the existing `.entity-table` row chips.
**Why not let the model embed full URLs?** Two reasons:
1. URLs change (we renamed `/akten` → `/projekte` mid-project). Markers are stable; we resolve them at render time.
2. Hallucinated URLs are a real risk. If the model can only emit a marker tied to an id we *know* it just retrieved, the chip can't navigate to a fake page.
### 4.5 Streaming + interruption (Q12)
**Recommendation: SSE stream from `/api/paliadin/stream`, client EventSource, user-initiated abort via "Stop" button.**
#### 4.5.1 Stream shape
Mirrors Anthropic's native streaming events, adapted for our SSE consumer:
```
event: meta
data: {"turn_id":"01H…","model":"claude-sonnet-4-6"}
event: content_delta
data: {"text":"Auf der Akte Müller…"}
event: tool_call
data: {"name":"search_my_deadlines","args_hash":"…","status":"running"}
event: tool_result
data: {"name":"search_my_deadlines","status":"ok","summary":"3 results"}
event: content_delta
data: {"text":"… ist die Klageerwiderung am 16.05. fällig."}
event: chip
data: {"kind":"deadline","action":"open","id":"c47bd2"}
event: end
data: {"input_tokens":342,"output_tokens":88,"tool_calls":1}
# heartbeat every 25 s to keep Traefik from reaping
event: ping
data: {}
```
The `tool_call` / `tool_result` events are visible in the UI as small dim "ran search_my_deadlines (3 results)" lines under the bubble: the **citation evidence** that distinguishes Paliadin from a generic chatbot. (Direct quote from the §0 framing: "the differentiation collapses if v1 doesn't make the data-grounding visible.")
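A minimal Go sketch of how the server side could render the frames shown above. The helper `formatSSE` and the `toolCallEvent` struct are hypothetical names introduced for illustration; only the event names and payload fields come from the stream shape in this section.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallEvent mirrors the tool_call payload from the stream shape.
type toolCallEvent struct {
	Name     string `json:"name"`
	ArgsHash string `json:"args_hash"`
	Status   string `json:"status"`
}

// formatSSE renders one Server-Sent Events frame: an "event:" line, a
// single "data:" line holding the JSON payload, and the blank line that
// terminates the frame. json.Marshal never emits raw newlines, so one
// data line is always enough.
func formatSSE(event string, payload any) (string, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return "", fmt.Errorf("marshal %s payload: %w", event, err)
	}
	return fmt.Sprintf("event: %s\ndata: %s\n\n", event, body), nil
}
```

The handler would write each returned frame to the response and flush it; the 25 s heartbeat is the same call with the event name `ping` and an empty struct as payload.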
#### 4.5.2 Interruption
- "Stop" button next to the input. Click → `EventSource.close()` + `fetch('/api/paliadin/stream/{turn_id}/abort', {method:'POST'})`.
- Server abort closes the upstream Anthropic request via context cancellation.
- Stopped turns still write an audit row with `error_code='user_aborted'` so we see how often users hit it.
#### 4.5.3 Reconnect
Same Last-Event-ID resume pattern the t-145 chat design specced. Server keeps the in-flight stream buffered for 30 s after disconnect; reconnect within that window replays missed events. After 30 s, the turn is considered done, and a reconnect arrives at the start of a fresh session.
---
## §5 Sub-design D — Token budget, cost, audit
Answers Q13, Q14, Q15, Q16.
### 5.1 Per-request token cap (Q13)
**Recommendation: `max_input_tokens=4000` (model's view of input including system + history + tool defs + user msg) and `max_tokens=2000` (model's max output), same as brief. Hard-fail above; soft-truncate history below.**
Rationale:
- A typical paliad data tool result is < 500 tokens (truncated lists, capped at 25 rows). Even with system prompt (~250) + tool defs (~600) + 5 prior turns (~600 each on average) the input stays well under 4 k.
- If the conversation runs long (~8+ turns), the client/server soft-truncates history (drops oldest user/assistant pairs first) before sending. The user sees an "Earlier in this conversation, we discussed X (truncated)" pseudo-system message. Cleaner than failing the turn.
- Hard cap at 6 k input tokens: over that, refuse the turn with "Conversation too long, start a new one." Defends against jailbreak attempts that try to balloon the prompt.
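The soft-truncate and hard-cap rules above could look roughly like this. `message`, `truncateHistory`, and the pre-computed `Tokens` estimate are assumptions for illustration; how tokens are actually counted is not specified in this design.

```go
package main

// message is a simplified chat message; Tokens is a pre-computed estimate.
type message struct {
	Role   string // "user" or "assistant"
	Text   string
	Tokens int
}

// truncateHistory drops the oldest user/assistant pairs until fixedTokens
// (system prompt + tool defs + new user message) plus the remaining history
// fits softCap. It returns the kept tail and how many messages were dropped.
// ok is false when the request still exceeds hardCap and must be refused.
func truncateHistory(history []message, fixedTokens, softCap, hardCap int) (kept []message, dropped int, ok bool) {
	if fixedTokens > hardCap {
		return nil, len(history), false
	}
	total := fixedTokens
	for _, m := range history {
		total += m.Tokens
	}
	start := 0
	// Remove two messages at a time so a user turn never loses its reply.
	for total > softCap && start+2 <= len(history) {
		total -= history[start].Tokens + history[start+1].Tokens
		start += 2
	}
	return history[start:], start, total <= hardCap
}
```

With `softCap=4000` and `hardCap=6000`, a non-zero `dropped` is the signal to show the "(truncated)" pseudo-system message, and `ok == false` maps to the "Conversation too long" refusal.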
**Cost math at Sonnet 4.6 per-turn typical (3 k input, 1 k output):** ~$0.012/turn. At 30 turns/hour/user × 38 onboarded HLC users × 5 working hours/day = ~5 700 turns/day = **~$70/day worst case**. Realistic load is probably 10× lower. Phase 2: prompt caching (§5.4) drops it further.
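The worst-case arithmetic can be reproduced with a few constants. Every input is the design's own estimate (including the ~$0.012 per-turn figure, taken as given here); amounts are kept in millionths of a dollar to avoid floating-point rounding.

```go
package main

// Inputs are the design's own estimates, not measured values.
const (
	perTurnMicroUSD = 12_000 // ~$0.012 per typical turn, in millionths of a dollar
	userCapPerHour  = 30
	onboardedUsers  = 38
	workingHours    = 5
)

// worstCaseTurnsPerDay assumes every user hits the cap in every working hour.
func worstCaseTurnsPerDay() int {
	return userCapPerHour * onboardedUsers * workingHours
}

// costMicroUSD converts a turn count into millionths of a dollar.
func costMicroUSD(turns int) int {
	return turns * perTurnMicroUSD
}
```

`worstCaseTurnsPerDay()` gives 5 700 turns and `costMicroUSD(5700)` gives 68 400 000, i.e. $68.40, which the text rounds to ~$70. Note that 30 × 38 = 1 140 turns/hour is above the 1 000/hour global ceiling in §5.3, so in practice that ceiling would bind first.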
### 5.2 Conversation history persistence (Q14)
**Recommendation: session-only in v1. Persistent threads in Phase 2.**
| Option | v1? | Why |
|---|---|---|
| Session-only (browser localStorage, cleared on Sign Out) | ✅ | Zero schema. Zero retention question. Aligns with §3.3 "minimum viable persistence." Lets us ship paliadin without compliance review of stored transcripts. |
| Persistent threads (DB-stored, named) | Phase 2 | Real schema (`paliadin_threads`, `paliadin_messages`), retention policy, cross-device sync, "delete my history" UX, possibly opt-in toggle. None of which is needed to validate "is Paliadin actually useful". |
**Edge case: page reload during a conversation.** localStorage persists the history *for that browser* (it is shared by every tab on the same origin). Closing and reopening the tab restores. Closing the browser & reopening also restores. Sign-out clears. Multi-device = different histories. We're explicit about this in the panel header: "Conversation lives in this browser only" tooltip.
**Why opt for slightly worse UX over the easy schema work:** the t-paliad-145 chat just got parked over an *adoption*-risk concern, not a schema concern. Paliadin should ship the smallest possible footprint that proves usefulness. Persistent threads can be a "you asked for this" Phase 2.
### 5.3 Rate limit per user (Q15)
**Recommendation: 30 turns/hour/user (slightly tighter than the brief's 50). Plus a global ceiling of 1 000 turns/hour across the firm. Both configurable.**
Per-user 30/hour because:
- 30/hour ≈ one turn every two minutes during sustained use. That's heavy use. A reasonable user asks 3-5 questions in a session.
- Soft hint at 25 ("you've used 25 of 30 messages this hour"), hard block at 30 with retry-after.
- Lower than 50 to give us a safety margin for runaway cost in week 1; we can raise it once we see real usage.
Global 1 000/hour ceiling because:
- Global cap = circuit breaker against the long tail (a script that sends 1000 turns/hour from one user we missed in the per-user cap, or a developer bug).
- 1 000 turns × ~$0.012 = $12/hour worst case = $288/day. We tolerate that for a day; we'd notice and tune.
**Storage:** simple Postgres `paliad.paliadin_rate_limit` table with `(user_id, hour_bucket, turn_count)` upserted on every turn start. No Redis, no extra dependency. Fast at this scale.
**Admin override:** global_admin can lift their own cap (they typically test things). Surface this in the audit row, not in a CLI.
### 5.4 Audit + logging (Q16)
**Recommendation: every turn writes a metadata-only row to `paliad.paliadin_turns`. Full transcripts are NOT stored in v1. Tool-call args are hashed. Anthropic vendor side is governed by org-level retention.**
#### 5.4.1 Schema (migration 057)
```sql
CREATE TABLE paliad.paliadin_turns (
    turn_id            uuid PRIMARY KEY,
    user_id            uuid NOT NULL REFERENCES paliad.users(id),
    session_id         text NOT NULL,               -- browser session, opaque
    started_at         timestamptz NOT NULL DEFAULT now(),
    finished_at        timestamptz,                 -- NULL until end-of-turn
    model              text NOT NULL,               -- e.g. 'claude-sonnet-4-6'
    input_tokens       int,                         -- from Anthropic usage block
    output_tokens      int,
    tool_calls         jsonb NOT NULL DEFAULT '[]', -- [{name, args_hash, status, latency_ms}]
    prompt_hash        text,                        -- sha256 of user_message after PII redaction (best effort)
    response_hash      text,                        -- sha256 of full response (citation only, not stored)
    chip_count         int NOT NULL DEFAULT 0,
    error_code         text,                        -- NULL on success; 'user_aborted', 'rate_limited', 'token_cap', 'tool_loop_cap', 'upstream_error'
    estimated_cost_usd numeric(10, 6)               -- for ops dashboards
);

CREATE INDEX paliadin_turns_user_started_idx
    ON paliad.paliadin_turns(user_id, started_at DESC);
CREATE INDEX paliadin_turns_started_idx
    ON paliad.paliadin_turns(started_at DESC);

ALTER TABLE paliad.paliadin_turns ENABLE ROW LEVEL SECURITY;

-- User sees their own; global_admin sees all.
CREATE POLICY paliadin_turns_select
    ON paliad.paliadin_turns FOR SELECT
    USING (
        user_id = auth.uid()
        OR EXISTS (SELECT 1 FROM paliad.users u
                   WHERE u.id = auth.uid() AND u.global_role = 'global_admin')
    );

-- Service-role (paliad backend) writes; no user-direct INSERT.
-- (Paliad uses service-role conn, so policies on writes are inert,
-- but we still ENABLE RLS so future direct-auth callers are gated.)
```
Rate-limit table also lives in this migration:
```sql
CREATE TABLE paliad.paliadin_rate_limit (
    user_id     uuid NOT NULL REFERENCES paliad.users(id),
    hour_bucket timestamptz NOT NULL,
    turn_count  int NOT NULL DEFAULT 0,
    PRIMARY KEY (user_id, hour_bucket)
);
```
#### 5.4.2 What we DON'T store (v1)
- The user's actual prompt text. Only `prompt_hash`.
- The model's actual response text. Only `response_hash`.
- The tool inputs. Only `tool_calls[].args_hash`.
**Phase 2 transcript persistence** unlocks all three: deliberately a separate migration so the compliance review sits at *that* boundary.
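For the `args_hash` field, a minimal sketch assuming the arguments arrive as a decoded JSON object. `hashArgs` is a hypothetical helper; the design only says that tool inputs are hashed, not how.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// hashArgs returns a hex SHA-256 over the JSON encoding of a tool call's
// arguments. encoding/json writes map keys in sorted order, so equal
// argument maps hash identically regardless of insertion order.
func hashArgs(args map[string]any) (string, error) {
	canonical, err := json.Marshal(args)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(canonical)
	return hex.EncodeToString(sum[:]), nil
}
```

One caveat worth weighing during review: an unkeyed hash of short, guessable arguments (a status filter, a project slug) can be reversed by enumeration, so a keyed hash such as HMAC may be the safer variant if the hashes are meant to hide content rather than just deduplicate it.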
#### 5.4.3 Vendor retention
The Anthropic side is governed by the org-level contract. **Open question for m (§9.2):** does HLC have an enterprise / zero-retention agreement, or are we using m's personal key (matches existing `ANTHROPIC_API_KEY` precedent in mAi/youpcms)? The answer changes whether v1 needs a "data sent to Anthropic" disclosure on first use.
#### 5.4.4 Prompt caching (Phase 2)
The Anthropic API supports prompt caching for repeated system prompts + tool definitions. Our system prompt + 7 tool defs is ~850 tokens: perfect cache target. Phase 2: enable cache_control on the system block; cuts input cost by ~90% on repeat turns within the 5-minute cache window. Skip in v1 to keep the client minimal; pick up after the API surface stabilises.
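A sketch of what the Phase 2 request body could look like with `cache_control` on the system block. The struct and function names are hypothetical and the tool definitions are omitted for brevity; the `system` array-of-blocks form and the `{"type": "ephemeral"}` marker follow the Messages API. Anthropic documents a minimum cacheable prompt length per model, so whether ~850 tokens clears it is worth checking before relying on the ~90% figure.

```go
package main

import "encoding/json"

type cacheControl struct {
	Type string `json:"type"`
}

// systemBlock is one entry of the "system" array form of the Messages API.
type systemBlock struct {
	Type         string        `json:"type"`
	Text         string        `json:"text"`
	CacheControl *cacheControl `json:"cache_control,omitempty"`
}

type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type messagesRequest struct {
	Model     string        `json:"model"`
	MaxTokens int           `json:"max_tokens"`
	System    []systemBlock `json:"system"`
	Messages  []chatMessage `json:"messages"`
}

// buildRequest assembles a single-turn request body; when cache is true the
// system block is marked as a cache breakpoint.
func buildRequest(model, systemPrompt, userText string, maxTokens int, cache bool) ([]byte, error) {
	block := systemBlock{Type: "text", Text: systemPrompt}
	if cache {
		block.CacheControl = &cacheControl{Type: "ephemeral"}
	}
	return json.Marshal(messagesRequest{
		Model:     model,
		MaxTokens: maxTokens,
		System:    []systemBlock{block},
		Messages:  []chatMessage{{Role: "user", Content: userText}},
	})
}
```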
---
## §6 Schema, endpoints, files
### 6.1 New endpoints
| Method | Path | Purpose | Auth |
|---|---|---|---|
| `POST` | `/api/paliadin/turn` | Initiate a turn: assigns `turn_id`, opens SSE | logged-in (302 to /login otherwise) |
| `GET` | `/api/paliadin/stream/{turn_id}` | SSE stream of the turn's response (mostly invoked from the same `POST` to keep the connection live; separate GET supports reconnect) | logged-in |
| `POST` | `/api/paliadin/stream/{turn_id}/abort` | User cancels mid-turn | logged-in, must own the turn |
| `GET` | `/api/paliadin/limits` | Returns `{used_this_hour, hourly_cap, global_cap, global_used}` | logged-in |
| `GET` | `/paliadin` | The page shell (server-renders the panel + initial empty state) | logged-in |
| `GET` | `/admin/paliadin` | Per-user usage / cost dashboard | global_admin |
The `POST /api/paliadin/turn` returns `{turn_id, sse_url}`; the client opens an `EventSource` on `sse_url`. Two-step keeps the POST cheap for telemetry / audit row creation, while the long-lived stream lives on a GET that's safe to retry / resume.
### 6.2 New / extended services
| File | Status | Purpose |
|---|---|---|
| `internal/services/paliadin/service.go` | NEW | The orchestrator: run loop, history truncation, rate-limit check, audit-row writer |
| `internal/services/paliadin/anthropic.go` | NEW | Hand-rolled Messages API client (POST `/v1/messages`, stream parser) |
| `internal/services/paliadin/tools.go` | NEW | Tool catalog declaration + dispatch into existing services |
| `internal/services/paliadin/prompt.go` | NEW | System prompt template + per-turn assembly |
| `internal/handlers/paliadin.go` | NEW | HTTP / SSE handlers |
| `internal/services/deadline_service.go` | extend | Add `SearchVisible(userID, q, status, projectID, dueAfter, dueBefore, limit)` (currently search is only on the global Fristenrechner matview) |
| `internal/services/appointment_service.go` | extend | Add `ListVisibleInWindow(userID, from, to, projectID)` |
| `internal/services/glossary_service.go` | NEW (or refactor of glossary handler data load) | A real service so the tool can call it; today it lives inline in the handler |
### 6.3 Frontend
| File | Status | Purpose |
|---|---|---|
| `frontend/src/paliadin.tsx` | NEW | Page shell |
| `frontend/src/client/paliadin.ts` | NEW | Chat panel, EventSource, history serialise to localStorage, chip parser, "Stop" button |
| `frontend/src/styles/global.css` | extend | New CSS section: `.paliadin-panel`, `.paliadin-bubble`, `.paliadin-bubble--user/--assistant/--tool`, `.paliadin-chip`, `.paliadin-input`, `.paliadin-meta` |
| `frontend/src/components/Sidebar.tsx` | extend | Add Paliadin navItem to the Übersicht group with `ICON_SPARKLE` |
| `frontend/src/i18n-keys.ts` | extend | ~25 new keys: `paliadin.title`, `paliadin.tagline`, `paliadin.starter.*`, `paliadin.empty`, `paliadin.input.placeholder`, `paliadin.stop`, `paliadin.rate_limited`, `paliadin.error.*` |
### 6.4 Migration 057
```
057_paliadin.up.sql:
- paliad.paliadin_turns (audit row, RLS, indexes)
- paliad.paliadin_rate_limit (counter table, PK on user+hour)
- GRANTs: service-role full, anon read disallowed by RLS
057_paliadin.down.sql: drop both tables.
```
### 6.5 Env vars (add to CLAUDE.md table)
| Variable | Required | Purpose |
|---|---|---|
| `ANTHROPIC_API_KEY` | for Paliadin | Anthropic Messages API key. **Replaces** the "do not set" row that referred to the parked Phase H. Without it, `/paliadin` returns 503 (server still boots; the rest of paliad keeps working). |
| `PALIADIN_MODEL` | optional (default `claude-sonnet-4-6`) | Override model for tuning / fallback to Haiku for cost or Opus for accuracy without redeploying. |
| `PALIADIN_HOURLY_CAP` | optional (default `30`) | Per-user turn cap per hour. |
| `PALIADIN_GLOBAL_HOURLY_CAP` | optional (default `1000`) | Firm-wide turn cap per hour. |
| `PALIADIN_MAX_INPUT_TOKENS` | optional (default `4000`) | Soft cap; over this we truncate history. |
| `PALIADIN_MAX_OUTPUT_TOKENS` | optional (default `2000`) | Hard cap; passed straight to Anthropic. |
The Service must boot **without** `ANTHROPIC_API_KEY` (return 503 on `/paliadin*` routes; rest of paliad keeps working). Same pattern as `DATABASE_URL` and `CALDAV_ENCRYPTION_KEY`.
---
## §7 Sub-design E — Phasing
Answers Q19, Q20.
### 7.1 Phase 1 (v1) — confirmed scope
**Single coherent slice that proves the value proposition end-to-end.**
| Item | In v1 |
|---|---|
| `/paliadin` page + sidebar entry under Übersicht | ✅ |
| Migration 057 (`paliadin_turns` + `paliadin_rate_limit`) | ✅ |
| Anthropic client (hand-rolled, streaming) | ✅ |
| 7 read-only tools | ✅ |
| System prompt with `branding.Name` + visibility rules | ✅ |
| SSE stream with `meta`/`content_delta`/`tool_call`/`tool_result`/`chip`/`end`/`ping` events | ✅ |
| Citation chips (parsed from inline markers) | ✅ |
| Rate limiting (per-user + global) | ✅ |
| Audit row per turn (metadata only, no transcript) | ✅ |
| Session-only history (browser localStorage) | ✅ |
| 3 starter prompts in DE+EN | ✅ |
| Token caps + soft history truncation | ✅ |
| `/admin/paliadin` cost dashboard (global_admin only) | ✅ |
| ~25 i18n keys (DE+EN) | ✅ |
| Mobile responsiveness (uses sidebar drawer like every other page) | ✅ |
| CLAUDE.md update flipping the `ANTHROPIC_API_KEY` row | ✅ |
**Estimated scope:** ~3 500-4 500 LoC for the bundled v1 ship. Comparable to t-144 (Custom Views) and t-145's would-have-been chat slice.
**Single PR or split?** Recommend **single PR** for v1. The Anthropic client + tool dispatch + handler + frontend panel are too tightly coupled to ship one without the others: every component is on the critical path of "demonstrate Paliadin actually works". Splitting buys nothing review-wise (no reviewer can validate "Anthropic client works" without "the tool dispatch that exercises it"). Use the same single-PR pattern as t-144 A1+A2 in retrospect.
### 7.2 Phase 2 candidates (post-v1, prioritised)
In rough order of value:
1. **Persistent threads** + per-user "keep my history" toggle. Adds `paliadin_threads` + `paliadin_messages` tables, retention policy, cross-device sync. Compliance review attaches here, not to v1.
2. **Prompt caching** for system prompt + tool defs. ~90 % input-cost reduction on repeat turns. Pure server-side change.
3. **`search_youpc_case_law` tool.** Cross-schema SELECT into `data.judgments` + `data.judgment_markdown_content`. Returns case number, division, date, headnote, top 3 holdings. The "research assistant" use case from m's framing.
4. **Right-drawer mode.** Wrap the `/paliadin` panel in a slide-out container; toggle on every page from a header button.
5. **Mascot SVG** + idle / thinking / found-it pose set. Real visual design pass.
6. **Onboarding tip**: post-onboarding inbox card or one-time toast on first dashboard visit after Paliadin lands.
7. **`list_my_pending_approvals` tool.** Wraps inbox bell payload.
8. **Voice input / output.** Web Speech API (paliad already has the substrate from the no-Voice-v1 t-paliad-042 PWA).
### 7.3 Phase 3 candidates (validate first)
- **Write tools.** `create_deadline`, `create_appointment`, `attach_partner_unit`, `add_party`. Each behind a hard confirmation gate ("Paliadin will create a deadline 16.05. on project X, confirm? [Yes / No]"). Audit-row marks these as mutating turns. Heavy compliance question; not Phase 2.
- **Per-deadline / per-termin micro-threads.** Long-lived per-entity Q&A. Plumbing collision with the (parked) chat design; re-evaluate when chat unparks.
- **Proactive Paliadin.** Push tips when the user hits a known confused state ("You've been on /tools/fristenrechner for 8 minutes, want me to walk you through it?"). Powerful, but creepy if poorly tuned.
- **Compliance-aware redaction layer.** Strip client names from the prompt before it leaves the building, swap stable hashes back in client-side. Big project; only sensible if HLC compliance forbids vendor-side PII.
---
## §8 Risks, mitigations, open questions
### 8.1 Adoption risk (the §0 callout, expanded)
**The risk:** Paliadin competes with three things HLC already has:
1. The user's own Claude / ChatGPT in another tab (for general patent-practice questions).
2. "Ask a colleague on Teams" (for paliad-specific questions about how to use the app).
3. Just clicking around the UI (for "what's on my plate today").
Paliadin's edge over (1) is data grounding. Edge over (2) is 24/7 + privacy. Edge over (3) is conversational discovery and answering one-shot natural-language queries that the structured UI doesn't expose.
**The risk realised:** if v1 doesn't make the data-grounding visible (citation chips, tool-call evidence under each bubble, the tagline "I see your data"), users default to ChatGPT for everything, and Paliadin becomes a ghost feature that ate 3 weeks of build. Same pattern that just parked t-paliad-145.
**Mitigations baked into v1:**
- **Tool-call evidence visible** in every bubble. The user *sees* "ran search_my_deadlines (3 results)": instant differentiation from a generic chatbot.
- **Citation chips** make answers actionable, not just informative.
- **Tagline + empty state** explicitly say "I see your projects."
- **Three starter prompts** demonstrate the data-grounding immediately on first use.
**Mitigations m should consider before approving:**
- **Sanity-check with two PA colleagues** before locking v1 scope. Same recommendation t-145 got. If two PAs say "I'd just open Claude in another tab", the scope shifts toward making the data-grounding *more* prominent (e.g. ship "Paliadin sees only your data" as a persistent banner above the input, not a tooltip) before shipping at all.
- **Soft launch + telemetry.** v1's audit row gives us cheap measurement of: (a) total turns/day, (b) turns per user, (c) tool-call frequency (low = Paliadin is being used like ChatGPT, defeating the differentiation). Watch for two weeks; if tool-calls/turn < 1.5 average, the feature isn't doing what we shipped it for and Phase 2 priorities change.
### 8.2 Compliance / vendor-data risk
**The risk:** sending client names + case content to Anthropic's API may not be sanctioned by HLC IT/compliance. The 2026-04-16 "we don't want anthropic API for a while" decision (memory `b6a11b55…`) was about *Frist extraction from documents*; Paliadin is conversational, but the data envelope sent to Anthropic still contains PII whenever a tool returns a project name.
**Mitigations:**
- **HLC enterprise key** (vs m's personal key): if available, gives org-level retention + DPA coverage.
- **Zero-retention configuration** on the Anthropic call (`metadata: {user_id: "<hash>"}`, `cache_control` only on the system block, no `eval` enrolment).
- **First-use disclosure** in the panel: "Your messages and the data Paliadin retrieves on your behalf are sent to Anthropic. [Learn more]". Load-bearing and required if the legal answer to §9.2 is "personal key, not enterprise".
- **Phase 2 hardening:** server-side redaction layer that swaps client names → stable hashes before the API call, restores them client-side after. Big project; only sensible if compliance forbids vendor-side PII.
### 8.3 Rate-limit / runaway-cost risk
**The risk:** a user (or a bug) loops fast enough to drain budget before alarms fire.
**Mitigations:**
- Per-user 30/hour + global 1 000/hour caps (§5.3). Both surfaced on `/admin/paliadin`.
- Per-turn token cap (§5.1).
- Per-turn tool-loop cap (≤ 5 rounds, §2.6).
- Audit row written *before* the upstream call so a rate-limit-evading bug still leaves traces.
- `PALIADIN_HOURLY_CAP` / `PALIADIN_GLOBAL_HOURLY_CAP` are env-var configurable so we can tighten without a deploy.
### 8.4 Hallucination risk (model invents a deadline)
**The risk:** the model fabricates a deadline date / case number that doesn't exist in the user's data.
**Mitigations:**
- Hard rule in system prompt: "Every concrete factual claim about the user's work MUST come from a tool call in the current conversation."
- Citation markers tied to tool-result IDs only. Marker `#deadline-OPEN:c47bd2` resolves only if the id was returned by a real tool call this turn (frontend validates).
- Tool-call-evidence visibility: the user can see that a tool ran and what it returned. Hallucination becomes obvious because the chip says "0 results" but the bubble claims a deadline.
- **Phase 2:** server-side post-hoc validation that checks every cited id against the tool-result set; reject the message and retry if the model invented one.
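The post-hoc validation in the last bullet could be sketched as follows, assuming the entity markers keep the `[#kind-OPEN:id]` shape from §4.4 and that ids consist of letters, digits, `_` and `-`. The function names and the `returned` set are hypothetical; the `[chip:…]` navigation markers are left out of this sketch.

```go
package main

import "regexp"

// markerRe matches the entity markers from §4.4, e.g. [#deadline-OPEN:c47bd2].
var markerRe = regexp.MustCompile(`\[#(deadline|projekt|frist|termin)-OPEN:([A-Za-z0-9_-]+)\]`)

// citedID is one marker found in the model's reply.
type citedID struct {
	Kind string
	ID   string
}

// extractMarkers returns every entity marker in reply, in order of appearance.
func extractMarkers(reply string) []citedID {
	var out []citedID
	for _, m := range markerRe.FindAllStringSubmatch(reply, -1) {
		out = append(out, citedID{Kind: m[1], ID: m[2]})
	}
	return out
}

// unknownCitations lists cited ids that no tool call returned this turn.
// A non-empty result means the reply should be rejected and retried.
func unknownCitations(reply string, returned map[string]bool) []citedID {
	var bad []citedID
	for _, c := range extractMarkers(reply) {
		if !returned[c.ID] {
			bad = append(bad, c)
		}
	}
	return bad
}
```

The `returned` set would be filled by the tool dispatcher as each tool result comes back, so the check needs no extra database round-trip.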
### 8.5 Open questions for m (please decide before coder shift)
1. **QA:** Anthropic key: m's personal key (existing pattern, fast) or HLC enterprise key (compliant, slower setup)? §3.3 + §8.2.
2. **QB:** First-use disclosure required? Yes if (QA = personal key) OR if compliance hasn't reviewed.
3. **QC:** Default model: Sonnet 4.6 (recommendation) or Haiku 4.5 (cheaper)? Sonnet's tool-use quality is a meaningful step up; Haiku is fine for "what's on my plate" but weaker on multi-tool conversations.
4. **QD:** Sanity-check with two PAs before locking scope? (Same recommendation that just parked t-145.) If yes, this is the gate before any coder shift starts.
5. **QE:** Surface: confirm `/paliadin` full page + sidebar entry, drawer deferred? Or push for drawer in v1?
6. **QF:** Mascot: defer to Phase 2 (recommendation), or commission an inventor-separate design doc now so we can ship Paliadin with the visual identity?
7. **QG:** Starter prompts: are the three I picked the right entry points, or are there better DE-first one-liners that map to common HLC PA queries?
8. **QH:** Should Paliadin know `branding.Name` of the firm in its system prompt? Recommendation: yes (warmer voice, "in HLC's patent practice platform"). Risk: if `FIRM_NAME` rotates, prompt rotates with it; cache invalidates. Acceptable.
9. **QI:** Per-user 30/hour cap: too low? Too high? Easy to tune later, but worth a sanity check.
10. **QJ:** youpc case-law lookup tool: keep it firmly in Phase 2, or fast-track if HL research is high-value?
11. **QK:** Audit row retention: forever (current recommendation, matches audit-log pattern), or a fixed window (e.g. 90 days for cost rows, forever for compliance-relevant)?
12. **QL:** Default language: auto-detect from user `locale` (`paliad.users.locale` is a known pref), or follow the user's last-message language? Recommendation: start in user's locale; switch on first non-locale user message.
---
## §9 What this design does NOT cover (deliberately)
- **The implementation.** This is a design pass; coder shift writes the code. No commits beyond this doc on the inventor branch.
- **Mascot visual design.** Phase 2; deserves its own design pass (and probably a designer's eye, not an inventor's).
- **HL Patents Style guide ingestion.** Out of v1; Phase 2 RAG candidate.
- **Voice input / TTS output.** Phase 2.
- **Multi-user collaboration (e.g. share a paliadin chat).** Out of scope; users have their own visibility, and joint chat is a chat-feature shape (parked).
- **Offline mode.** Paliadin is online-only by definition (it calls Anthropic). The PWA service worker should NOT cache `/paliadin` responses.
- **The renaming question.** "Paliadin" is m's name. Locked.
---
## §10 Recommended implementer
Same recommendation as t-145: **noether, or a fresh coder Sonnet that has noether's substrate context.** NOT cronus per the standing memory directive on paliad.
Why:
- Substrate touchpoints are the same set the chat design covered: `visibilityPredicate`, `auth.UserIDFromContext`, sidebar entry pattern, migration tracker discipline, Dashboard/Agenda/Project/Deadline service interfaces. noether built half of these; the other half noether mapped during the chat design pass.
- Anthropic Go client is novel in paliad but is small and well-specified by §6.2 + the `claude-api` skill.
- Frontend SSE consumer + chip parser is a one-page TS file.
---
## §11 End of design — STOP
This is the inventor deliverable. Per the role brief: **STOP after design. Do not begin implementation. Do not load `/mai-coder`.** Wait for m's explicit go/no-go on the questions in §8.5 before any coder shift starts.
The completion signal sent to head will use the literal phrase **"DESIGN READY FOR REVIEW"** so the head's gate fires.