# aggregator-refactor: Phase 5a

**Task:** `t-projax-5a-aggregator`
**Status:** in progress
**Date:** 2026-05-21

## Why

Five `collectXxx` functions in `web/` all open-code the same fan-out shape: take `[]*store.Item` → look up each item's `item_links` of a given `ref_type` → fan out across the matching links with a 4-worker pool → reduce to typed rows. Concretely today:

- `web/dashboard.go:447 collectTasks` (CalDAV VTODOs)
- `web/dashboard.go:585 collectIssues` (Gitea issues)
- `web/dashboard.go:832 collectEvents` (CalDAV VEVENTs)
- `web/timeline.go:539 collectTimelineTodos` (CalDAV VTODOs; overlaps collectTasks)
- `web/timeline.go:624 collectTimelineEvents` (CalDAV VEVENTs; overlaps collectEvents)

Plus `mcp/tools.go:19` declares `TimelineBuilder` so the MCP layer can call `*web.Server.BuildTimelinePayloadFromArgs`. That points the dependency arrow `mcp → web`, which is the wrong way round; both should depend on a shared aggregator.

## What ships

A new package `internal/aggregate/` that concentrates fan-out across linked items, plus the lifted day-grouping helpers from `web/timeline.go`. After the four slices land:

- Dashboard's three collect functions are 5–10 line shims over the aggregator.
- Timeline's two collect functions are 5–10 line shims plus a call to `aggregate.BuildTimelineDays`.
- `mcp.TimelineBuilder` is gone; `RegisterProjaxTools` takes a `*aggregate.Aggregator` directly.
- Worker-pool body, link-fanout, per-source error logging, day grouping, sort-within-day, sticky-pill logic, far-future fade all live in one package instead of being duplicated across three.

No behaviour change. All existing tests stay green at every slice boundary.

## Design (settled in the task brief)

### Package layout

```
internal/aggregate/
  aggregator.go          - Aggregator struct, constructor, the five methods + All
  rows.go                - Row types + TimelineRow sum + Result envelope
  timeline_days.go       - BuildTimelineDays + sort/label/duration helpers
  aggregator_test.go     - fan-out + per-source error tests (stub clients)
  timeline_days_test.go  - grouping/sort/sticky/fade tests
```

### Dependencies (interfaces, not concrete clients)

```go
type CalDAVClient interface {
	ListTodos(ctx, calendarURL) ([]caldav.Todo, error)
	ListEvents(ctx, calendarURL, opts) ([]caldav.Event, error)
}

type GiteaClient interface {
	ListIssues(ctx, owner, repo, opts) ([]gitea.Issue, error)
}

type LinkLister interface {
	LinksByType(ctx, itemID, refType) ([]*store.ItemLink, error)
	DatedLinksRange(ctx, from, to) ([]*store.ItemLinkWithItem, error)
	ItemsCreatedInRange(ctx, from, to) ([]*store.Item, error)
}

type IssueCache interface {
	Get(key) ([]gitea.Issue, bool)
	Set(key, []gitea.Issue)
}
```

`*caldav.Client`, `*gitea.Client`, `*store.Store` already match by method set. The existing `web.issueCache` gains exported `Get`/`Set` aliases (it already has lower-case versions) so it satisfies `IssueCache` unchanged otherwise.

### Methods

- `Todos(ctx, items, Window) []TodoRow`: empty Window = no narrowing (dashboard pattern); non-zero Window narrows by Due for open todos and LastModified (Due fallback) for done/cancelled (timeline pattern).
- `Events(ctx, items, Window) []EventRow`: Window required (CalDAV REPORT needs a time-range filter).
- `Issues(ctx, items) []IssueRow`: no window; upstream `updated_at` ordering carries the recency signal.
- `Docs(ctx, items, Window) []DocRow`: wraps `DatedLinksRange`, filters to items in the caller's allow-set.
- `Creations(ctx, items, Window) []CreationRow`: wraps `ItemsCreatedInRange`, filters to items in the allow-set.
- `All(ctx, items, AllOpts) Result`: convenience for MCP timeline.

### Row types

All row types embed their primitive (`caldav.Todo`, `caldav.Event`, `gitea.Issue`) so html/template's existing `.Todo.UID` / `.Event.Summary` field access keeps working via Go field promotion. Template diffs in Slice B/C stay minimal.

```go
type TodoRow struct {
	Item        *store.Item
	CalendarURL string
	caldav.Todo
}

type EventRow struct {
	Item *store.Item
	caldav.Event
}

type IssueRow struct {
	Item *store.Item
	Repo string
	gitea.Issue
}

type DocRow struct { Item *store.Item; Link *store.ItemLink }

type CreationRow struct { Item *store.Item }
```

### TimelineRow

Pointer-tagged sum type lifted into the package. Templates and the sort/group helpers consume it.

```go
type TimelineRow struct {
	Date     time.Time
	Kind     string // "todo" | "event" | "doc" | "creation"
	Item     *store.Item
	ItemPath string

	Todo     *TodoRow
	Event    *EventRow
	Doc      *DocRow
	Creation *CreationRow

	// Display-side fields the template references directly. Kept flat so
	// the existing template syntax doesn't change.
	CalendarURL  string
	StartLabel   string
	DurationHint string
	Link         *store.ItemLink
	PER          string
	FarFuture    bool
}
```

### Day grouping

`aggregate.BuildTimelineDays(rows []TimelineRow, opts BuildOpts) []TimelineDay` takes pre-built rows, groups by `YYYY-MM-DD`, sorts each day's rows (timed events → all-day → todos → docs → creations), and applies sticky-pill markers for today/tomorrow. `BuildOpts` carries `Now`, `Order ("asc"|"desc")`, optional `TodayKey`/`TomorrowKey` overrides for test determinism.

### Per-source error handling

Preserved from today: each per-calendar / per-repo failure is logged at WARN and the affected job is dropped. The remaining rows are returned. Banner-surfacing for unreachable upstreams is out of scope for this refactor (parked under §"Future work" below).

### Worker pool

Per-call pool with 4 workers, same as today across all five functions. Created and torn down per aggregation call. No shared instance.

## Slicing

| Slice | What lands | Verification |
|-------|-----------|-------------|
| A | docs/plans/aggregator-refactor.md (this file) + `internal/aggregate/` package + unit tests. No web/mcp wiring yet. | `go build ./...` + `go test ./internal/aggregate/...` + `strings \| grep -c internal/aggregate` ≥ 1 |
| B | `web/timeline.go` consumes the aggregator. `Server.aggregator` wired in `web.New`. Templates updated where the row type changes. | `go test ./web/... -run Timeline` green unmodified, `/timeline` renders, SHA on `/healthz` matches push. |
| C | `web/dashboard.go` consumes the aggregator. Dashboard-specific bucketing/cap stays put. | `go test ./web/... -run Dashboard` green unmodified, `/dashboard` renders. |
| D | `mcp.TimelineBuilder` deleted. `RegisterProjaxTools` takes `*aggregate.Aggregator`. `cmd/projax/main.go` updated. `BuildTimelinePayloadFromArgs` removed or inlined. | `go test ./mcp/... -run Timeline` green, live `/mcp/rpc timeline` returns valid payload. |

Each slice ships behind its own commit + merge + deploy + verification triple (per CLAUDE.md: SHA on `/healthz` matches `git rev-parse HEAD`).

## Test plan

Unit tests in `internal/aggregate/` use in-memory stub implementations of `CalDAVClient`, `GiteaClient`, `LinkLister`, `IssueCache`. The tests cover:

- Empty `items` slice: every method returns an empty slice without touching the network stubs.
- Items with no links of the relevant `ref_type`: same.
- Items with one matching link: single fetch.
- Items with multiple matching links across multiple items: fan-out hits each (verified by stub call counter).
- Per-calendar error from the CalDAV stub: logged, surviving rows returned.
- Per-repo error from the Gitea stub: same.
- Issue cache hit path: second call doesn't hit the stub when the cache returns a value.
- `BuildTimelineDays` ordering: desc default; asc when requested; day group counts; sticky pill for today/tomorrow; far-future fade.
- `BuildTimelineDays` within-day sort: timed events before all-day, todos after events, docs after todos, creations last; ties broken by Summary / PER / Item.Slug.

Integration coverage stays in `web/...` and `mcp/...` and continues to exercise the real wiring through the Slice B/C/D ports.

## MCP filter-parity note (post-slice-D, 2026-05-22)

Slice D moved MCP item resolution from `web.TreeFilter` to `store.ListByFilters`. The dimensions that round-trip identically:

- `tags`: AND-match, unchanged.
- `q`: substring match, unchanged.
- `kinds`: unchanged (drives `aggregate.AllOpts.Kinds`).
- `from`/`to`/`order`: unchanged.
- `has`: explicit in-memory narrow against `store.LinksByRefType` (caldav-list / gitea-repo only).
- `include_excluded`: explicit in-memory filter against each item's `timeline_exclude` array.

Narrowed dimensions in the MCP path (vs. web TreeFilter):

- `status`: first value wins (single-value at the store layer). TreeFilter accepts multiple. Use case is rare: most calls default to `["active"]`.
- `mgmt`: AND-match (item must carry every named mode). TreeFilter used OR semantics including a synthetic `"unmanaged"` matcher. Reachable workaround: omit `mgmt` and filter the returned items client-side.

Not a regression worth fixing in 5a: every documented MCP call from m and from otto-PWA uses tag + default status. If the gap bites, the fix is to either teach `store.ListByFilters` to accept multi-value status / OR-mgmt, or to lift `TreeFilter` into a neutral package and call it from both web/ and mcp/.

## Future work (out of scope for 5a)

- Banner-surfacing for upstream failures (calendar unreachable, repo renamed): today's silent log+continue stays. Filing as a §8 design follow-up.
- Shared worker-pool instance across calls: not warranted at m's scale; per-call pool is fine.
- Dashboard cache shape refactor: that's Phase 5b (candidate 3).
- Item-write validation module: Phase 5c (candidate 2).

## References

- Task `t-projax-5a-aggregator`
- Existing collect functions: `web/dashboard.go:447,585,832`, `web/timeline.go:539,624`
- Wrong-way layering: `mcp/tools.go:19` (TimelineBuilder)
- CLAUDE.md § "Post-deploy verification (mandatory)"