# ImaGen architecture

ImaGen is intentionally small. The framework owns plumbing; adapters own the upstream API. Each adapter only ever sees its own slice of `imagen.yaml`.

## Layers

```
┌───────────────────────┐
│   cmd/imagen          │   CLI dispatch (generate / worker / …)
│   (or HTTP server)    │
└──────────┬────────────┘
           │
┌──────────▼────────────┐
│   internal/prompt     │   style preset → prompt suffix
│   internal/output     │   filename templating, sidecar
│   internal/config     │   YAML loader, validation
│   internal/preview    │   tmux-img window spawner
│   internal/cloud      │   Supabase Storage + imagen.images
│   internal/usage      │   mai.imagen_usage cost-tracking
│   internal/worker     │   imagen.jobs queue consumer
└──────────┬────────────┘
           │
┌──────────▼────────────┐
│   internal/backend    │   Backend interface + Registry
└──────────┬────────────┘
           │
┌──────────▼────────────┐
│   adapters            │   ComfyUI · Replicate · OpenAI · …
│   (each one registers │   each registers a `type` name on
│   in init())          │   `backend.Default` at init time.
└───────────────────────┘
```

## The Backend contract

```go
type Request struct {
	Prompt         string
	NegativePrompt string
	Width, Height  int
	Steps          int
	Seed           int64
	Style          string
	BackendOpts    map[string]any
}

type Result struct {
	ImageReader io.ReadCloser
	MimeType    string
	Metadata    map[string]any
}

type Backend interface {
	Name() string
	Generate(ctx context.Context, req Request) (*Result, error)
}
```

Adapters translate `Request` into whatever the upstream expects. Fields they can't honour (e.g. `NegativePrompt` on DALL-E) are silently ignored.

## Registry

`backend.Default` holds the process-wide name → constructor map. Each adapter calls `backend.Register("", NewX)` from its `init()`. The CLI imports `internal/backend` (which transitively triggers the mock's init) and any extra adapter packages.
## Config flow

```
imagen.yaml
  backends:
    flux-schnell-local:
      type: comfyui                       ──┐
      base_url: http://mrock:8188           │  framework keeps `type`,
      model: flux1-schnell.safetensors      │  hands the rest to the
      default_steps: 4                      │  comfyui adapter as
                                          ──┘  cfg map[string]any
```

The framework never inspects fields below `type`. That's the adapter's contract with itself, expressed however the adapter wants (typed struct, map lookups, JSON tags; its call).

## Output

```
output:
  directory: ~/Pictures/imagen
  naming: "{date}-{slug}-{seed}.png"
  write_metadata_json: true
```

Placeholders: `{date}`, `{time}`, `{slug}` (lowercased prompt, alnum-only, truncated to 40 chars), `{seed}`, `{backend}`, `{ext}`.

The sidecar JSON contains the prompt, backend instance name, seed, ISO timestamp, and the `Result.Metadata` map verbatim.

## Where adapters fail fast

- Missing required field in their config block: return an error from the constructor; the CLI surfaces it as `imagen: backend "X": `.
- Unset env-var for credentials: same.
- Network errors during `Generate`: wrap and return; no retry policy yet (decide per-adapter, or move to a shared retry helper if a pattern emerges).

## Async write path: `imagen worker` + `imagen.jobs`

`imagen generate` is the synchronous CLI. For web callers (flexsiebels' owner-mode UI) `cmd/imagen worker` runs as a daemon that consumes the `imagen.jobs` table.

```
flexsiebels POST               imagen worker (mRiver, systemd)
  → INSERT INTO                  LISTEN imagen_jobs ◄── pg_notify trigger
    imagen.jobs(pending)         claim row (UPDATE … RETURNING)
                                 dispatch through internal/backend
                                 write disk + cloud-sync via internal/cloud
                                 UPDATE imagen.jobs SET status='done', image_id=…
```

The queue table lives next to `imagen.images` in the same `imagen` schema. Owner-scoped RLS lets the flexsiebels user INSERT + read their own rows; the worker writes (status updates + image_id link) via service-role which bypasses RLS.
A 5-second safety poll fires on every wake-up to cover dropped NOTIFY events and worker cold starts with a non-empty queue. See `docs/setup-worker-mriver.md` for the systemd installation.

The worker reuses `internal/backend`, `internal/output`, and `internal/cloud` unchanged; it is purely an orchestration layer around the same pipeline `imagen generate` drives.

## Out of scope (today)

- Image post-processing (cropping, watermarking).
- Multi-image `n>1` per request: backends that support it can expose it via `BackendOpts`; the framework doesn't have a first-class field yet.
- Job cancellation / kill switch: separate follow-up issue.
- Concurrent workers / multi-host scale-out: `FOR UPDATE SKIP LOCKED` in the claim query makes it cheap to add, but a single worker is the v1 setup.