ImaGen/docs/usage.md
mAi 8435817ce1 mAi: #10 - multi-model backend expansion (workflow templates + compare harness)
Path 1 architecture: one comfyui adapter, workflows as data.

- workflow_template.go: embed.FS + token substitution with type-preserving
  whole-value placeholders. ${prompt} → string, ${seed} → int64,
  ${cfg} → float64 — no JSON round-tripping. Partial matches ignored.
- comfyui.go: refactored to load workflow from embedded FS or filesystem
  path. Back-compat preserved: workflow: defaults to flux1-schnell.
- workflows/{flux1-schnell,flux2-klein,sd35-medium}.json — bundled
  templates. flux1-schnell migrated from hardcoded with identical node IDs.
- compare.go: new `imagen compare` subcommand. Sequential N-backend run
  (one GPU on mRock — parallel would OOM), per-backend PNG, sidecar JSON
  with per-model metadata + errors, composite contact sheet via Go image
  package (no ImageMagick dep).
- Sample config gains flux2-klein-local + sd35-medium-local instances.
- docs/backends.md: architecture rationale + per-model HF download paths
  + how to add a new bundled workflow + compare-harness reference.

Live smoke verified: compare mock + flux-schnell-local at 768×768 →
both PNGs written, sidecar JSON has workflow="flux1-schnell" + full
metadata, contact sheet renders. Worker contract (Request → Generate)
unchanged, so flexsiebels /imagine UI API surface preserved.

Tests: 11 existing comfyui + 6 new workflow_template + 5 new compare
tests, all green.

Adding a new model is now yaml + JSON, never Go.
2026-05-11 17:29:57 +02:00


Using imagen

Subcommands

imagen generate <prompt> [flags]    generate one image
imagen compare <prompt> --models a,b,c [flags]
                                    run one prompt across N backends + contact sheet
imagen worker [flags]               consume the imagen.jobs queue (daemon)
imagen backends                     list configured + registered backends
imagen config init                  print a sample imagen.yaml on stdout
imagen config validate              parse + validate the active config
imagen config path                  print the resolved config path
imagen serve [--addr :8080]         (stub) start the HTTP server
imagen usage [--since DATE]         show cost-tracking rows
imagen version                      print version

For the per-backend setup (FLUX.1, FLUX.2 [klein], SD3.5 medium, …) and the architecture rationale, see backends.md.

generate flags

| Flag | Default | Notes |
| --- | --- | --- |
| --backend | default_backend from config | Instance name from imagen.yaml |
| --size | 1024x1024 | WxH |
| --seed | 0 (= backend default) | |
| --steps | 0 (= backend default) | |
| --style | empty | One of imagen config init's style names |
| --negative | empty | Negative prompt (ignored by some adapters) |
| --output | empty (= use naming template) | Explicit path |
| --no-sidecar | false | Skip the JSON sidecar even if config enables it |
| --preview | (auto) | Force open a tmux preview window via tmux-img |
| --no-preview | (auto) | Suppress the preview window (use for batch / CI callers) |
| --no-cloud | false | Skip Supabase upload + imagen.images insert for this call |
| --config | ~/.config/imagen.yaml | Override config path |

Preview window

After a successful generate, imagen optionally opens a sibling tmux window named img:<slug> running tmux-img --hold <path>. The new window is spawned in the background (tmux new-window -d) so the generating pane keeps focus and its terminal output.

Resolution order is config → $IMAGEN_PREVIEW → flag (later wins):

  • output.preview in imagen.yaml: auto (default) | on | off
  • IMAGEN_PREVIEW=auto|on|off overrides config
  • --preview / --no-preview override env

auto previews iff stdout is a TTY and $TMUX is set. on previews unconditionally and errors outside a tmux session. off never previews.

Preview failures are non-fatal: the image has already been written.
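
The resolution order and the auto rule above can be sketched in Go as follows. This is a minimal illustration, not imagen's actual code: the function names resolvePreviewMode and shouldPreview are hypothetical, an empty string stands for "not set at this level", and mapping --preview to "on" and --no-preview to "off" is an assumption based on the flag descriptions.

```go
package main

import "fmt"

// resolvePreviewMode applies the documented precedence: config, then
// $IMAGEN_PREVIEW, then flag. An empty string means "not set at this level".
// Hypothetical helper; the real implementation is not shown in this document.
func resolvePreviewMode(config, env, flag string) string {
	mode := "auto" // documented default for output.preview
	for _, v := range []string{config, env, flag} {
		if v != "" {
			mode = v // later sources win
		}
	}
	return mode
}

// shouldPreview decides whether to open the preview window.
// "on" outside a tmux session is the only documented error case;
// rejecting unknown modes is an assumption.
func shouldPreview(mode string, stdoutIsTTY, inTmux bool) (bool, error) {
	switch mode {
	case "off":
		return false, nil
	case "on":
		if !inTmux {
			return false, fmt.Errorf("preview: on requires a tmux session")
		}
		return true, nil
	case "auto":
		return stdoutIsTTY && inTmux, nil
	default:
		return false, fmt.Errorf("preview: unknown mode %q", mode)
	}
}

func main() {
	// Config says auto, env says on, --no-preview was passed: the flag wins.
	mode := resolvePreviewMode("auto", "on", "off")
	ok, err := shouldPreview(mode, true, true)
	fmt.Println(mode, ok, err)
}
```

Keeping resolution separate from the TTY and tmux checks makes each half easy to test without a real terminal.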

Examples

# Quick smoke test — mock backend ships in-tree
imagen generate "test" --backend mock --output /tmp/x.png

# Real generation, FLUX-schnell on mRock via ComfyUI
imagen generate "a wide editorial blog header about RAG systems" \
  --backend flux-schnell-local \
  --style blog-header \
  --size 1536x768

# Explicit seed for reproducibility
imagen generate "a cat in a fishbowl" --backend mock --seed 42 --output /tmp/cat.png

Config

imagen config init prints a complete sample config. Each adapter receives only its own sub-block; see ../CLAUDE.md for the contract.

Naming template

output.naming placeholders:

| Placeholder | Replaced with |
| --- | --- |
| {date} | 2026-05-08 |
| {time} | 143015 (no separators) |
| {slug} | lowercased ASCII prompt, ≤ 40 chars |
| {seed} | seed actually used |
| {backend} | backend instance name |
| {ext} | file extension matching Result.MimeType |

Unknown placeholders are left literal.
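
The substitution rule, including the "unknown placeholders stay literal" behavior, could look roughly like this in Go. It is a sketch under assumptions: expandNaming is a hypothetical name, the template strings and values are made up for illustration, and the real implementation may match placeholders differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholderRe matches {name} tokens made of lowercase letters or underscores.
var placeholderRe = regexp.MustCompile(`\{[a-z_]+\}`)

// expandNaming replaces known {placeholders} with their values and leaves
// unknown ones untouched. Hypothetical helper, not imagen's actual function.
func expandNaming(tmpl string, values map[string]string) string {
	return placeholderRe.ReplaceAllStringFunc(tmpl, func(m string) string {
		if v, ok := values[m[1:len(m)-1]]; ok {
			return v
		}
		return m // unknown placeholder: keep it literal
	})
}

func main() {
	// Illustrative values only.
	values := map[string]string{
		"date": "2026-05-08", "time": "143015", "slug": "a-cat-in-a-fishbowl",
		"seed": "42", "backend": "mock", "ext": "png",
	}
	fmt.Println(expandNaming("{date}/{slug}-{seed}.{ext}", values))
	// prints "2026-05-08/a-cat-in-a-fishbowl-42.png"
	fmt.Println(expandNaming("{backend}-{unknown}.{ext}", values))
	// prints "mock-{unknown}.png"
}
```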

Credentials

API-backed adapters read tokens from env vars referenced by the config (api_token_env, api_key_env). Never put a token in imagen.yaml.

export REPLICATE_API_TOKEN=...
imagen generate "a cat" --backend flux-dev-replicate
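
As a rough Go sketch of that indirection: the config stores only the name of an environment variable, and the adapter looks the secret up at run time. The function name tokenFromEnv and the error wording are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// tokenFromEnv resolves a secret through the env var name stored in config
// (for example the value of api_token_env), so the token itself never
// appears in imagen.yaml. Hypothetical helper for illustration.
func tokenFromEnv(envName string) (string, error) {
	if envName == "" {
		return "", fmt.Errorf("no token env var configured")
	}
	tok, ok := os.LookupEnv(envName)
	if !ok || tok == "" {
		return "", fmt.Errorf("env var %s is not set", envName)
	}
	return tok, nil
}

func main() {
	if _, err := tokenFromEnv("REPLICATE_API_TOKEN"); err != nil {
		fmt.Println("imagen:", err)
		return
	}
	fmt.Println("token found")
}
```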

Cost-tracking (Replicate)

Successful generations through the Replicate adapter write one row to mai.imagen_usage on Supabase: backend, model, latency, per-image cost estimate, the SHA-256 hash of the prompt (never the prompt itself), and the caller identity (resolved from MAI_FROM_ID or the tmux pane's @mai-name).

The writer is best-effort. If SUPABASE_URL / SUPABASE_SERVICE_KEY are unset, or the database write fails, the image still lands and the CLI prints a warning to stderr.
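
Storing a hash rather than the prompt can be done with the Go standard library alone. The sketch below assumes lowercase hex encoding of the digest and hashing the raw prompt bytes; the document does not state either, and promptHash is a hypothetical name.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// promptHash returns the SHA-256 digest of the prompt as lowercase hex.
// Hex encoding is an assumption; only "sha256 hash" is documented.
func promptHash(prompt string) string {
	sum := sha256.Sum256([]byte(prompt))
	return hex.EncodeToString(sum[:])
}

func main() {
	h := promptHash("a cat in a fishbowl")
	fmt.Println(len(h), h) // 64 hex characters, then the digest itself
}
```

The same prompt always yields the same hash, so rows can be grouped by prompt without the text ever leaving the machine.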

Inspect spend:

imagen usage                       # all rows, grouped by week + backend + model + caller
imagen usage --since 2026-05-01    # only rows on/after a UTC date
imagen usage --since 2026-05-01 --raw

Per-model rates live in internal/backend/replicate_pricing.go — they are snapshotted from https://replicate.com/pricing and refreshed on a quarterly cadence.

Cloud-sync (Supabase)

Successful generations also upload the PNG to the private Supabase Storage bucket imagen-generated (path: <YYYY-MM-DD>/<slug>-<seed>.png) and insert a row into imagen.images. The row carries the prompt, sha256-hashed prompt, backend, model, seed/steps/width/height, latency, cost estimate, the full local sidecar JSON, and an empty tags array ready for the flexsiebels viewer to fill in.
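
A minimal Go sketch of how the object path might be assembled. Only the <YYYY-MM-DD>/<slug>-<seed>.png shape and "lowercased ASCII, ≤ 40 chars" come from this document. The slug rule (hyphens for runs of other characters) and the use of UTC for the date are assumptions, and slugify and storagePath are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// slugify lowercases the prompt, keeps a-z and 0-9, collapses every other
// run of characters into a single '-', and caps the result at 40 bytes.
// The hyphen rule is an assumption.
func slugify(prompt string) string {
	var b strings.Builder
	dash := false // true when the last byte written is '-'
	for _, r := range strings.ToLower(prompt) {
		switch {
		case (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9'):
			b.WriteRune(r)
			dash = false
		case !dash && b.Len() > 0:
			b.WriteByte('-')
			dash = true
		}
	}
	s := b.String() // pure ASCII, so byte slicing is safe
	if len(s) > 40 {
		s = s[:40]
	}
	return strings.Trim(s, "-")
}

// storagePath builds "<YYYY-MM-DD>/<slug>-<seed>.png" from a formatted day.
func storagePath(day, slug string, seed int64) string {
	return fmt.Sprintf("%s/%s-%d.png", day, slug, seed)
}

func main() {
	day := time.Now().UTC().Format("2006-01-02") // UTC is an assumption
	fmt.Println(storagePath(day, slugify("A cat in a fishbowl"), 42))
}
```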

Configuration:

  • owner_user_id in imagen.yaml — m's auth.users.id. Empty disables inserts (the column is NOT NULL).
  • output.cloud_sync in imagen.yaml: auto (default — on iff SUPABASE creds + owner_user_id are set), on (errors if either is missing), off.
  • IMAGEN_CLOUD_SYNC=auto|on|off overrides config.
  • --no-cloud overrides everything for one call.
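
The four rules above amount to a small decision function. The Go sketch below is illustrative only: cloudSyncEnabled is a hypothetical name, an empty string means "not set", and the error texts are made up.

```go
package main

import (
	"errors"
	"fmt"
)

// cloudSyncEnabled decides whether to upload for this call.
// config is output.cloud_sync, env is $IMAGEN_CLOUD_SYNC, noCloud is the
// --no-cloud flag. haveCreds and haveOwner report whether the Supabase
// credentials and owner_user_id are present. Hypothetical helper.
func cloudSyncEnabled(config, env string, noCloud, haveCreds, haveOwner bool) (bool, error) {
	if noCloud {
		return false, nil // --no-cloud overrides everything for one call
	}
	mode := "auto" // documented default
	if config != "" {
		mode = config
	}
	if env != "" {
		mode = env // env overrides config
	}
	ready := haveCreds && haveOwner
	switch mode {
	case "off":
		return false, nil
	case "on":
		if !ready {
			return false, errors.New("cloud sync: on requires Supabase credentials and owner_user_id")
		}
		return true, nil
	case "auto":
		return ready, nil
	default:
		return false, fmt.Errorf("cloud sync: unknown mode %q", mode)
	}
}

func main() {
	// auto with credentials but no owner_user_id: sync is silently skipped.
	ok, err := cloudSyncEnabled("", "", false, true, false)
	fmt.Println(ok, err)
}
```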

Cloud sync reuses the same Supabase environment variables as cost-tracking (SUPABASE_URL plus SUPABASE_SERVICE_KEY or MAI_SUPABASE_KEY). The service role bypasses RLS for inserts; the owner_user_id = auth.uid() policy on the table gates the read path that the flexsiebels viewer hits.

Failures (Storage 5xx, DB unreachable) emit imagen: cloud sync: <err> to stderr and the local PNG + sidecar stay put. Exit code is unchanged.