ImaGen/CLAUDE.md
mAi 2758c5a500 mAi: #8 - imagen.jobs queue + worker subcommand (flexsiebels write path)
Async write path for the flexsiebels owner-mode UI: flexsiebels INSERTs into
imagen.jobs, the worker on mRiver claims pending rows via LISTEN/NOTIFY +
5s safety poll, runs the same generate pipeline imagen generate uses, and
writes the result through internal/cloud into imagen.images.

- Schema migration imagen_jobs_init: table + status CHECK + two indexes +
  owner-scoped RLS + grants + AFTER INSERT trigger publishing on the
  imagen_jobs channel via pg_notify.
- internal/worker: DB-agnostic loop over a Queue interface. Drains the
  whole pending backlog on each wake. Job-scoped contexts are derived
  from Background so SIGTERM lets the in-flight generation finish (no
  half-state). ResetStaleRunning at startup unsticks rows left over from
  a previous crash. Eight unit tests cover the done / failed / missing-id /
  drain / NOTIFY-wake / shutdown / transient-error paths against a fake
  queue (no real Postgres in CI).
- cmd/imagen/worker.go: pgx-backed Queue (one dedicated conn for LISTEN +
  UPDATE), plus the workerPipeline that reuses buildBackend +
  attachUsageSink + prompt.Apply + buildWriter + maybeCloudSync. The
  per-job owner_user_id overrides the env-level fallback so each row in
  imagen.images is attributed correctly.
- maybeCloudSync now returns (*cloud.SyncResult, error) so the worker can
  link imagen.jobs.image_id to the inserted imagen.images row. The CLI
  generate path keeps printing its stderr summary unchanged.
- scripts/imagen-worker.service + .env.example for the systemd --user unit
  on mRiver. EnvironmentFile lives in ~/.dotfiles and is never committed.
- docs/setup-worker-mriver.md walks through installation + the spec's
  SQL-INSERT smoke; docs/architecture.md grows an "async write path"
  section.
- worker_integration_test.go (env-guarded by IMAGEN_WORKER_INTEGRATION=1)
  drives one real job through the full pipeline against msupabase using
  the mock backend, then verifies imagen.images + Storage object landed
  and the row flipped to done with image_id linked. Verified end-to-end:
  pickup latency ~7ms, total 74ms, failure path captures error text.
2026-05-11 10:23:33 +02:00

ImaGen — Project Instructions

ImaGen is a model-agnostic image-generation framework. It has a single opinionated CLI (imagen) that dispatches to whichever backend the user configured — local FLUX on mRock via ComfyUI today, Replicate or DALL-E tomorrow, something else next year. The framework owns plumbing (config, output, naming, sidecars, prompt enrichment); each adapter owns the schema and lifecycle of its own block in ~/.config/imagen.yaml.

Architecture

cmd/imagen/                CLI shell — generate, worker, backends, config, serve
internal/backend/          Backend interface + Registry + Mock reference impl
internal/prompt/           Style preset registry (embedded styles.yaml)
internal/output/           Filename templating, image writer, JSON sidecar
internal/config/           YAML loader, validation, sample generator
internal/cloud/            Supabase Storage + imagen.images writer
internal/usage/            mai.imagen_usage cost-tracking sink
internal/worker/           imagen.jobs queue consumer (DB-agnostic via Queue interface)
internal/server/           HTTP stub (not implemented yet — follow-up issue)
scripts/                   imagen-worker.service + env template, ComfyUI scripts
docs/                      architecture.md, usage.md, setup-worker-mriver.md
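
The internal/worker entry is terse, so here is a rough sketch of the loop the commit note above describes: drain the whole backlog on each wake, sleep until a NOTIFY wake, the safety poll, or shutdown, and run each job on a context derived from Background so a shutdown does not cut off the in-flight generation. Everything named here (Job, the Queue method set, Pipeline, memQueue, demoShutdown) is an assumption made for illustration, not the real API, and ResetStaleRunning is left out for brevity.

```go
package main

import (
	"context"
	"time"
)

// Job is a hypothetical, trimmed-down view of one imagen.jobs row.
type Job struct {
	ID     string
	Prompt string
}

// Queue is an assumed method set; the real interface may differ.
type Queue interface {
	ClaimNext(ctx context.Context) (*Job, error) // nil job: backlog is empty
	MarkDone(ctx context.Context, id, imageID string) error
	MarkFailed(ctx context.Context, id, reason string) error
}

// Pipeline runs one generation and returns the id of the stored image.
type Pipeline func(ctx context.Context, job *Job) (imageID string, err error)

// Run drains the backlog, then waits for a wake signal, the safety poll,
// or shutdown. wake would be fed by the LISTEN connection; poll must be > 0.
func Run(ctx context.Context, q Queue, run Pipeline, wake <-chan struct{}, poll time.Duration) {
	ticker := time.NewTicker(poll)
	defer ticker.Stop()
	for {
		drain(ctx, q, run)
		select {
		case <-ctx.Done():
			return
		case <-wake:
		case <-ticker.C:
		}
	}
}

// drain claims jobs until the backlog is empty or shutdown was requested.
func drain(ctx context.Context, q Queue, run Pipeline) {
	for ctx.Err() == nil {
		job, err := q.ClaimNext(ctx)
		if err != nil || job == nil {
			return // transient error or empty backlog: retry on the next wake
		}
		// Deliberately not derived from ctx: a shutdown must not cancel
		// the generation that is already running.
		jobCtx := context.Background()
		imageID, err := run(jobCtx, job)
		if err != nil {
			_ = q.MarkFailed(jobCtx, job.ID, err.Error())
			continue
		}
		_ = q.MarkDone(jobCtx, job.ID, imageID)
	}
}

// memQueue is an in-memory fake, in the spirit of the fake queue used in tests.
type memQueue struct {
	pending []*Job
	status  map[string]string
}

func (q *memQueue) ClaimNext(context.Context) (*Job, error) {
	if len(q.pending) == 0 {
		return nil, nil
	}
	job := q.pending[0]
	q.pending = q.pending[1:]
	return job, nil
}

func (q *memQueue) MarkDone(_ context.Context, id, imageID string) error {
	q.status[id] = "done:" + imageID
	return nil
}

func (q *memQueue) MarkFailed(_ context.Context, id, reason string) error {
	q.status[id] = "failed:" + reason
	return nil
}

// demoShutdown requests shutdown while job "a" is in flight and reports the
// recorded statuses plus the number of jobs still pending afterwards.
func demoShutdown() (map[string]string, int) {
	q := &memQueue{
		pending: []*Job{{ID: "a", Prompt: "a cat"}, {ID: "b", Prompt: "a dog"}},
		status:  map[string]string{},
	}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	run := func(jobCtx context.Context, job *Job) (string, error) {
		cancel() // simulate SIGTERM arriving mid-generation
		if err := jobCtx.Err(); err != nil {
			return "", err
		}
		return "img-" + job.ID, nil
	}
	drain(ctx, q, run)
	return q.status, len(q.pending)
}
```

In demoShutdown the first job still ends up as done although shutdown arrived while it was running, and the second job stays pending for the next worker start.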

Data flow for imagen generate:

  1. Parse flags, load config (internal/config).
  2. Resolve the requested instance name to a config block, then the block's type to a registered constructor in backend.Default.
  3. Apply style preset (internal/prompt) to the prompt.
  4. Call backend.Generate(ctx, Request). The adapter returns a *Result with an image stream + metadata.
  5. Stream to disk via internal/output. If write_metadata_json is on, a sidecar <image>.json is written next to it.
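
Steps 3 and 5 above can be sketched as two small helpers. This is an illustration under assumptions, not the code in internal/prompt or internal/output: the stylePresets map stands in for the embedded styles.yaml, the preset text is invented, the names applyStyle and writeImage are hypothetical, and the sidecar is assumed to be the image path with .json appended.

```go
package main

import (
	"encoding/json"
	"io"
	"os"
	"path/filepath"
)

// stylePresets is a hypothetical stand-in for the embedded styles.yaml.
var stylePresets = map[string]string{
	"blog-header": "wide composition, soft lighting",
}

// applyStyle appends the preset text to the prompt. Unknown or empty
// preset names leave the prompt unchanged.
func applyStyle(prompt, preset string) string {
	suffix, ok := stylePresets[preset]
	if !ok {
		return prompt
	}
	return prompt + ", " + suffix
}

// writeImage streams img to dir/name and, when sidecar is true, writes the
// metadata as JSON next to it. It returns the image path.
func writeImage(dir, name string, img io.Reader, meta map[string]any, sidecar bool) (string, error) {
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return "", err
	}
	path := filepath.Join(dir, name)
	f, err := os.Create(path)
	if err != nil {
		return "", err
	}
	// io.Copy streams the image without buffering it fully in memory.
	if _, err := io.Copy(f, img); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}
	if !sidecar {
		return path, nil
	}
	data, err := json.MarshalIndent(meta, "", "  ")
	if err != nil {
		return "", err
	}
	// Assumed naming: the sidecar is the image path plus ".json".
	return path, os.WriteFile(path+".json", data, 0o644)
}
```

A caller would pass the Result's image stream as img and close it afterwards; the sidecar flag corresponds to output.write_metadata_json.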

Backend contract

type Backend interface {
    Name() string
    Generate(ctx context.Context, req Request) (*Result, error)
}

Request carries the cross-backend fields (prompt, negative, size, steps, seed, style preset, free-form BackendOpts). Result returns the image bytes via an io.ReadCloser, the MIME type, and a metadata map (model name, seed actually used, latency, cost-estimate, …).
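
To make the contract concrete, the sketch below pairs the interface with one possible shape for Request and Result and a tiny fixed-output backend. The field names and types are guesses based on the description above (only the field categories are documented here), and staticBackend and generateOnce are hypothetical helpers, not the Mock reference implementation.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"io"
)

// Request: assumed field layout for the cross-backend fields.
type Request struct {
	Prompt      string
	Negative    string
	Size        string // e.g. "1024x1024"
	Steps       int
	Seed        int64
	StylePreset string
	BackendOpts map[string]any
}

// Result: assumed field layout. The caller owns Image and must close it.
type Result struct {
	Image    io.ReadCloser
	MIMEType string
	Metadata map[string]any
}

type Backend interface {
	Name() string
	Generate(ctx context.Context, req Request) (*Result, error)
}

// staticBackend is a hypothetical backend that always returns the same bytes.
type staticBackend struct {
	name    string
	payload []byte
}

func (b *staticBackend) Name() string { return b.name }

func (b *staticBackend) Generate(ctx context.Context, req Request) (*Result, error) {
	if err := ctx.Err(); err != nil {
		return nil, err // honour cancellation before doing any work
	}
	if req.Prompt == "" {
		return nil, errors.New("prompt must not be empty")
	}
	return &Result{
		Image:    io.NopCloser(bytes.NewReader(b.payload)),
		MIMEType: "image/png",
		Metadata: map[string]any{"model": b.name, "seed": req.Seed},
	}, nil
}

// generateOnce shows the caller's side of the contract: call Generate,
// drain the stream, and always close it.
func generateOnce(b Backend, prompt string) ([]byte, string, error) {
	res, err := b.Generate(context.Background(), Request{Prompt: prompt})
	if err != nil {
		return nil, "", err
	}
	defer res.Image.Close()
	data, err := io.ReadAll(res.Image)
	return data, res.MIMEType, err
}
```

Returning an io.ReadCloser rather than a byte slice lets the framework stream large images straight to disk, at the cost of making the caller responsible for Close.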

Adding a new adapter

  1. Create internal/backend/<adapter>.go (e.g. comfyui.go). Define a struct that holds whatever the adapter needs (HTTP client, model id, token).
  2. Add a constructor func New<Adapter>(name string, cfg map[string]any) (Backend, error). Read fields from cfg — that map is the adapter's own block from imagen.yaml minus the type: key. Resolve secrets from env vars (api_token_env, api_key_env) — never accept tokens inline.
  3. Implement Name() (return the user-facing instance name) and Generate(ctx, Request).
  4. In init() call Register("<type-name>", New<Adapter>).
  5. If the adapter lives in a separate package, blank-import it (import _ "<package path>") from cmd/imagen/main.go so that its init() runs.
  6. Add a smoke test under internal/backend/<adapter>_test.go. Network tests should be guarded by testing.Short() or an env var.
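
Steps 1 to 4 might come together roughly as follows. This is a minimal sketch, assuming a map-backed registry: the echo adapter, its model field, and the Build helper are hypothetical, Request and Result are cut down to the bare minimum, and the real registry in internal/backend may be organised differently.

```go
package main

import (
	"context"
	"fmt"
)

// Minimal stand-ins for the framework types.
type Request struct{ Prompt string }
type Result struct{ Metadata map[string]any }

type Backend interface {
	Name() string
	Generate(ctx context.Context, req Request) (*Result, error)
}

// Constructor matches the New<Adapter> signature from step 2.
type Constructor func(name string, cfg map[string]any) (Backend, error)

// registry is an assumed implementation detail: type name -> constructor.
var registry = map[string]Constructor{}

// Register panics on duplicates so a clash surfaces at program start.
func Register(typeName string, c Constructor) {
	if _, dup := registry[typeName]; dup {
		panic("backend type registered twice: " + typeName)
	}
	registry[typeName] = c
}

// Build (hypothetical) resolves a type name to its constructor and calls it.
func Build(instance, typeName string, cfg map[string]any) (Backend, error) {
	c, ok := registry[typeName]
	if !ok {
		return nil, fmt.Errorf("unknown backend type %q", typeName)
	}
	return c(instance, cfg)
}

// echoBackend is a made-up adapter used only for this illustration.
type echoBackend struct {
	name  string
	model string
}

// NewEcho reads the adapter's own fields from cfg and validates them.
func NewEcho(name string, cfg map[string]any) (Backend, error) {
	model, ok := cfg["model"].(string)
	if !ok || model == "" {
		return nil, fmt.Errorf("backend %q: field \"model\" is required", name)
	}
	return &echoBackend{name: name, model: model}, nil
}

// Name returns the user-facing instance name, not the type name.
func (b *echoBackend) Name() string { return b.name }

func (b *echoBackend) Generate(ctx context.Context, req Request) (*Result, error) {
	return &Result{Metadata: map[string]any{"model": b.model, "prompt": req.Prompt}}, nil
}

// init registers the adapter as soon as its package is linked in.
func init() { Register("echo", NewEcho) }
```

With this in place, Build("my-echo", "echo", map[string]any{"model": "m1"}) returns a backend whose Name() is "my-echo", while an unregistered type or a missing model field yields an error.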

Config

~/.config/imagen.yaml (override with --config). Top-level keys:

  • default_backend — instance name used when --backend is omitted.
  • output.directory / output.naming / output.write_metadata_json.
  • backends: — map of instance-name → {type, …adapter-specific…}.

The framework parses type and stuffs the rest into BackendSpec.Raw. The adapter is free to define any schema it likes inside its block.
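
The split between type and the adapter-owned remainder can be illustrated like this. The sketch assumes the YAML block has already been decoded into a map[string]any; BackendSpec.Raw is named above, while the Type field and the splitSpec helper are assumptions.

```go
package main

import "fmt"

// BackendSpec: Raw is mentioned in the text above, the Type field is assumed.
type BackendSpec struct {
	Type string
	Raw  map[string]any
}

// splitSpec (hypothetical) pulls "type" out of one backends: entry and
// hands everything else to the adapter untouched.
func splitSpec(instance string, block map[string]any) (BackendSpec, error) {
	t, ok := block["type"].(string)
	if !ok || t == "" {
		return BackendSpec{}, fmt.Errorf("backends.%s: missing or non-string \"type\"", instance)
	}
	raw := make(map[string]any, len(block))
	for k, v := range block {
		if k != "type" {
			raw[k] = v // the framework does not interpret these keys
		}
	}
	return BackendSpec{Type: t, Raw: raw}, nil
}
```

Copying into a fresh map keeps the decoded config intact, so the same block can be validated or printed later without the type key having gone missing.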

Credentials

Never hardcode. Always reference env-var names from the config:

flux-dev-replicate:
  type: replicate
  api_token_env: REPLICATE_API_TOKEN

The adapter then calls os.Getenv("REPLICATE_API_TOKEN") at construction and fails fast if the variable is unset. Tokens never appear in imagen.yaml in plaintext.
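
A constructor could resolve the token along these lines. It is a sketch under assumptions: resolveToken and resolveTokenFrom are hypothetical helper names, and the injectable lookup function exists only so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// resolveToken reads the env-var name from the adapter's config block and
// looks the token up in the process environment.
func resolveToken(instance string, cfg map[string]any) (string, error) {
	return resolveTokenFrom(instance, cfg, os.Getenv)
}

// resolveTokenFrom does the work with an injectable lookup function.
func resolveTokenFrom(instance string, cfg map[string]any, lookup func(string) string) (string, error) {
	envName, ok := cfg["api_token_env"].(string)
	if !ok || envName == "" {
		return "", fmt.Errorf("backend %q: api_token_env must name an environment variable", instance)
	}
	token := lookup(envName)
	if token == "" {
		// Fail fast at construction instead of on the first request.
		return "", fmt.Errorf("backend %q: environment variable %s is not set", instance, envName)
	}
	return token, nil
}
```

The error messages name the variable but never echo a token value, which keeps secrets out of logs.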

How the /imagine skill calls into imagen

The skill (issue #4) wraps imagen generate and post-processes the path it prints on stdout. Slash-command surface area:

/imagine "a cat in a fishbowl" --style blog-header --size 1024x1024

The skill resolves to imagen generate "<prompt>" --backend <default> … and returns the image path so otto can attach it to a chat reply.

References

  • mAi project conventions: ~/.m/docs/msystem.md
  • Backend follow-ups: ImaGen issues #2 (ComfyUI on mRock), #3 (Replicate), #4 (skill)
  • mRock GPU: NVIDIA RTX 4070 Ti SUPER, 16 GB VRAM, runs Ollama + F5-TTS

House rules

  • No technical debt. No TODOs in landed code. If something can't be done now, open an issue.
  • All user-facing strings: ASCII or proper Unicode (Umlaute), never ae/oe/ue.
  • Tests live next to the package they cover (*_test.go). No tests/ dir.
  • go build ./... and go test ./... must be clean before any commit.
  • Run task build (or make build) for the full build; both call into go build -o bin/imagen ./cmd/imagen.