Async write path for the flexsiebels owner-mode UI: flexsiebels INSERTs into imagen.jobs, the worker on mRiver claims pending rows via LISTEN/NOTIFY + 5s safety poll, runs the same generate pipeline imagen generate uses, and writes the result through internal/cloud into imagen.images.

- Schema migration imagen_jobs_init: table + status CHECK + two indexes + owner-scoped RLS + grants + AFTER INSERT trigger publishing on the imagen_jobs channel via pg_notify.
- internal/worker: DB-agnostic loop over a Queue interface. Drains the whole pending backlog on each wake. Job-scoped contexts are derived from Background so SIGTERM lets the in-flight generation finish (no half-state). ResetStaleRunning at startup unsticks rows left over from a previous crash. Eight unit tests cover the done / failed / missing-id / drain / NOTIFY-wake / shutdown / transient-error paths against a fake queue (no real Postgres in CI).
- cmd/imagen/worker.go: pgx-backed Queue (one dedicated conn for LISTEN + UPDATE), plus the workerPipeline that reuses buildBackend + attachUsageSink + prompt.Apply + buildWriter + maybeCloudSync. The per-job owner_user_id overrides the env-level fallback so each row in imagen.images is attributed correctly.
- maybeCloudSync now returns (*cloud.SyncResult, error) so the worker can link imagen.jobs.image_id to the inserted imagen.images row. The CLI generate path keeps printing its stderr summary unchanged.
- scripts/imagen-worker.service + .env.example for the systemd --user unit on mRiver. EnvironmentFile lives in ~/.dotfiles and is never committed.
- docs/setup-worker-mriver.md walks through installation + the spec's SQL-INSERT smoke; docs/architecture.md grows an "async write path" section.
- worker_integration_test.go (env-guarded by IMAGEN_WORKER_INTEGRATION=1) drives one real job through the full pipeline against msupabase using the mock backend, then verifies imagen.images + Storage object landed and the row flipped to done with image_id linked.

Verified end-to-end: pickup latency ~7ms, total 74ms, failure path captures error text.
ImaGen — Project Instructions
ImaGen is a model-agnostic image-generation framework. It has a single
opinionated CLI (imagen) that dispatches to whichever backend the user
configured — local FLUX on mRock via ComfyUI today, Replicate or DALL-E
tomorrow, something else next year. The framework owns plumbing (config,
output, naming, sidecars, prompt enrichment); each adapter owns the schema
and lifecycle of its own block in ~/.config/imagen.yaml.
Architecture
cmd/imagen/ CLI shell — generate, worker, backends, config, serve
internal/backend/ Backend interface + Registry + Mock reference impl
internal/prompt/ Style preset registry (embedded styles.yaml)
internal/output/ Filename templating, image writer, JSON sidecar
internal/config/ YAML loader, validation, sample generator
internal/cloud/ Supabase Storage + imagen.images writer
internal/usage/ mai.imagen_usage cost-tracking sink
internal/worker/ imagen.jobs queue consumer (DB-agnostic via Queue interface)
internal/server/ HTTP stub (not implemented yet — follow-up issue)
scripts/ imagen-worker.service + env template, ComfyUI scripts
docs/ architecture.md, usage.md, setup-worker-mriver.md
Data flow for imagen generate:
- Parse flags, load config (internal/config).
- Resolve the requested instance name to a config block, then the block's type to a registered constructor in backend.Default.
- Apply style preset (internal/prompt) to the prompt.
- Call backend.Generate(ctx, Request). The adapter returns a *Result with an image stream + metadata.
- Stream to disk via internal/output. If write_metadata_json is on, a sidecar <image>.json is written next to it.
Backend contract
type Backend interface {
    Name() string
    Generate(ctx context.Context, req Request) (*Result, error)
}
Request carries the cross-backend fields (prompt, negative, size, steps,
seed, style preset, free-form BackendOpts). Result returns the image
bytes via an io.ReadCloser, the MIME type, and a metadata map (model name,
seed actually used, latency, cost-estimate, …).
Adding a new adapter
- Create internal/backend/<adapter>.go (e.g. comfyui.go). Define a struct that holds whatever the adapter needs (HTTP client, model id, token).
- Add a constructor func New<Adapter>(name string, cfg map[string]any) (Backend, error). Read fields from cfg; that map is the adapter's own block from imagen.yaml minus the type: key. Resolve secrets from env vars (api_token_env, api_key_env); never accept tokens inline.
- Implement Name() (return the user-facing instance name) and Generate(ctx, Request).
- In init() call Register("<type-name>", New<Adapter>).
- Blank-import the package from cmd/imagen/main.go if it lives in a separate package, so the init() runs.
- Add a smoke test under internal/backend/<adapter>_test.go. Network tests should be guarded by testing.Short() or an env var.
Config
~/.config/imagen.yaml (override with --config). Top-level keys:
- default_backend: instance name used when --backend is omitted.
- output.directory / output.naming / output.write_metadata_json.
- backends: map of instance-name → {type, …adapter-specific…}.
The framework parses type and stuffs the rest into BackendSpec.Raw. The
adapter is free to define any schema it likes inside its block.
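
As an illustration of that split, the sketch below separates type from the rest of an already-decoded block. Only BackendSpec.Raw is named above; the Type field, the specFromBlock helper and its error messages are assumptions, and YAML decoding is left out so the example needs only the standard library.

```go
package main

import "fmt"

// BackendSpec: Raw is named in the text; the Type field is assumed.
type BackendSpec struct {
	Type string
	Raw  map[string]any
}

// specFromBlock is a hypothetical helper that pulls out "type" and copies
// every other key into Raw untouched, leaving validation to the adapter.
func specFromBlock(name string, block map[string]any) (BackendSpec, error) {
	t, ok := block["type"].(string)
	if !ok || t == "" {
		return BackendSpec{}, fmt.Errorf("backend %q: missing or non-string type", name)
	}
	raw := make(map[string]any, len(block))
	for k, v := range block {
		if k != "type" {
			raw[k] = v
		}
	}
	return BackendSpec{Type: t, Raw: raw}, nil
}

func main() {
	block := map[string]any{
		"type":          "replicate",
		"api_token_env": "REPLICATE_API_TOKEN",
	}
	spec, err := specFromBlock("flux-dev-replicate", block)
	if err != nil {
		panic(err)
	}
	fmt.Println(spec.Type, spec.Raw["api_token_env"])
}
```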
Credentials
Never hardcode. Always reference env-var names from the config:
flux-dev-replicate:
  type: replicate
  api_token_env: REPLICATE_API_TOKEN
The adapter then calls os.Getenv("REPLICATE_API_TOKEN") at construction and fails
fast if the variable is unset. Tokens never go through imagen.yaml in plaintext.
How the /imagine skill calls into imagen
The skill (issue #4) wraps imagen generate and post-processes the path it
prints on stdout. Slash-command surface area:
/imagine "a cat in a fishbowl" --style blog-header --size 1024x1024
The skill resolves to imagen generate "<prompt>" --backend <default> … and
returns the image path so otto can attach it to a chat reply.
References
- mAi project conventions: ~/.m/docs/msystem.md
- Backend follow-ups: ImaGen issues #2 (ComfyUI on mRock), #3 (Replicate), #4 (skill)
- mRock GPU: NVIDIA RTX 4070 Ti SUPER, 16 GB VRAM, runs Ollama + F5-TTS
House rules
- No technical debt. No TODOs in landed code. If something can't be done now, open an issue.
- All user-facing strings: ASCII or proper Unicode (Umlaute), never ae/oe/ue.
- Tests live next to the package they cover (*_test.go). No tests/ dir.
- go build ./... and go test ./... must be clean before any commit.
- Run task build (or make build) for the full build; both call into go build -o bin/imagen ./cmd/imagen.