# ImaGen: Project Instructions

ImaGen is a model-agnostic image-generation framework. It has a single opinionated CLI (`imagen`) that dispatches to whichever backend the user configured: local FLUX on mRock via ComfyUI today, Replicate or DALL-E tomorrow, something else next year. The framework owns plumbing (config, output, naming, sidecars, prompt enrichment); each adapter owns the schema and lifecycle of its own block in `~/.config/imagen.yaml`.

## Architecture

```
cmd/imagen/         CLI shell: generate, worker, backends, config, serve
internal/backend/   Backend interface + Registry + Mock reference impl
internal/prompt/    Style preset registry (embedded styles.yaml)
internal/output/    Filename templating, image writer, JSON sidecar
internal/config/    YAML loader, validation, sample generator
internal/cloud/     Supabase Storage + imagen.images writer
internal/usage/     mai.imagen_usage cost-tracking sink
internal/worker/    imagen.jobs queue consumer (DB-agnostic via Queue interface)
internal/server/    HTTP stub (not implemented yet; follow-up issue)
scripts/            imagen-worker.service + env template, ComfyUI scripts
docs/               architecture.md, usage.md, setup-worker-mriver.md
```

Data flow for `imagen generate`:

1. Parse flags, load config (`internal/config`).
2. Resolve the requested **instance name** to a config block, then the block's `type` to a registered constructor in `backend.Default`.
3. Apply style preset (`internal/prompt`) to the prompt.
4. Call `backend.Generate(ctx, Request)`. The adapter returns a `*Result` with an image stream + metadata.
5. Stream to disk via `internal/output`. If `write_metadata_json` is on, a sidecar `.json` is written next to it.

## Backend contract

```go
type Backend interface {
	Name() string
	Generate(ctx context.Context, req Request) (*Result, error)
}
```

`Request` carries the cross-backend fields (prompt, negative, size, steps, seed, style preset, free-form `BackendOpts`).
`Result` returns the image bytes via an `io.ReadCloser`, the MIME type, and a metadata map (model name, seed actually used, latency, cost-estimate, …).

## Adding a new adapter

1. Create `internal/backend/.go` (e.g. `comfyui.go`). Define a struct that holds whatever the adapter needs (HTTP client, model id, token).
2. Add a constructor `func New(name string, cfg map[string]any) (Backend, error)`. Read fields from `cfg`; that map is the adapter's own block from `imagen.yaml` minus the `type:` key. Resolve secrets from env vars (`api_token_env`, `api_key_env`); never accept tokens inline.
3. Implement `Name()` (return the user-facing instance name) and `Generate(ctx, Request)`.
4. In `init()` call `Register("", New)`.
5. Blank-import the package from `cmd/imagen/main.go` if it lives in a separate package, so the `init()` runs.
6. Add a smoke test under `internal/backend/_test.go`. Network tests should be guarded by `testing.Short()` or an env var.

## Config

`~/.config/imagen.yaml` (override with `--config`). Top-level keys:

- `default_backend`: instance name used when `--backend` is omitted.
- `output.directory` / `output.naming` / `output.write_metadata_json`.
- `backends:` maps instance-name → `{type, …adapter-specific…}`.

The framework parses `type` and stuffs the rest into `BackendSpec.Raw`. The adapter is free to define any schema it likes inside its block.

## Credentials

Never hardcode credentials. Always reference env-var names from the config:

```yaml
flux-dev-replicate:
  type: replicate
  api_token_env: REPLICATE_API_TOKEN
```

The adapter then calls `os.Getenv("REPLICATE_API_TOKEN")` at construction and fails fast if it is unset. Tokens never go through `imagen.yaml` in plaintext.

## How the `/imagine` skill calls into imagen

The skill (issue #4) wraps `imagen generate` and post-processes the path it prints on stdout.
Slash-command surface area:

```
/imagine "a cat in a fishbowl" --style blog-header --size 1024x1024
```

The skill resolves to `imagen generate "" --backend …` and returns the image path so otto can attach it to a chat reply.

## References

- mAi project conventions: `~/.m/docs/msystem.md`
- Backend follow-ups: ImaGen issues #2 (ComfyUI on mRock), #3 (Replicate), #4 (skill)
- mRock GPU: NVIDIA RTX 4070 Ti SUPER, 16 GB VRAM, runs Ollama + F5-TTS

## House rules

- No technical debt. No TODOs in landed code. If something can't be done now, open an issue.
- All user-facing strings: ASCII or proper Unicode (Umlaute), never `ae/oe/ue`.
- Tests live next to the package they cover (`*_test.go`). No `tests/` dir.
- `go build ./...` and `go test ./...` must be clean before any commit.
- Run `task build` (or `make build`) for the full build; both call into `go build -o bin/imagen ./cmd/imagen`.