ImaGen/CLAUDE.md
mAi 237270b204 mAi: #211 - bootstrap ImaGen framework skeleton
First step of the model-agnostic image-generation framework. Lands the
plumbing other components (skill, ComfyUI/Replicate adapters, agents)
will plug into:

- internal/backend: Backend interface (Request/Result), thread-safe
  Registry with init-time Register, plus a Mock reference adapter that
  emits a deterministic gradient PNG for smoke tests.
- internal/config: YAML loader for ~/.config/imagen.yaml. Framework owns
  default_backend + output settings + a per-backend block; each adapter
  owns the schema below its own block via BackendSpec.Raw.
- internal/output: filename templating ({date}/{time}/{slug}/{seed}/
  {backend}/{ext}), JSON metadata sidecar, --output override path.
- internal/prompt: embedded styles.yaml, style-preset suffix application.
- internal/server: 501 stub — HTTP surface lands in a follow-up issue.
- cmd/imagen: generate / backends / config (init|validate|path) / serve
  / version subcommands. Stdlib-only flag parsing with a small helper to
  honour positional prompt args ahead of flags (matches the issue spec).
- Tests for output (slug, naming template, sidecar), backend (mock PNG
  validity + determinism, registry build + duplicate panic), config
  (round-trip + validation), prompt (style apply + unknown-style error).
- CLAUDE.md, README.md, docs/architecture.md, docs/usage.md, Makefile.

Acceptance criteria from #211:
1. go build ./... — clean
2. imagen backends — lists registered backends, exits 0
3. imagen generate "test prompt" --backend mock --output /tmp/x.png —
   writes a 1024x1024 PNG plus an x.png.json sidecar
4. imagen config init | imagen config validate — round-trips cleanly
5. CLAUDE.md "Adding a new adapter" — six-step recipe
2026-05-08 14:37:05 +02:00

# ImaGen — Project Instructions
ImaGen is a model-agnostic image-generation framework. It has a single
opinionated CLI (`imagen`) that dispatches to whichever backend the user
configured — local FLUX on mRock via ComfyUI today, Replicate or DALL-E
tomorrow, something else next year. The framework owns plumbing (config,
output, naming, sidecars, prompt enrichment); each adapter owns the schema
and lifecycle of its own block in `~/.config/imagen.yaml`.
## Architecture
```
cmd/imagen/         CLI shell: generate, backends, config, serve
internal/backend/   Backend interface + Registry + Mock reference impl
internal/prompt/    Style preset registry (embedded styles.yaml)
internal/output/    Filename templating, image writer, JSON sidecar
internal/config/    YAML loader, validation, sample generator
internal/server/    HTTP stub (not implemented yet; follow-up issue)
docs/               architecture.md, usage.md
```
Data flow for `imagen generate`:
1. Parse flags, load config (`internal/config`).
2. Resolve the requested **instance name** to a config block, then the block's
`type` to a registered constructor in `backend.Default`.
3. Apply style preset (`internal/prompt`) to the prompt.
4. Call `backend.Generate(ctx, Request)`. The adapter returns a `*Result`
with an image stream + metadata.
5. Stream to disk via `internal/output`. If `write_metadata_json` is on, a
sidecar `<image>.json` is written next to it.
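
Steps 3 and 5 of the data flow can be sketched roughly as follows. This is a minimal illustration, not the code in `internal/prompt` or `internal/output`: the function names `applyStyle` and `writeOutput`, the `", "` separator, and the preset text are all assumptions made for the example. Only the suffix behaviour, the unknown-style error, and the `<image>.json` sidecar name come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// applyStyle appends the preset's suffix to the prompt (hypothetical helper).
// An empty style leaves the prompt untouched; an unknown style is an error.
func applyStyle(prompt, style string, presets map[string]string) (string, error) {
	if style == "" {
		return prompt, nil
	}
	suffix, ok := presets[style]
	if !ok {
		return "", fmt.Errorf("unknown style %q", style)
	}
	return prompt + ", " + suffix, nil // separator is an assumption
}

// writeOutput streams img to path and, when sidecar is true, writes the
// metadata as JSON to path+".json" (hypothetical helper).
func writeOutput(path string, img io.Reader, meta map[string]any, sidecar bool) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	if _, err := io.Copy(f, img); err != nil {
		f.Close()
		return err
	}
	if err := f.Close(); err != nil {
		return err
	}
	if !sidecar {
		return nil
	}
	data, err := json.MarshalIndent(meta, "", "  ")
	if err != nil {
		return err
	}
	return os.WriteFile(path+".json", data, 0o644)
}

func main() {
	// The preset text is made up for this sketch.
	presets := map[string]string{"blog-header": "wide composition, soft light"}
	prompt, err := applyStyle("a cat in a fishbowl", "blog-header", presets)
	if err != nil {
		panic(err)
	}
	out := filepath.Join(os.TempDir(), "imagen-sketch.png")
	// A real adapter would hand back PNG bytes; a string stands in here.
	img := strings.NewReader("not a real PNG")
	if err := writeOutput(out, img, map[string]any{"prompt": prompt}, true); err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Streaming with `io.Copy` keeps the image out of memory, which fits a `Result` that exposes the image as a reader rather than a byte slice.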
## Backend contract
```go
type Backend interface {
	Name() string
	Generate(ctx context.Context, req Request) (*Result, error)
}
```
`Request` carries the cross-backend fields (prompt, negative, size, steps,
seed, style preset, free-form `BackendOpts`). `Result` returns the image
bytes via an `io.ReadCloser`, the MIME type, and a metadata map (model name,
seed actually used, latency, cost-estimate, …).
## Adding a new adapter
1. Create `internal/backend/<adapter>.go` (e.g. `comfyui.go`). Define a struct
that holds whatever the adapter needs (HTTP client, model id, token).
2. Add a constructor `func New<Adapter>(name string, cfg map[string]any) (Backend, error)`.
Read fields from `cfg` — that map is the adapter's own block from
`imagen.yaml` minus the `type:` key. Resolve secrets from env vars
(`api_token_env`, `api_key_env`) — never accept tokens inline.
3. Implement `Name()` (return the user-facing instance name) and
`Generate(ctx, Request)`.
4. In `init()` call `Register("<type-name>", New<Adapter>)`.
5. Anonymous-import the package from `cmd/imagen/main.go` if it lives in a
separate package, so the `init()` runs.
6. Add a smoke test under `internal/backend/<adapter>_test.go`. Network tests
should be guarded by `testing.Short()` or an env var.
## Config
`~/.config/imagen.yaml` (override with `--config`). Top-level keys:
- `default_backend` — instance name used when `--backend` is omitted.
- `output.directory` / `output.naming` / `output.write_metadata_json`.
- `backends:` — map of instance-name → `{type, …adapter-specific…}`.
The framework parses `type` and stuffs the rest into `BackendSpec.Raw`. The
adapter is free to define any schema it likes inside its block.
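
The split between the framework-owned `type` key and the adapter-owned remainder might be implemented along these lines. This is a sketch that assumes the block has already been decoded into a `map[string]any`; the `Type` field name and the `splitSpec` helper are hypothetical, and only `BackendSpec.Raw` is named in this document.

```go
package main

import (
	"errors"
	"fmt"
)

// BackendSpec holds the parsed type plus everything else from the block.
// The Type field name is an assumption.
type BackendSpec struct {
	Type string
	Raw  map[string]any
}

// splitSpec pulls `type` out of a decoded backend block and copies the
// remaining keys into Raw, leaving the input map unmodified.
func splitSpec(block map[string]any) (BackendSpec, error) {
	typ, ok := block["type"].(string)
	if !ok || typ == "" {
		return BackendSpec{}, errors.New("backend block needs a non-empty string type")
	}
	raw := make(map[string]any, len(block))
	for k, v := range block {
		if k != "type" {
			raw[k] = v
		}
	}
	return BackendSpec{Type: typ, Raw: raw}, nil
}

func main() {
	spec, err := splitSpec(map[string]any{
		"type":          "replicate",
		"api_token_env": "REPLICATE_API_TOKEN",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(spec.Type, spec.Raw["api_token_env"])
}
```

Because the framework never inspects `Raw`, an adapter can add or rename keys in its block without touching `internal/config`.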
## Credentials
Never hardcode credentials. Always reference env-var names from the config:
```yaml
flux-dev-replicate:
  type: replicate
  api_token_env: REPLICATE_API_TOKEN
```
The adapter then calls `os.Getenv("REPLICATE_API_TOKEN")` at construction and
fails fast if the variable is unset. Tokens never appear in `imagen.yaml` in plaintext.
## How the `/imagine` skill calls into imagen
The skill (issue #4) wraps `imagen generate` and post-processes the path it
prints on stdout. Slash-command surface area:
```
/imagine "a cat in a fishbowl" --style blog-header --size 1024x1024
```
The skill resolves to `imagen generate "<prompt>" --backend <default> …` and
returns the image path so otto can attach it to a chat reply.
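
A wrapper in the spirit of the skill could look like the sketch below. It assumes that `imagen generate` prints only the image path on stdout and that `imagen` is on `PATH`; the `imagine` and `buildArgs` helpers are hypothetical, and the skill itself may not be written in Go. Only the positional prompt and the `--backend` flag are used, since those are the parts of the CLI described here.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
)

// buildArgs puts the positional prompt ahead of the flags.
func buildArgs(prompt, backend string) []string {
	return []string{"generate", prompt, "--backend", backend}
}

// imagine runs the CLI and returns the path it printed on stdout.
// Assumption: stdout carries just the path, possibly with a trailing newline.
func imagine(ctx context.Context, prompt, backend string) (string, error) {
	out, err := exec.CommandContext(ctx, "imagen", buildArgs(prompt, backend)...).Output()
	if err != nil {
		return "", fmt.Errorf("imagen generate: %w", err)
	}
	path := strings.TrimSpace(string(out))
	if path == "" {
		return "", fmt.Errorf("imagen generate printed no path")
	}
	return path, nil
}

func main() {
	path, err := imagine(context.Background(), "a cat in a fishbowl", "mock")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(path)
}
```

Passing the prompt as a separate argument to `exec.CommandContext` avoids shell quoting problems with prompts that contain spaces or quotes.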
## References
- mAi project conventions: `~/.m/docs/msystem.md`
- Backend follow-ups: ImaGen issues #2 (ComfyUI on mRock), #3 (Replicate), #4 (skill)
- mRock GPU: NVIDIA RTX 4070 Ti SUPER, 16 GB VRAM, runs Ollama + F5-TTS
## House rules
- No technical debt. No TODOs in landed code. If something can't be done now,
open an issue.
- All user-facing strings: ASCII or proper Unicode (real umlauts such as ä/ö/ü), never the `ae/oe/ue` transliterations.
- Tests live next to the package they cover (`*_test.go`). No `tests/` dir.
- `go build ./...` and `go test ./...` must be clean before any commit.
- Run `task build` (or `make build`) for the full build; both call into
`go build -o bin/imagen ./cmd/imagen`.