First step of the model-agnostic image-generation framework. Lands the
plumbing other components (skill, ComfyUI/Replicate adapters, agents)
will plug into:
- internal/backend: Backend interface (Request/Result), thread-safe
Registry with init-time Register, plus a Mock reference adapter that
emits a deterministic gradient PNG for smoke tests.
- internal/config: YAML loader for ~/.config/imagen.yaml. Framework owns
default_backend + output settings + a per-backend block; each adapter
owns the schema below its own block via BackendSpec.Raw.
- internal/output: filename templating ({date}/{time}/{slug}/{seed}/
{backend}/{ext}), JSON metadata sidecar, --output override path.
- internal/prompt: embedded styles.yaml, style-preset suffix application.
- internal/server: 501 stub — HTTP surface lands in a follow-up issue.
- cmd/imagen: generate / backends / config (init|validate|path) / serve
/ version subcommands. Stdlib-only flag parsing with a small helper to
honour positional prompt args ahead of flags (matches the issue spec).
- Tests for output (slug, naming template, sidecar), backend (mock PNG
validity + determinism, registry build + duplicate panic), config
(round-trip + validation), prompt (style apply + unknown-style error).
- CLAUDE.md, README.md, docs/architecture.md, docs/usage.md, Makefile.
Acceptance criteria from #211:
1. go build ./... — clean
2. imagen backends — lists registered backends, exits 0
3. imagen generate "test prompt" --backend mock --output /tmp/x.png —
writes a 1024x1024 PNG plus an x.png.json sidecar
4. imagen config init | imagen config validate — round-trips cleanly
5. CLAUDE.md "Adding a new adapter" — six-step recipe
111 lines
3.9 KiB
Markdown
# ImaGen architecture

ImaGen is intentionally small. The framework owns plumbing; adapters own the
upstream API. Each adapter only ever sees its own slice of `imagen.yaml`.

## Layers

```
┌───────────────────────┐
│      cmd/imagen       │  CLI dispatch
│   (or HTTP server)    │
└──────────┬────────────┘
           │
┌──────────▼────────────┐
│   internal/prompt     │  style preset → prompt suffix
│   internal/output     │  filename templating, sidecar
│   internal/config     │  YAML loader, validation
└──────────┬────────────┘
           │
┌──────────▼────────────┐
│   internal/backend    │  Backend interface + Registry
└──────────┬────────────┘
           │
┌──────────▼────────────┐
│       adapters        │  ComfyUI · Replicate · OpenAI · …
│  (each one register-  │  each registers a `type` name on
│     s in init())      │  `backend.Default` at init time.
└───────────────────────┘
```

## The Backend contract

```go
type Request struct {
	Prompt         string
	NegativePrompt string
	Width, Height  int
	Steps          int
	Seed           int64
	Style          string
	BackendOpts    map[string]any
}

type Result struct {
	ImageReader io.ReadCloser
	MimeType    string
	Metadata    map[string]any
}

type Backend interface {
	Name() string
	Generate(ctx context.Context, req Request) (*Result, error)
}
```

Adapters translate `Request` into whatever the upstream expects. Fields they
can't honour (e.g. `NegativePrompt` on DALL-E) are silently ignored.

## Registry

`backend.Default` holds the process-wide name → constructor map. Each adapter
calls `backend.Register("<type>", NewX)` from its `init()`. The CLI imports
`internal/backend` (which transitively triggers the mock's init) and any
extra adapter packages.

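The registration pattern described above could look roughly like the following. This is a sketch under assumptions, not the code in `internal/backend`: the `Constructor` signature, the `Build` and `Types` method names, and the use of `any` as the return type are all invented here to keep the example short. The duplicate-registration panic follows the behaviour the tests are said to cover.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Constructor builds a backend from its adapter-specific config block.
// Assumption: the real constructor returns a Backend; `any` keeps this short.
type Constructor func(cfg map[string]any) (any, error)

// Registry is a mutex-guarded map from type name to constructor.
type Registry struct {
	mu    sync.RWMutex
	ctors map[string]Constructor
}

func NewRegistry() *Registry {
	return &Registry{ctors: make(map[string]Constructor)}
}

// Register panics on a duplicate name: it runs at init time, where a
// clash is a programming error rather than something to recover from.
func (r *Registry) Register(typ string, c Constructor) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, dup := r.ctors[typ]; dup {
		panic(fmt.Sprintf("backend: type %q registered twice", typ))
	}
	r.ctors[typ] = c
}

// Build looks up the constructor for typ and hands it the config block.
func (r *Registry) Build(typ string, cfg map[string]any) (any, error) {
	r.mu.RLock()
	c, ok := r.ctors[typ]
	r.mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("backend: unknown type %q", typ)
	}
	return c(cfg)
}

// Types returns the registered type names in sorted order.
func (r *Registry) Types() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.ctors))
	for n := range r.ctors {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// Default plays the role of the process-wide registry.
var Default = NewRegistry()

func init() {
	// An adapter package would do this from its own init().
	Default.Register("mock", func(cfg map[string]any) (any, error) {
		return "mock backend", nil
	})
}

func main() {
	fmt.Println(Default.Types())
	_, err := Default.Build("comfyui", nil)
	fmt.Println(err)
}
```

Because registration happens in `init()`, an adapter only becomes available once its package is imported somewhere in the binary, typically with a blank import in the CLI's main package.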
## Config flow

```
imagen.yaml
  backends:
    flux-schnell-local:
      type: comfyui                       ──┐
      base_url: http://mrock:8188           │ framework keeps `type`,
      model: flux1-schnell.safetensors      │ hands the rest to the
      default_steps: 4                      │ comfyui adapter as cfg map[string]any
                                          ──┘
```

The framework never inspects fields below `type`. That's the adapter's
contract with itself, expressed however the adapter wants (typed struct,
map lookups, JSON tags; its call).

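One way an adapter might turn its `cfg map[string]any` into a typed struct is a JSON round-trip using only the standard library. The sketch below assumes the field names from the example block above; `comfyConfig` and `decodeConfig` are hypothetical names, and treating `base_url` as required is an assumption made for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// comfyConfig is a hypothetical typed view of the adapter's own block.
type comfyConfig struct {
	BaseURL      string `json:"base_url"`
	Model        string `json:"model"`
	DefaultSteps int    `json:"default_steps"`
}

// decodeConfig re-encodes the generic map as JSON and decodes it into
// the typed struct, then checks the fields this adapter cannot do without.
func decodeConfig(cfg map[string]any) (comfyConfig, error) {
	var out comfyConfig
	raw, err := json.Marshal(cfg)
	if err != nil {
		return out, fmt.Errorf("comfyui: encode config: %w", err)
	}
	if err := json.Unmarshal(raw, &out); err != nil {
		return out, fmt.Errorf("comfyui: decode config: %w", err)
	}
	if out.BaseURL == "" {
		return out, errors.New("comfyui: base_url is required")
	}
	return out, nil
}

func main() {
	c, err := decodeConfig(map[string]any{
		"base_url":      "http://mrock:8188",
		"model":         "flux1-schnell.safetensors",
		"default_steps": 4,
	})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```

The round-trip costs an extra encode, but it gives type errors for free (a string where a number is expected fails in `Unmarshal`) and keeps the adapter's schema in one struct definition.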
## Output

```
output:
  directory: ~/Pictures/imagen
  naming: "{date}-{slug}-{seed}.png"
  write_metadata_json: true
```

Placeholders: `{date}`, `{time}`, `{slug}` (lowercased prompt, alnum-only,
truncated to 40 chars), `{seed}`, `{backend}`, `{ext}`. The sidecar JSON
contains the prompt, backend instance name, seed, ISO timestamp, and the
`Result.Metadata` map verbatim.

## Where adapters fail fast

- Missing required field in their config block: return an error from the
  constructor; the CLI surfaces it as `imagen: backend "X": <err>`.
- Unset env-var for credentials: same.
- Network errors during `Generate`: wrap and return; no retry policy yet
  (decide per-adapter, or move to a shared retry helper if a pattern emerges).

## Out of scope (today)

- Image post-processing (cropping, watermarking).
- Cost-tracking (lands with the Replicate adapter, since only API backends bill).
- Multi-image `n>1` per request: backends that support it can expose it via
  `BackendOpts`; the framework doesn't have a first-class field yet.