Async write path for the flexsiebels owner-mode UI: flexsiebels INSERTs into imagen.jobs, the worker on mRiver claims pending rows via LISTEN/NOTIFY + 5s safety poll, runs the same generate pipeline imagen generate uses, and writes the result through internal/cloud into imagen.images.

- Schema migration imagen_jobs_init: table + status CHECK + two indexes + owner-scoped RLS + grants + AFTER INSERT trigger publishing on the imagen_jobs channel via pg_notify.
- internal/worker: DB-agnostic loop over a Queue interface. Drains the whole pending backlog on each wake. Job-scoped contexts are derived from Background so SIGTERM lets the in-flight generation finish (no half-state). ResetStaleRunning at startup unsticks rows left over from a previous crash. Eight unit tests cover the done / failed / missing-id / drain / NOTIFY-wake / shutdown / transient-error paths against a fake queue (no real Postgres in CI).
- cmd/imagen/worker.go: pgx-backed Queue (one dedicated conn for LISTEN + UPDATE), plus the workerPipeline that reuses buildBackend + attachUsageSink + prompt.Apply + buildWriter + maybeCloudSync. The per-job owner_user_id overrides the env-level fallback so each row in imagen.images is attributed correctly.
- maybeCloudSync now returns (*cloud.SyncResult, error) so the worker can link imagen.jobs.image_id to the inserted imagen.images row. The CLI generate path keeps printing its stderr summary unchanged.
- scripts/imagen-worker.service + .env.example for the systemd --user unit on mRiver. EnvironmentFile lives in ~/.dotfiles and is never committed.
- docs/setup-worker-mriver.md walks through installation + the spec's SQL-INSERT smoke; docs/architecture.md grows an "async write path" section.
- worker_integration_test.go (env-guarded by IMAGEN_WORKER_INTEGRATION=1) drives one real job through the full pipeline against msupabase using the mock backend, then verifies imagen.images + Storage object landed and the row flipped to done with image_id linked.

Verified end-to-end: pickup latency ~7ms, total 74ms, failure path captures error text.
# ImaGen — Project Instructions

ImaGen is a model-agnostic image-generation framework. It has a single
opinionated CLI (`imagen`) that dispatches to whichever backend the user
configured — local FLUX on mRock via ComfyUI today, Replicate or DALL-E
tomorrow, something else next year. The framework owns plumbing (config,
output, naming, sidecars, prompt enrichment); each adapter owns the schema
and lifecycle of its own block in `~/.config/imagen.yaml`.

## Architecture

```
cmd/imagen/          CLI shell — generate, worker, backends, config, serve
internal/backend/    Backend interface + Registry + Mock reference impl
internal/prompt/     Style preset registry (embedded styles.yaml)
internal/output/     Filename templating, image writer, JSON sidecar
internal/config/     YAML loader, validation, sample generator
internal/cloud/      Supabase Storage + imagen.images writer
internal/usage/      mai.imagen_usage cost-tracking sink
internal/worker/     imagen.jobs queue consumer (DB-agnostic via Queue interface)
internal/server/     HTTP stub (not implemented yet — follow-up issue)
scripts/             imagen-worker.service + env template, ComfyUI scripts
docs/                architecture.md, usage.md, setup-worker-mriver.md
```

Data flow for `imagen generate`:

1. Parse flags, load config (`internal/config`).
2. Resolve the requested **instance name** to a config block, then the block's
   `type` to a registered constructor in `backend.Default`.
3. Apply style preset (`internal/prompt`) to the prompt.
4. Call `backend.Generate(ctx, Request)`. The adapter returns a `*Result`
   with an image stream + metadata.
5. Stream to disk via `internal/output`. If `write_metadata_json` is on, a
   sidecar `<image>.json` is written next to it.

## Backend contract

```go
type Backend interface {
	Name() string
	Generate(ctx context.Context, req Request) (*Result, error)
}
```

`Request` carries the cross-backend fields (prompt, negative, size, steps,
seed, style preset, free-form `BackendOpts`). `Result` returns the image
bytes via an `io.ReadCloser`, the MIME type, and a metadata map (model name,
seed actually used, latency, cost-estimate, …).

## Adding a new adapter

1. Create `internal/backend/<adapter>.go` (e.g. `comfyui.go`). Define a struct
   that holds whatever the adapter needs (HTTP client, model id, token).
2. Add a constructor `func New<Adapter>(name string, cfg map[string]any) (Backend, error)`.
   Read fields from `cfg` — that map is the adapter's own block from
   `imagen.yaml` minus the `type:` key. Resolve secrets from env vars
   (`api_token_env`, `api_key_env`) — never accept tokens inline.
3. Implement `Name()` (return the user-facing instance name) and
   `Generate(ctx, Request)`.
4. In `init()` call `Register("<type-name>", New<Adapter>)`.
5. Blank-import (`import _`) the package from `cmd/imagen/main.go` if it lives
   in a separate package, so that its `init()` runs.
6. Add a smoke test under `internal/backend/<adapter>_test.go`. Network tests
   should be guarded by `testing.Short()` or an env var.

## Config

`~/.config/imagen.yaml` (override with `--config`). Top-level keys:

- `default_backend` — instance name used when `--backend` is omitted.
- `output.directory` / `output.naming` / `output.write_metadata_json`.
- `backends:` — map of instance-name → `{type, …adapter-specific…}`.

The framework parses `type` and stuffs the rest into `BackendSpec.Raw`. The
adapter is free to define any schema it likes inside its block.

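The split between `type` and the adapter-owned remainder can be sketched as follows. `BackendSpec.Raw` is named above; the `Type` field and the `splitSpec` helper are assumptions for illustration, and the sketch starts from an already-decoded map rather than from YAML.

```go
package main

import "fmt"

// BackendSpec mirrors the idea described above; the Type field name is assumed.
type BackendSpec struct {
	Type string
	Raw  map[string]any
}

// splitSpec pulls "type" out of one backends entry and keeps every other key
// untouched in Raw, so the adapter can interpret its own schema.
func splitSpec(instance string, block map[string]any) (BackendSpec, error) {
	t, ok := block["type"].(string)
	if !ok || t == "" {
		return BackendSpec{}, fmt.Errorf("backend %q: missing or non-string \"type\"", instance)
	}
	raw := make(map[string]any, len(block))
	for k, v := range block {
		if k != "type" {
			raw[k] = v
		}
	}
	return BackendSpec{Type: t, Raw: raw}, nil
}

func main() {
	spec, err := splitSpec("flux-dev-replicate", map[string]any{
		"type":          "replicate",
		"api_token_env": "REPLICATE_API_TOKEN",
	})
	if err != nil {
		fmt.Println("invalid block:", err)
		return
	}
	fmt.Println(spec.Type, len(spec.Raw))
}
```

Copying into a fresh map, instead of deleting `type` in place, leaves the decoded config unchanged for any later validation pass.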
## Credentials

Never hardcode credentials. Always reference env-var names from the config:

```yaml
flux-dev-replicate:
  type: replicate
  api_token_env: REPLICATE_API_TOKEN
```

The adapter then calls `os.Getenv("REPLICATE_API_TOKEN")` at construction and
fails fast if it is unset. Tokens never go through `imagen.yaml` in plaintext.

## How the `/imagine` skill calls into imagen

The skill (issue #4) wraps `imagen generate` and post-processes the path it
prints on stdout. Slash-command surface area:

```
/imagine "a cat in a fishbowl" --style blog-header --size 1024x1024
```

The skill resolves to `imagen generate "<prompt>" --backend <default> …` and
returns the image path so otto can attach it to a chat reply.

## References

- mAi project conventions: `~/.m/docs/msystem.md`
- Backend follow-ups: ImaGen issues #2 (ComfyUI on mRock), #3 (Replicate), #4 (skill)
- mRock GPU: NVIDIA RTX 4070 Ti SUPER, 16 GB VRAM, runs Ollama + F5-TTS

## House rules

- No technical debt. No TODOs in landed code. If something can't be done now,
  open an issue.
- All user-facing strings: ASCII or proper Unicode (Umlaute), never `ae/oe/ue`.
- Tests live next to the package they cover (`*_test.go`). No `tests/` dir.
- `go build ./...` and `go test ./...` must be clean before any commit.
- Run `task build` (or `make build`) for the full build; both call into
  `go build -o bin/imagen ./cmd/imagen`.