Path 1 architecture: one comfyui adapter, workflows as data.
- workflow_template.go: embed.FS + token substitution with type-preserving
whole-value placeholders. ${prompt} → string, ${seed} → int64,
${cfg} → float64 — no JSON round-tripping. Partial matches ignored.
- comfyui.go: refactored to load workflow from embedded FS or filesystem
path. Back-compat preserved: workflow: defaults to flux1-schnell.
- workflows/{flux1-schnell,flux2-klein,sd35-medium}.json — bundled
templates. flux1-schnell migrated from hardcoded with identical node IDs.
- compare.go: new `imagen compare` subcommand. Sequential N-backend run
(one GPU on mRock — parallel would OOM), per-backend PNG, sidecar JSON
with per-model metadata + errors, composite contact sheet via Go image
package (no ImageMagick dep).
- Sample config gains flux2-klein-local + sd35-medium-local instances.
- docs/backends.md: architecture rationale + per-model HF download paths
+ how to add a new bundled workflow + compare-harness reference.
Live smoke verified: compare mock + flux-schnell-local at 768×768 →
both PNGs written, sidecar JSON has workflow="flux1-schnell" + full
metadata, contact sheet renders. Worker contract (Request → Generate)
unchanged, so flexsiebels /imagine UI API surface preserved.
Tests: 11 existing comfyui + 6 new workflow_template + 5 new compare
tests, all green.
Adding a new model is now yaml + JSON, never Go.
ImaGen backends
This document covers the local-ComfyUI backend plug-in story: how adapters are layered, how to add a new model without touching Go, and the per-model setup steps for the bundled templates.
For the host-side ComfyUI install (mRock — venv, weights for the default
FLUX.1-schnell, systemd, VRAM coexistence with Ollama, smoke test against
the raw HTTP API), see setup-comfyui-mrock.md.
Architecture: Path 1 — workflow-template adapter
imagen generate and imagen compare dispatch through the comfyui
adapter, which holds the HTTP plumbing (/prompt, /history/{id}, /view,
/system_stats) and treats the workflow itself as data. Each backend
instance in imagen.yaml picks a workflow JSON via the workflow: key.
Adding a new model is yaml + JSON, never Go:
internal/backend/
comfyui.go # one adapter, all ComfyUI models
workflow_template.go # loader + token-substitution
workflows/
flux1-schnell.json # bundled templates (embedded with //go:embed)
flux2-klein.json
sd35-medium.json
Why Path 1 over per-family adapters (comfyui-flux.go, comfyui-sd3.go…)
- Workflow JSON is the natural exchange format. ComfyUI users export workflows from its GUI as JSON. Anything else means rebuilding the graph by hand in Go for every new model.
- Adding a model is a config change, not a build change. With Path 2, every new family is a Go file, a new test file, a registry entry, a new worker binary, a redeploy. Path 1 lets us land a new model with one yaml block + one JSON file + one section in this doc.
- The HTTP plumbing is identical across families. /prompt, /history, /view, the retry policy, the "value not in list" hint, VRAM reporting: none of it depends on the workflow shape. Path 2 would duplicate that across files.
- Failure isolation stays clean. The workflow loader fails at adapter construction (imagen backends surfaces the error), the HTTP layer fails at Generate, and ComfyUI's own validation surfaces missing-model hints. Each layer's error message points at the right config knob.
Path 2's argument was "each family owns its quirks (samplers, schedulers,
dual-stage etc.)". That argument doesn't survive contact with the
substitution-map design: per-family knobs are just key/value fields in the
yaml block and ${shift}/${guidance}/${cfg} placeholders in the
template. No code duplication, no inheritance to debug.
Token substitution
workflow_template.SubstituteWorkflow walks the parsed JSON and replaces
every whole-value string of the form "${key}" with the typed value from
the substitution map. Numbers stay numbers, strings stay strings — no
round-tripping through strings.Replace.
The substitution map is built per call from:
- Request fields (always present):
  ${prompt}, ${negative}, ${width}, ${height}, ${seed}, ${steps}, ${sampler}, ${scheduler}, ${cfg}.
- Every scalar field from the yaml block (string / int / int64 / float64 / bool), minus framework keys (type, base_url, workflow, default_*). So ${vae}, ${clip}, ${clip_l}, ${clip_t5}, ${dtype}, ${shift}, ${guidance} all become substitutable just by being in yaml.
- Sensible defaults for the common optional knobs above, so a workflow that references ${dtype} without the user setting one in yaml still substitutes cleanly (fp8_e4m3fn for FLUX, 3.0 for SD3 shift, etc.). Extra defaults are ignored by workflows that don't reference them.
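The three layers above can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: buildSubstitutions and isFrameworkKey are hypothetical names, and the precedence (defaults, then yaml, then request fields) is an assumption. The text only implies that yaml values override defaults.

```go
package main

import (
	"fmt"
	"strings"
)

// isFrameworkKey reports whether a yaml key configures the adapter itself
// (type, base_url, workflow, default_*) rather than the workflow template.
func isFrameworkKey(k string) bool {
	switch k {
	case "type", "base_url", "workflow":
		return true
	}
	return strings.HasPrefix(k, "default_")
}

// buildSubstitutions layers defaults, yaml scalars, and request fields into
// one map. Later layers win; that ordering is an assumption of this sketch.
func buildSubstitutions(defaults, yamlBlock, request map[string]any) map[string]any {
	subs := make(map[string]any, len(defaults)+len(yamlBlock)+len(request))
	for k, v := range defaults {
		subs[k] = v
	}
	for k, v := range yamlBlock {
		if isFrameworkKey(k) {
			continue
		}
		// Only scalar yaml values become placeholders; lists and maps are skipped.
		switch v.(type) {
		case string, int, int64, float64, bool:
			subs[k] = v
		}
	}
	for k, v := range request {
		subs[k] = v
	}
	return subs
}

func main() {
	defaults := map[string]any{"dtype": "fp8_e4m3fn", "shift": 3.0}
	yamlBlock := map[string]any{
		"type":          "comfyui",
		"base_url":      "http://mrock:8188",
		"workflow":      "sd35-medium",
		"model":         "sd3.5_medium_incl_clips_t5xxlfp8scaled.safetensors",
		"default_steps": 28,
		"shift":         3.0,
	}
	request := map[string]any{"prompt": "a cat", "seed": int64(42), "steps": 28}

	subs := buildSubstitutions(defaults, yamlBlock, request)
	fmt.Println(len(subs), subs["model"], subs["dtype"])
}
```

Keeping the map as map[string]any is what lets the later substitution step hand typed values (int64, float64, bool) straight to the JSON encoder.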
Partial matches (e.g. "prefix ${prompt} suffix") are deliberately not
substituted — the placeholder must be the entire value so we can preserve
its JSON type. This prevents a prompt containing literal ${seed} text
from corrupting the workflow.
Unknown placeholders (referenced in JSON but missing from the substitution map) error out before the workflow leaves the binary.
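A minimal sketch of the whole-value substitution described in this section, assuming the workflow has been decoded into generic map[string]any / []any values. The function name substitute and the exact error text are illustrative, not the real SubstituteWorkflow signature.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// substitute walks a decoded JSON value and replaces every string that is
// exactly "${key}" with the typed value from subs. Strings that merely
// contain a placeholder are returned unchanged.
func substitute(node any, subs map[string]any) (any, error) {
	switch v := node.(type) {
	case map[string]any:
		out := make(map[string]any, len(v))
		for k, child := range v {
			r, err := substitute(child, subs)
			if err != nil {
				return nil, err
			}
			out[k] = r
		}
		return out, nil
	case []any:
		out := make([]any, len(v))
		for i, child := range v {
			r, err := substitute(child, subs)
			if err != nil {
				return nil, err
			}
			out[i] = r
		}
		return out, nil
	case string:
		if !strings.HasPrefix(v, "${") || !strings.HasSuffix(v, "}") {
			return v, nil
		}
		key := v[2 : len(v)-1]
		if key == "" || strings.ContainsAny(key, "${} ") {
			return v, nil // partial or multi-placeholder string: leave as-is
		}
		val, ok := subs[key]
		if !ok {
			return nil, fmt.Errorf("unknown placeholder %q", v)
		}
		return val, nil
	default:
		return v, nil // numbers, bools, null pass through untouched
	}
}

func main() {
	tmpl := `{"3": {"inputs": {"text": "${prompt}", "seed": "${seed}",
	          "cfg": "${cfg}", "note": "prefix ${prompt} suffix"}}}`
	var wf any
	if err := json.Unmarshal([]byte(tmpl), &wf); err != nil {
		panic(err)
	}
	out, err := substitute(wf, map[string]any{
		"prompt": "a cat", "seed": int64(1234567), "cfg": 4.5,
	})
	if err != nil {
		panic(err)
	}
	b, _ := json.Marshal(out)
	fmt.Println(string(b))
}
```

Because the int64 seed is placed into the tree as a Go integer rather than spliced into text, json.Marshal emits it as a JSON number, while the "note" field keeps its literal ${prompt} text.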
Back-compat
The workflow: field defaults to flux1-schnell if omitted. Existing
yaml blocks like the pre-#10 FLUX.1-schnell instance:
flux-schnell-local:
type: comfyui
base_url: http://mrock:8188
model: flux1-schnell.safetensors
still work unchanged — they implicitly pick up the migrated
flux1-schnell.json template, which keeps the same node IDs (6, 8, 9, 10,
11, 12, 13, 27, 30, 31) as the historical hardcoded workflow.
Bundled workflows
FLUX.1-schnell — the back-compat default
| Field | Default | Notes |
|---|---|---|
| model | flux1-schnell.safetensors | drop in models/unet/ |
| vae | ae.safetensors | models/vae/ |
| clip_l | clip_l.safetensors | models/clip/ |
| clip_t5 | t5xxl_fp8_e4m3fn.safetensors | models/clip/ |
| dtype | fp8_e4m3fn | weight dtype for the UNet loader |
| default_steps / default_cfg | 4 / 1.0 | schnell is distilled to ~4 steps |
VRAM peak ~10–12 GB at 1024×1024. Install path:
setup-comfyui-mrock.md. Already shipping.
FLUX.2 [klein] 4B — direct upgrade
Released by Black Forest Labs late 2025 / early 2026, BFL non-commercial license. The distilled 4B "klein" variant lands sub-second on the RTX 4070 Ti SUPER and shares the new Qwen-based text encoder + a re-trained VAE with the larger family.
flux2-klein-local:
type: comfyui
base_url: http://mrock:8188
workflow: flux2-klein
model: flux-2-klein-base-4b-fp8.safetensors # models/unet/
vae: flux2-vae.safetensors # models/vae/
clip: qwen_3_4b.safetensors # models/text_encoders/
dtype: fp8_e4m3fn
default_steps: 4
default_cfg: 1.0
guidance: 4.0
Model downloads (on mRock, ungated mirrors when available):
cd ~/dev/comfyui/models
curl -L -o unet/flux-2-klein-base-4b-fp8.safetensors \
https://huggingface.co/black-forest-labs/FLUX.2-klein/resolve/main/flux-2-klein-base-4b-fp8.safetensors
curl -L -o vae/flux2-vae.safetensors \
https://huggingface.co/black-forest-labs/FLUX.2-klein/resolve/main/flux2-vae.safetensors
mkdir -p text_encoders
curl -L -o text_encoders/qwen_3_4b.safetensors \
https://huggingface.co/black-forest-labs/FLUX.2-klein/resolve/main/qwen_3_4b.safetensors
BFL's primary repo is gated; if curl returns 401, configure an HF token
in ~/.cache/huggingface/token or use one of the community mirrors
(check the official model card for the current list). The filenames the
template references match BFL's canonical names — rename downloads to
match if a mirror uses different ones.
VRAM peak: ~8.5 GB (4B fp8). With Ollama parked at ~8 GB this still fits; unlike FLUX.1-schnell, klein doesn't require stopping Ollama on mRock.
SD3.5-medium — single-checkpoint variant
Stability AI's 2.5B mid-size model with bundled text encoders. The
incl_clips_t5xxlfp8scaled variant ships clip_g + clip_l + t5xxl_fp8 all
in one .safetensors, so the workflow uses CheckpointLoaderSimple
instead of separate UNet/VAE/CLIP loaders.
sd35-medium-local:
type: comfyui
base_url: http://mrock:8188
workflow: sd35-medium
model: sd3.5_medium_incl_clips_t5xxlfp8scaled.safetensors # models/checkpoints/
default_steps: 28
default_sampler: dpmpp_2m
default_scheduler: sgm_uniform
default_cfg: 4.5
shift: 3.0
Model download (on mRock):
cd ~/dev/comfyui/models
curl -L -o checkpoints/sd3.5_medium_incl_clips_t5xxlfp8scaled.safetensors \
https://huggingface.co/stabilityai/stable-diffusion-3.5-medium/resolve/main/sd3.5_medium_incl_clips_t5xxlfp8scaled.safetensors
VRAM peak: ~9.9 GB at 1024×1024. Same envelope as FLUX.1-schnell — stop Ollama before generating, restart after.
Adding a new bundled workflow
- Export from ComfyUI: load the model in the ComfyUI GUI, build a text-to-image workflow that produces what you want, then "Save (API Format)". The file you get is the right shape.
- Sprinkle placeholders: open the JSON and replace per-call values with ${name} tokens. Whole-value substitution only:

    "inputs": {
      "text": "${prompt}",        // was "a cat sitting on a chair"
      "seed": "${seed}",          // was 1234567
      "steps": "${steps}",        // was 28
      "cfg": "${cfg}",
      "sampler_name": "${sampler}",
      "scheduler": "${scheduler}",
      "width": "${width}",
      "height": "${height}"
    }

  Use ${model} for the checkpoint / unet filename and any per-template knobs (${vae}, ${shift}, ${guidance}, ${clip}…).
- Drop it into internal/backend/workflows/<name>.json. The //go:embed workflows/*.json directive in workflow_template.go picks it up at build time; no registry entry needed.
- Add a yaml instance in internal/config/config.go's Sample block for imagen config init (and ~/.config/imagen.yaml) so users discover the new backend.
- Document the model files + HF download URLs in this doc.
- Smoke test: imagen generate "test" --backend <new-instance> --size 1024x1024 should produce an image.
Per-call overrides go via --steps and --seed on the CLI; sampler,
scheduler, and cfg are overridden programmatically through
backend.Request.BackendOpts["sampler"] / ["scheduler"] / ["cfg"]. The
compare harness forwards the constant-across-backends knobs verbatim.
Loading a workflow from disk (one-off)
Pass an absolute filesystem path as workflow: and the adapter reads it
from disk instead of the embedded FS. Handy for prototyping a new model
before committing it:
my-experimental:
type: comfyui
base_url: http://mrock:8188
workflow: /home/m/dev/comfyui/workflows/my-test.json
model: my-test-model.safetensors
The fallback chain is: filesystem path (if the string looks like a path
or ends in .json), then bundled lookup by name, then bundled lookup
with .json appended.
imagen compare: cross-backend evaluation
imagen compare "a wizard casting a spell" \
--models flux-schnell-local,flux2-klein-local,sd35-medium-local \
--size 1024x1024 \
--output ~/Pictures/imagen/compare
Per run, compare:
- creates <output>/<YYYYMMDD-HHMMSS>-<prompt-slug>/
- dispatches each named backend sequentially (mRock has one GPU; parallel would OOM); one backend's failure doesn't abort the run
- writes per-backend PNGs as <prompt-slug>--<backend-slug>.png
- writes compare.json listing every attempt (success + failure) with per-model seed, latency_ms, model, vram_used_mib, full metadata map, and the error string for any failure
- composites a contact-sheet.png with the prompt as header and each cell labelled <backend>/<latency>ms · seed <n>
Flags mirror generate: --seed, --steps, --style, --negative,
--size are shared across all backends. --no-contact-sheet skips the
composite when only the per-image PNGs and sidecar matter (e.g. for a
worker script that builds its own diff view).
Diagnostics
imagen backends shows every instance with its registration state. For
local ComfyUI, the status is currently just registered (we don't probe
the upstream HTTP endpoint at startup — the boot-helper hint kicks in on
first generation if mRock is asleep).
Per-backend errors come in three kinds:
- Adapter construction failure (e.g. workflow JSON not found, missing required yaml field). Caught at buildBackend time: imagen: backend "<name>": <err>.
- HTTP / runtime failure during Generate. Wrapped with the boot helper for connection refused / no such host / timeouts, pointing at boot-whitetower mrock so a sleeping mRock has an obvious next step.
- ComfyUI workflow-validation failure (200-with-node_errors or 400). Surfaces with a model-not-found hint (matching value_not_in_list + unet_name / ckpt_name) when applicable, pointing back at this doc.
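As an illustration of the model-not-found matching in the third kind, the sketch below scans a validation response for value_not_in_list errors on unet_name or ckpt_name. The response shape (node_errors keyed by node ID, each with an errors array carrying type, details, and extra_info.input_name) is an assumption about ComfyUI's payload, and modelNotFoundHint is a hypothetical helper, not the adapter's real function.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// validationResponse models only the fields this sketch needs; the layout
// is an assumed shape of ComfyUI's validation payload.
type validationResponse struct {
	NodeErrors map[string]struct {
		Errors []struct {
			Type      string `json:"type"`
			Details   string `json:"details"`
			ExtraInfo struct {
				InputName string `json:"input_name"`
			} `json:"extra_info"`
		} `json:"errors"`
	} `json:"node_errors"`
}

// modelNotFoundHint returns a human-readable hint when the response reports
// a missing UNet or checkpoint file, and "" otherwise.
func modelNotFoundHint(body []byte) string {
	var resp validationResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "" // unparseable body: no hint, caller reports the raw error
	}
	for nodeID, node := range resp.NodeErrors {
		for _, e := range node.Errors {
			if e.Type != "value_not_in_list" {
				continue
			}
			switch e.ExtraInfo.InputName {
			case "unet_name", "ckpt_name":
				return fmt.Sprintf("node %s: %s; is the model file present on the ComfyUI host? See docs/backends.md",
					nodeID, e.Details)
			}
		}
	}
	return ""
}

func main() {
	body := []byte(`{"node_errors": {"12": {"errors": [{
		"type": "value_not_in_list",
		"details": "unet_name: 'flux1-schnell.safetensors' not in []",
		"extra_info": {"input_name": "unet_name"}}]}}}`)
	fmt.Println(modelNotFoundHint(body))
}
```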
Worker daemon notes
imagen worker (the imagen.jobs queue consumer) uses the same adapter + workflow lookup as the synchronous CLI: flexsiebels' /imagine UI INSERTs a backend = <instance> row, the worker claims it, and the underlying ComfyUI HTTP calls are identical to what generate makes. No worker-specific changes are required when a new backend lands; the config + workflow are the only state that has to be present on the worker host.
After merging a new template or yaml block:
# On the worker host (mRiver today):
systemctl --user restart imagen-worker
The daemon-rebuild trap from issue #9 still applies: if you build the
imagen binary on the dev machine and scp it over, restart the unit so
systemd picks up the new ELF.