mAi: #2 - phase 1 PoC: ComfyUI on mRock + first FLUX schnell image
Native systemd install (matches Ollama pattern on Arch — Docker on mRock has no nvidia runtime; native venv via uv is the lighter path). The Black-Forest-Labs FLUX.1-schnell HF repo is gated, so the download script points at ungated mirrors (Comfy-Org/flux1-schnell + sirorable/flux-ae-vae) that ship the same Apache-2.0 weights. First image — cat in a fishbowl, 1024x1024, 4 steps — generated end-to-end in 9.79s via curl + workflow JSON; stored at /home/m/dev/ImaGen/poc/first-image.png on mRiver (not committed; transient PoC artefact). Go adapter is phase 2.
1
.gitignore
vendored
@@ -7,3 +7,4 @@
 .env.local
 /imagen
 /coverage.txt
+/.m/
181
docs/setup-comfyui-mrock.md
Normal file
@@ -0,0 +1,181 @@
# ComfyUI on mRock: install + ops

ImaGen's `flux-schnell-local` backend talks to ComfyUI on mRock at
`http://mrock:8188` (Tailscale-internal). This document is the reproducible
install path from a clean mRock state.

mRock runs Arch Linux + systemd with an NVIDIA RTX 4070 Ti SUPER (16 GB
VRAM). Ollama is already a native systemd service, so ComfyUI follows the
same pattern (native Python venv + systemd unit) instead of Docker: Docker
on mRock has no `nvidia` runtime configured, and adding one is more invasive
than another systemd unit.

## Prerequisites on mRock

- Python via `uv` (already installed).
- NVIDIA driver new enough for CUDA 12.4.
  `nvidia-smi --query-gpu=driver_version --format=csv,noheader`
  should show >= 550. Driver 595 is what mRock has today.
- ~35 GB free on `/home` for the model files.
- `ollama.service` running on port 11434 (coexistence notes below).

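The driver and disk-space prerequisites can be checked mechanically. The following is a minimal sketch, not part of the repo: the helper names `driver_ok` and `enough_space` are hypothetical, and the thresholds (550 and 35 GB) are the ones from the list above.

```bash
#!/usr/bin/env bash
# Preflight sketch for the prerequisites above (helper names are hypothetical).

# driver_ok VERSION [MIN_MAJOR]: succeed if the driver's major version is at
# least MIN_MAJOR (default 550, the threshold from the prerequisites list).
driver_ok() {
    local version=$1 min=${2:-550}
    local major=${version%%.*}      # "595.58.03" -> "595"
    [ "$major" -ge "$min" ]
}

# enough_space DIR [MIN_GB]: succeed if DIR's filesystem has at least MIN_GB
# available (default 35). `df -Pk` reports available space in 1 KiB blocks.
enough_space() {
    local dir=$1 min_gb=${2:-35}
    local avail_kb
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Example (on mRock):
#   driver_ok "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader)" \
#       && enough_space /home && echo "prerequisites ok"
```

Both helpers only report success or failure through their exit status, so they can be chained with `&&` ahead of the install steps.
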
## 1. Clone ComfyUI + Python venv

```bash
mkdir -p ~/dev && cd ~/dev
git clone --depth 1 https://github.com/comfyanonymous/ComfyUI.git comfyui
cd comfyui
uv venv --python 3.12 .venv
source .venv/bin/activate        # fish: source .venv/bin/activate.fish

# PyTorch CUDA 12.4 wheels, matching the system driver
uv pip install --no-cache torch torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/cu124

uv pip install --no-cache -r requirements.txt
```

Verify CUDA is wired up:

```bash
.venv/bin/python -c \
    "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.get_device_name(0))"
# expected: 2.6.0+cu124 True NVIDIA GeForce RTX 4070 Ti SUPER
```

## 2. Models: FLUX.1 schnell

The Black-Forest-Labs primary repo (`black-forest-labs/FLUX.1-schnell`) is
**gated**: `curl` against it without an HF token returns HTTP 401. We pull
the weights from ungated mirrors of the same Apache-2.0 release.

| File | Where it goes | Source |
|------|---------------|--------|
| `flux1-schnell.safetensors` (~23.8 GB, fp16) | `models/unet/` | `Comfy-Org/flux1-schnell` |
| `ae.safetensors` (~335 MB) | `models/vae/` | `sirorable/flux-ae-vae` |
| `clip_l.safetensors` (~246 MB) | `models/clip/` | `comfyanonymous/flux_text_encoders` |
| `t5xxl_fp8_e4m3fn.safetensors` (~4.9 GB) | `models/clip/` | `comfyanonymous/flux_text_encoders` |

```bash
cd ~/dev/comfyui/models

curl -L -o unet/flux1-schnell.safetensors \
    https://huggingface.co/Comfy-Org/flux1-schnell/resolve/main/flux1-schnell.safetensors
curl -L -o vae/ae.safetensors \
    https://huggingface.co/sirorable/flux-ae-vae/resolve/main/ae.safetensors
curl -L -o clip/clip_l.safetensors \
    https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors
curl -L -o clip/t5xxl_fp8_e4m3fn.safetensors \
    https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors
```

If an HF token is configured later (`~/.cache/huggingface/token`), the
weights in the official `black-forest-labs/FLUX.1-schnell` repo are
byte-identical, so the official URL can be swapped in.

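An interrupted transfer, or an error response saved in place of the weights, leaves a file far smaller than the sizes listed in the table. A size check catches both cases before ComfyUI tries to load the file. This is a sketch only: `check_min_size` is a hypothetical helper, and the byte thresholds in the example are assumptions picked comfortably below the approximate sizes above.

```bash
#!/usr/bin/env bash
# Sanity-check downloaded model files by size (hypothetical helper).

# check_min_size FILE MIN_BYTES: succeed if FILE exists and is at least
# MIN_BYTES long; otherwise print the reason to stderr and fail.
check_min_size() {
    local file=$1 min=$2
    [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
    local size
    size=$(( $(wc -c < "$file") ))   # arithmetic strips any padding from wc
    if [ "$size" -lt "$min" ]; then
        echo "too small: $file ($size bytes, expected >= $min)" >&2
        return 1
    fi
}

# Example (assumed thresholds, in bytes):
#   cd ~/dev/comfyui/models
#   check_min_size unet/flux1-schnell.safetensors    20000000000
#   check_min_size vae/ae.safetensors                300000000
#   check_min_size clip/clip_l.safetensors           200000000
#   check_min_size clip/t5xxl_fp8_e4m3fn.safetensors 4000000000
```

A size check does not prove the content is correct; comparing checksums against the values published on the model pages would be the stricter test.
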
## 3. systemd unit

Drop `/etc/systemd/system/comfyui.service`:

```ini
[Unit]
Description=ComfyUI image generation server
Documentation=https://github.com/comfyanonymous/ComfyUI
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=m
Group=m
WorkingDirectory=/home/m/dev/comfyui
ExecStart=/home/m/dev/comfyui/.venv/bin/python /home/m/dev/comfyui/main.py \
    --listen 0.0.0.0 --port 8188 \
    --output-directory /home/m/dev/comfyui/output \
    --temp-directory /home/m/dev/comfyui/temp
Restart=on-failure
RestartSec=5
TimeoutStopSec=30
NoNewPrivileges=true
PrivateTmp=true
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
```

Then:

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now comfyui.service
systemctl status comfyui.service
```

The service binds `0.0.0.0:8188`. Tailscale's WireGuard fence is the only
auth, so do **not** expose port 8188 to the public internet.

## 4. Health check

```bash
curl -fsS --max-time 5 http://mrock:8188/system_stats | jq '.devices[0]'
# expected: name "cuda:0 NVIDIA GeForce RTX 4070 Ti SUPER ...", vram_total ~16 GB
```

`imagen backends` (from a host with the ImaGen CLI installed) should also
report `flux-schnell-local: ok`.

## 5. VRAM coexistence with Ollama

mRock has 16 GB VRAM total. Ollama parks ~8 GB resident for its current
model. FLUX schnell, loaded from the fp16 checkpoint and cast to fp8 via
`weight_dtype=fp8_e4m3fn` (the default the adapter requests), needs roughly
10–12 GB peak for a 1024×1024 generation. Together that is roughly 18–20 GB
against 16 GB available, so concurrent Ollama + FLUX on mRock will OOM.

Two practical options:

- **Stop Ollama before generating**: `sudo systemctl stop ollama` frees
  the GPU, run the generation, `sudo systemctl start ollama` afterwards.
  Adequate while we don't have many concurrent users.
- **Move Ollama off mRock**: when ImaGen is in regular use, push Ollama to
  another host so the GPU is dedicated. Tracked separately.

Both decisions live with whoever operates the box; the adapter does not try
to manage Ollama.

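For the first option, an operator could wrap the stop/generate/start sequence so that Ollama is restarted even when the generation fails. This is a rough sketch under stated assumptions: the helper name `with_ollama_stopped` and the `SYSTEMCTL` override variable are hypothetical and exist only for illustration and dry runs.

```bash
#!/usr/bin/env bash
# Sketch of the "stop Ollama before generating" option (hypothetical helper).
# SYSTEMCTL can be overridden, e.g. SYSTEMCTL=echo for a dry run.
SYSTEMCTL=${SYSTEMCTL:-sudo systemctl}

# with_ollama_stopped CMD [ARGS...]: stop ollama, run CMD, then start ollama
# again even if CMD failed. Returns CMD's exit status.
with_ollama_stopped() {
    # $SYSTEMCTL is left unquoted on purpose so "sudo systemctl" splits
    # into two words.
    $SYSTEMCTL stop ollama || return 1
    local rc=0
    "$@" || rc=$?
    $SYSTEMCTL start ollama
    return "$rc"
}

# Example: dry run that only prints the systemctl calls around a command
#   SYSTEMCTL=echo
#   with_ollama_stopped echo "generate here"
```

The wrapper does not handle interruption by a signal; if the generation is killed with Ctrl-C, Ollama would still need to be started by hand.
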
## 6. Smoke test (direct, without the imagen CLI)

```bash
# 1) Submit a workflow (run from the repo root)
curl -fsS --max-time 30 -X POST -H 'Content-Type: application/json' \
    -d @scripts/flux-schnell-poc.json \
    http://mrock:8188/prompt
# returns: {"prompt_id": "...", "number": ..., "node_errors": {}}

# 2) Poll history until the prompt completes
PID=... # from above
until curl -fsS "http://mrock:8188/history/$PID" \
    | jq -e --arg id "$PID" '.[$id].status.completed == true' >/dev/null; do
    sleep 1
done

# 3) Pull the image (node "9" is the SaveImage node in the workflow)
NAME=$(curl -fsS "http://mrock:8188/history/$PID" \
    | jq -r --arg id "$PID" '.[$id].outputs["9"].images[0].filename')
curl -fsS "http://mrock:8188/view?filename=$NAME&type=output" -o /tmp/cat.png
file /tmp/cat.png # PNG image data, 1024 x 1024
```

The full ImaGen smoke test will live in [usage.md](usage.md) once the Go
adapter ships.

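The polling loop in step 2 never gives up, so a prompt that fails on the server would leave it spinning. One possible way to bound it is a small retry helper with a deadline. This is a sketch: `wait_until` and `prompt_done` are hypothetical names, and the 120-second budget in the example is an assumption, not a measured value.

```bash
#!/usr/bin/env bash
# Bounded polling sketch (helper names are hypothetical).

# wait_until TIMEOUT_SECONDS CMD [ARGS...]: run CMD about once per second
# until it succeeds; fail with status 1 once TIMEOUT_SECONDS have elapsed.
wait_until() {
    local timeout=$1; shift
    local deadline=$((SECONDS + timeout))
    until "$@"; do
        [ "$SECONDS" -lt "$deadline" ] || return 1
        sleep 1
    done
}

# prompt_done PROMPT_ID: the same completion test as step 2 of the smoke test.
prompt_done() {
    curl -fsS "http://mrock:8188/history/$1" \
        | jq -e --arg id "$1" '.[$id].status.completed == true' >/dev/null
}

# Example:
#   wait_until 120 prompt_done "$PID" || echo "timed out waiting for $PID" >&2
```

`SECONDS` is bash's built-in elapsed-time counter, so the helper needs no external `date` calls.
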
## Troubleshooting

- **`vram_free` < 6 GB in `/system_stats`**: another GPU process is holding
  memory. Usually Ollama (`sudo systemctl stop ollama`).
- **Workflow returns `node_errors` with `Required input is missing` for
  `DualCLIPLoader`**: text encoder filenames don't match step 2. Check that
  `clip_l.safetensors` and `t5xxl_fp8_e4m3fn.safetensors` are in
  `models/clip/`, not `models/text_encoders/`.
- **`Access to model … is restricted`** during a model pull: the download is
  hitting the gated upstream repo. Use the ungated URLs from step 2.
- **Service won't start**: check `journalctl -u comfyui --since '5 min ago'`.
  Common cause is a stale `pip` install; re-run step 1.
24
scripts/comfyui.service
Normal file
@@ -0,0 +1,24 @@
[Unit]
Description=ComfyUI image generation server
Documentation=https://github.com/comfyanonymous/ComfyUI
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=m
Group=m
WorkingDirectory=/home/m/dev/comfyui
ExecStart=/home/m/dev/comfyui/.venv/bin/python /home/m/dev/comfyui/main.py \
    --listen 0.0.0.0 --port 8188 \
    --output-directory /home/m/dev/comfyui/output \
    --temp-directory /home/m/dev/comfyui/temp
Restart=on-failure
RestartSec=5
TimeoutStopSec=30
NoNewPrivileges=true
PrivateTmp=true
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
37
scripts/download-flux-schnell.sh
Executable file
@@ -0,0 +1,37 @@
#!/bin/bash
# Download FLUX.1 schnell + accompanying VAE/text encoders into a ComfyUI tree.
# Uses ungated mirrors: the official Black-Forest-Labs repo is gated and
# requires an HF token. See docs/setup-comfyui-mrock.md.

set -euo pipefail

ROOT="${1:-$HOME/dev/comfyui/models}"

if [ ! -d "$ROOT" ]; then
    echo "models root $ROOT does not exist; pass it as the first argument" >&2
    exit 1
fi

mkdir -p "$ROOT/unet" "$ROOT/vae" "$ROOT/clip"

CKPT="https://huggingface.co/Comfy-Org/flux1-schnell/resolve/main/flux1-schnell.safetensors"
VAE="https://huggingface.co/sirorable/flux-ae-vae/resolve/main/ae.safetensors"
CLIP_L="https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors"
T5="https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors"

dl() {
    local url=$1 dest=$2
    if [ -s "$dest" ]; then
        echo "skip $dest (already present)"
        return
    fi
    echo "downloading $url -> $dest"
    # Write to a .part file and rename on success, so an interrupted download
    # is resumed by -C - on the next run instead of being skipped as present.
    curl -L --fail --retry 3 --retry-delay 5 -C - -o "$dest.part" "$url"
    mv "$dest.part" "$dest"
}

dl "$CKPT" "$ROOT/unet/flux1-schnell.safetensors"
dl "$VAE" "$ROOT/vae/ae.safetensors"
dl "$CLIP_L" "$ROOT/clip/clip_l.safetensors"
dl "$T5" "$ROOT/clip/t5xxl_fp8_e4m3fn.safetensors"

echo "done"
87
scripts/flux-schnell-poc.json
Normal file
@@ -0,0 +1,87 @@
{
  "prompt": {
    "6": {
      "class_type": "CLIPTextEncode",
      "inputs": {
        "text": "a small fishbowl with a cat staring out, photo, soft light",
        "clip": ["11", 0]
      }
    },
    "8": {
      "class_type": "VAEDecode",
      "inputs": {
        "samples": ["31", 0],
        "vae": ["10", 0]
      }
    },
    "9": {
      "class_type": "SaveImage",
      "inputs": {
        "filename_prefix": "imagen-poc",
        "images": ["8", 0]
      }
    },
    "10": {
      "class_type": "VAELoader",
      "inputs": {
        "vae_name": "ae.safetensors"
      }
    },
    "11": {
      "class_type": "DualCLIPLoader",
      "inputs": {
        "clip_name1": "t5xxl_fp8_e4m3fn.safetensors",
        "clip_name2": "clip_l.safetensors",
        "type": "flux"
      }
    },
    "12": {
      "class_type": "UNETLoader",
      "inputs": {
        "unet_name": "flux1-schnell.safetensors",
        "weight_dtype": "fp8_e4m3fn"
      }
    },
    "13": {
      "class_type": "CLIPTextEncode",
      "inputs": {
        "text": "",
        "clip": ["11", 0]
      }
    },
    "27": {
      "class_type": "EmptySD3LatentImage",
      "inputs": {
        "width": 1024,
        "height": 1024,
        "batch_size": 1
      }
    },
    "30": {
      "class_type": "ModelSamplingFlux",
      "inputs": {
        "model": ["12", 0],
        "max_shift": 1.15,
        "base_shift": 0.5,
        "width": 1024,
        "height": 1024
      }
    },
    "31": {
      "class_type": "KSampler",
      "inputs": {
        "model": ["30", 0],
        "seed": 1234567,
        "steps": 4,
        "cfg": 1.0,
        "sampler_name": "euler",
        "scheduler": "simple",
        "denoise": 1.0,
        "positive": ["6", 0],
        "negative": ["13", 0],
        "latent_image": ["27", 0]
      }
    }
  },
  "client_id": "imagen-poc-001"
}