Native systemd install (matches Ollama pattern on Arch — Docker on mRock has no nvidia runtime; native venv via uv is the lighter path). The Black-Forest-Labs FLUX.1-schnell HF repo is gated, so the download script points at ungated mirrors (Comfy-Org/flux1-schnell + sirorable/flux-ae-vae) that ship the same Apache-2.0 weights. First image — cat in a fishbowl, 1024x1024, 4 steps — generated end-to-end in 9.79s via curl + workflow JSON; stored at /home/m/dev/ImaGen/poc/first-image.png on mRiver (not committed; transient PoC artefact). Go adapter is phase 2.
ComfyUI on mRock — install + ops
ImaGen's flux-schnell-local backend talks to ComfyUI on mRock at
http://mrock:8188 (Tailscale-internal). This document is the reproducible
install path from a clean mRock state.
mRock runs Arch Linux + systemd with an NVIDIA RTX 4070 Ti SUPER (16 GB
VRAM). Ollama is already a native systemd service, so ComfyUI follows the
same pattern (native Python venv + systemd unit) instead of Docker — Docker
on mRock has no nvidia runtime configured, and adding one is more invasive
than another systemd unit.
Prerequisites on mRock
- Python via `uv` (already installed).
- NVIDIA driver new enough for CUDA 12.4. `nvidia-smi --query-gpu=driver_version --format=csv,noheader` should show >= 550. Driver 595 is what mRock has today.
- ~35 GB free on `/home` for the model files.
- `ollama.service` running on port 11434 (coexistence notes below).
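The driver and disk prerequisites above can be checked mechanically. The following is a minimal sketch, not part of ImaGen: the function names `driver_ok` and `disk_ok` are hypothetical, the thresholds are copied from the list above, and "35 GB" is assumed to mean decimal gigabytes.

```shell
#!/bin/sh
# Preflight sketch for the prerequisites list. Thresholds mirror the text;
# function names are made up for illustration.
MIN_DRIVER_MAJOR=550
MIN_FREE_GB=35

# driver_ok VERSION: succeeds if the major component (text before the
# first dot) is a number >= MIN_DRIVER_MAJOR.
driver_ok() {
  local major="${1%%.*}"
  case "$major" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: treat as failure
  esac
  [ "$major" -ge "$MIN_DRIVER_MAJOR" ]
}

# disk_ok AVAILABLE_KIB: succeeds if the available space (in 1 KiB blocks,
# as printed by `df -k`) covers MIN_FREE_GB decimal gigabytes.
disk_ok() {
  [ "$(( $1 * 1024 ))" -ge "$(( MIN_FREE_GB * 1000000000 ))" ]
}

# Possible usage on mRock (GNU df assumed for --output):
#   driver_ok "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)" \
#     || echo "NVIDIA driver older than $MIN_DRIVER_MAJOR" >&2
#   disk_ok "$(df -k --output=avail /home | tail -n1)" \
#     || echo "less than $MIN_FREE_GB GB free on /home" >&2
```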
1. Clone ComfyUI + Python venv
mkdir -p ~/dev && cd ~/dev
git clone --depth 1 https://github.com/comfyanonymous/ComfyUI.git comfyui
cd comfyui
uv venv --python 3.12 .venv
source .venv/bin/activate.fish
# PyTorch CUDA 12.4 wheels — match the system driver
uv pip install --no-cache torch torchvision torchaudio \
--index-url https://download.pytorch.org/whl/cu124
uv pip install --no-cache -r requirements.txt
Verify CUDA is wired up:
.venv/bin/python -c \
"import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.get_device_name(0))"
# expected: 2.6.0+cu124 True NVIDIA GeForce RTX 4070 Ti SUPER
2. Models — FLUX.1 schnell
The Black-Forest-Labs primary repo (black-forest-labs/FLUX.1-schnell) is
gated — curl against it without an HF token returns HTTP 401. We pull
the weights from ungated mirrors of the same Apache-2.0 release.
| File | Where it goes | Source |
|---|---|---|
| `flux1-schnell.safetensors` (~23.8 GB, fp16) | `models/unet/` | Comfy-Org/flux1-schnell |
| `ae.safetensors` (~335 MB) | `models/vae/` | sirorable/flux-ae-vae |
| `clip_l.safetensors` (~246 MB) | `models/clip/` | comfyanonymous/flux_text_encoders |
| `t5xxl_fp8_e4m3fn.safetensors` (~4.9 GB) | `models/clip/` | comfyanonymous/flux_text_encoders |
cd ~/dev/comfyui/models
curl -L -o unet/flux1-schnell.safetensors \
https://huggingface.co/Comfy-Org/flux1-schnell/resolve/main/flux1-schnell.safetensors
curl -L -o vae/ae.safetensors \
https://huggingface.co/sirorable/flux-ae-vae/resolve/main/ae.safetensors
curl -L -o clip/clip_l.safetensors \
https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors
curl -L -o clip/t5xxl_fp8_e4m3fn.safetensors \
https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors
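If an HF token is configured later (`~/.cache/huggingface/token`), the
official black-forest-labs/FLUX.1-schnell URL can be swapped in; the file it
serves is byte-identical to the mirror's. curl does not read that token file
on its own, so pass the token as an `Authorization: Bearer` header.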
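Because the curl commands in this step do not pass `--fail`, an HTTP error (for example the 401 from a gated repo) is saved as a small text file under the model's name. A minimal sketch of a size check that catches this follows; `check_model` is a hypothetical helper, and the floors in the usage comments are loose assumptions set well below the sizes in the table.

```shell
#!/bin/sh
# check_model FILE MIN_BYTES: succeeds if FILE exists and is at least
# MIN_BYTES long; otherwise prints the reason to stderr and fails.
check_model() {
  local file="$1" min_bytes="$2" size
  if [ ! -f "$file" ]; then
    echo "missing: $file" >&2
    return 1
  fi
  # Arithmetic expansion strips the padding some wc implementations emit.
  size=$(( $(wc -c < "$file") ))
  if [ "$size" -lt "$min_bytes" ]; then
    echo "too small ($size bytes): $file" >&2
    return 1
  fi
  return 0
}

# Possible usage from ~/dev/comfyui/models (floors are assumptions):
#   check_model unet/flux1-schnell.safetensors   20000000000
#   check_model vae/ae.safetensors                 300000000
#   check_model clip/clip_l.safetensors            200000000
#   check_model clip/t5xxl_fp8_e4m3fn.safetensors 4000000000
```

A file that fails the check can be inspected with `head -c 200` to see the error body the server returned.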
3. systemd unit
Drop /etc/systemd/system/comfyui.service:
[Unit]
Description=ComfyUI image generation server
Documentation=https://github.com/comfyanonymous/ComfyUI
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=m
Group=m
WorkingDirectory=/home/m/dev/comfyui
ExecStart=/home/m/dev/comfyui/.venv/bin/python /home/m/dev/comfyui/main.py \
--listen 0.0.0.0 --port 8188 \
--output-directory /home/m/dev/comfyui/output \
--temp-directory /home/m/dev/comfyui/temp
Restart=on-failure
RestartSec=5
TimeoutStopSec=30
NoNewPrivileges=true
PrivateTmp=true
LimitNOFILE=65535
[Install]
WantedBy=multi-user.target
Then:
sudo systemctl daemon-reload
sudo systemctl enable --now comfyui.service
systemctl status comfyui.service
The service binds 0.0.0.0:8188, i.e. every interface on mRock, and ComfyUI
has no authentication of its own. Tailscale's WireGuard perimeter is the only
access control, so do not expose port 8188 to the public internet.
4. Health check
curl -fsS --max-time 5 http://mrock:8188/system_stats | jq '.devices[0]'
# expected: name "cuda:0 NVIDIA GeForce RTX 4070 Ti SUPER ...", vram_total ~16 GB
`imagen backends` (from a host with the ImaGen CLI installed) should also
report `flux-schnell-local: ok`.
5. VRAM coexistence with Ollama
mRock has 16 GB VRAM total. Ollama parks ~8 GB resident for its current
model. FLUX schnell, loaded from the fp16 weights with
weight_dtype=fp8_e4m3fn (the default the adapter requests), needs roughly
10–12 GB peak for a 1024×1024 generation. Together that is ~18–20 GB against
16 GB available, so concurrent Ollama + FLUX on mRock will OOM.
Two practical options:
- Stop Ollama before generating: `sudo systemctl stop ollama` frees the GPU, run the generation, then `sudo systemctl start ollama` afterwards. Adequate while we don't have many concurrent users.
- Move Ollama off mRock: when ImaGen is in regular use, push Ollama to another host so the GPU is dedicated. Tracked separately.

Both decisions live with whoever operates the box; the adapter does not try to manage Ollama.
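For the stop-and-restart option, an operator could wrap the generation so Ollama comes back even when the generation command fails. This is a minimal sketch under stated assumptions: `with_ollama_stopped` and the `SYSTEMCTL` override are hypothetical names, the unit is assumed to be called `ollama` as in the commands above, and an interrupt such as Ctrl-C is not handled.

```shell
#!/bin/sh
# Command used to control services; overridable for dry runs.
SYSTEMCTL="${SYSTEMCTL:-sudo systemctl}"

# with_ollama_stopped CMD [ARGS...]
# Stops Ollama, runs CMD, restarts Ollama, and returns CMD's exit status.
with_ollama_stopped() {
  local rc=0
  # Intentionally unquoted so "sudo systemctl" splits into two words.
  $SYSTEMCTL stop ollama || return 1
  "$@" || rc=$?
  $SYSTEMCTL start ollama || echo "warning: could not restart ollama" >&2
  return "$rc"
}

# Possible usage, reusing the submit call from the smoke test:
#   with_ollama_stopped curl -fsS --max-time 30 -X POST \
#     -H 'Content-Type: application/json' \
#     -d @flux-schnell-workflow.json http://mrock:8188/prompt
```

Note that `/prompt` only queues the workflow, so in practice the wrapped command would need to cover the wait for completion as well; otherwise Ollama is restarted while FLUX is still generating.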
6. Smoke test (direct, without the imagen CLI)
# 1) Submit a workflow
curl -fsS --max-time 30 -X POST -H 'Content-Type: application/json' \
-d @flux-schnell-workflow.json \
http://mrock:8188/prompt
# returns: {"prompt_id": "...", "number": ..., "node_errors": {}}
# 2) Poll history until the prompt completes
PID=... # from above
until curl -fsS http://mrock:8188/history/$PID | jq -e ".\"$PID\".status.completed == true" >/dev/null; do
sleep 1
done
# 3) Pull the image
NAME=$(curl -fsS http://mrock:8188/history/$PID \
| jq -r ".\"$PID\".outputs[\"9\"].images[0].filename")
curl -fsS "http://mrock:8188/view?filename=$NAME&type=output" -o /tmp/cat.png
file /tmp/cat.png # PNG image data, 1024 x 1024
The full ImaGen smoke test is in usage.md once the Go adapter ships.
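The polling loop in step 2 of the smoke test runs forever if the prompt never completes, for example when the workflow fails validation. A bounded variant might look like the sketch below. `wait_until` and `prompt_done` are hypothetical helpers, the 120-second budget is an arbitrary assumption, and the jq filter is the one already used above.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS CMD [ARGS...]
# Retries CMD once per second; succeeds as soon as CMD succeeds and
# fails once TIMEOUT_SECONDS retries have been used up.
wait_until() {
  local timeout="$1" waited=0
  shift
  until "$@"; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# prompt_done PROMPT_ID: succeeds once ComfyUI's history marks the
# prompt as completed (same check as the smoke test's loop).
prompt_done() {
  curl -fsS "http://mrock:8188/history/$1" \
    | jq -e ".\"$1\".status.completed == true" >/dev/null
}

# Possible usage, with PID set as in the smoke test:
#   wait_until 120 prompt_done "$PID" || echo "prompt $PID did not finish" >&2
```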
Troubleshooting
- `vram_free` < 6 GB in `/system_stats`: another GPU process is holding memory. Usually Ollama (`sudo systemctl stop ollama`).
- Workflow returns `node_errors` with `Required input is missing` for CLIPLoader: text encoder filenames don't match step 2. Check that `clip_l.safetensors` and `t5xxl_fp8_e4m3fn.safetensors` are in `models/clip/`, not `models/text_encoders/`.
- `Access to model … is restricted` during a model pull: the script is hitting a gated repo. Use the ungated URLs from step 2.
- Service won't start: check `journalctl -u comfyui --since '5 min ago'`. Common cause is a stale `pip` install; re-run step 1.
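The first troubleshooting check can be scripted. The sketch below assumes that `/system_stats` reports `vram_free` in bytes under `.devices[0]` and that "6 GB" means decimal gigabytes; neither is confirmed by this document. `vram_free_ok` is a hypothetical helper.

```shell
#!/bin/sh
# Threshold from the troubleshooting list, taken as decimal gigabytes.
MIN_FREE_VRAM_BYTES=$((6 * 1000 * 1000 * 1000))

# vram_free_ok BYTES
# Returns 0 if BYTES >= threshold, 1 if below, 2 if BYTES is not a
# plain non-negative integer (e.g. "null" from a failed jq lookup).
vram_free_ok() {
  case "$1" in
    ''|*[!0-9]*) return 2 ;;
  esac
  [ "$1" -ge "$MIN_FREE_VRAM_BYTES" ]
}

# Possible usage (the field path is an assumption):
#   free=$(curl -fsS --max-time 5 http://mrock:8188/system_stats \
#            | jq -r '.devices[0].vram_free')
#   vram_free_ok "$free" || echo "low VRAM ($free bytes free); is Ollama running?" >&2
```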