Broker reports per-consumer gpu_resident_mib=0 for externally-started consumers → eviction finds "no candidates" and never reclaims their VRAM #4
Symptom
A /v1/lease (kind=image) acquire fails with insufficient_vram even when an idle consumer is holding evictable VRAM. Broker log:
Meanwhile nvidia-smi shows whisper-server holding 1996 MiB, which is clearly evictable (comfyui can_coexist_with: []). But /v1/status reports it as not-resident:
Every consumer reads gpu_resident_mib=0, so eviction-candidate selection (which picks consumers with resident VRAM) finds nothing and the broker can't free space, even though stopping whisper manually frees ~2 GB and lets the lease grant.
Likely cause
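To illustrate why an all-zero residency report leaves eviction with nothing to do, here is a minimal model of the selection step. It is not the broker's code: `Consumer` and `eviction_candidates` are hypothetical names, and the only assumption carried over from this issue is that candidates are the consumers with resident VRAM greater than zero.

```python
from dataclasses import dataclass


@dataclass
class Consumer:
    name: str
    gpu_resident_mib: int  # what the broker believes is resident, in MiB


def eviction_candidates(consumers):
    """Return consumers believed to hold VRAM (hypothetical selection rule)."""
    return [c for c in consumers if c.gpu_resident_mib > 0]


# What /v1/status reports in this issue: every consumer at 0.
reported = [
    Consumer("whisper-server", 0),
    Consumer("mvoice", 0),
    Consumer("ollama", 0),
]
print(eviction_candidates(reported))  # prints []

# What nvidia-smi shows for whisper-server: 1996 MiB actually resident.
live = [Consumer("whisper-server", 1996), Consumer("mvoice", 0)]
print([c.name for c in eviction_candidates(live)])  # prints ['whisper-server']
```

With the reported values the candidate list is empty, which matches the "no candidates" outcome. With the live value for whisper-server the same rule would select it.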
The broker appears to attribute VRAM per-consumer from something other than live nvidia-smi per-PID usage (e.g. a load/unload bookkeeping counter that is only updated when the broker itself loads/unloads a consumer). Consumers started outside the broker (whisper via its own systemd unit, mvoice/ollama pre-resident) are tracked as resident=0, so the broker won't evict them. After a broker restart this would affect every consumer, since the restarted broker has loaded none of them itself. knuth's May deploy verified eviction when the broker itself had loaded mvoice; the regression shows when consumers are externally resident.
Fix direction
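The direction outlined in this section could be sketched roughly as follows. This is an illustration under stated assumptions, not the broker's implementation: the function names and the consumer-to-PID mapping are hypothetical, and how the broker would discover each consumer's PIDs (from its port or its systemd unit) is left out. The nvidia-smi query flags and the `systemctl --user stop` command are the only real interfaces used.

```python
import subprocess


def parse_compute_apps(csv_text: str) -> dict[int, int]:
    """Parse 'pid, used_memory' rows (MiB, no header, no units) into {pid: MiB}."""
    usage: dict[int, int] = {}
    for line in csv_text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2:
            continue  # blank or unexpected row
        pid_s, mib_s = parts
        if not (pid_s.isdigit() and mib_s.isdigit()):
            continue  # e.g. "[N/A]" when usage cannot be reported
        usage[int(pid_s)] = usage.get(int(pid_s), 0) + int(mib_s)
    return usage


def live_gpu_usage() -> dict[int, int]:
    """Ask nvidia-smi for per-PID GPU memory of compute processes."""
    out = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,used_memory",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_compute_apps(out)


def resident_mib(consumer_pids: dict[str, list[int]],
                 usage: dict[int, int]) -> dict[str, int]:
    """Sum live usage over each consumer's PIDs (mapping is assumed given)."""
    return {
        name: sum(usage.get(pid, 0) for pid in pids)
        for name, pids in consumer_pids.items()
    }


def systemd_stop_command(unit: str) -> list[str]:
    """Build the stop (not restart) command for a user-level systemd unit."""
    return ["systemctl", "--user", "stop", unit]
```

Residency computed this way does not depend on who started the process, so an externally started whisper-server would show up with its real usage. Keeping the parser separate from the subprocess call lets the parsing be checked without a GPU present.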
Drive eviction-candidate residency from live per-PID nvidia-smi usage (map each consumer's process/port to its actual GPU memory), not only from the broker's own load/unload bookkeeping. A consumer holding real VRAM must be an eviction candidate regardless of who started it. whisper-server in particular has no HTTP unload, so evict it via its systemd_unit (stop, not restart; restart reloads the model and frees nothing).
Also (smaller)
comfyui.vram_resident_mib was 13000, which is larger than the realistically reclaimable free VRAM on a desktop (Brave/Wayland/etc. hold ~1.5 GB that is not evictable). Lowered to 11000 in config/consumers.yaml during debugging; FLUX schnell fp8 generated fine at that budget (ComfyUI offloads T5/CLIP to RAM). Validate/keep this value.
systemctl stop, not restart.
Impact
Until fixed, ImaGen restyle (which acquires a lease via #15-on-the-ImaGen-side) only succeeds when the GPU is already fairly clear; the broker will not auto-reclaim idle services that it didn't itself load. A manual systemctl --user stop whisper-server was the workaround to land the first successful restyle (image produced, lineage correct).
Refs