ComfyUI-Mesh splits FLUX.2 and LTX across two GPUs
ComfyUI-Mesh splits large diffusion models such as FLUX.2 and LTX 2.3 across two GPUs, either in the same machine or over a network, and uses NVIDIA's NVENC hardware encoder to compress model activations before they are sent. The system pairs an 'Icarus' ComfyUI client node with a 'Daedalus' server that runs the back half of the model on the second GPU. The README's benchmarks show NVENC mode beating raw transfer, with the gap widening at higher resolutions because compression cuts the amount of data on the wire.
Key Takeaways
- The project supports FLUX.2 Dev, FLUX.2 Klein 9B, and LTX 2.3 (LTX-AV 22B Dev and Distilled).
- Its wire path uses NVENC HEVC compression; the README says activations are treated like video frames and compressed by 3–10x depending on QP.
- For FLUX.2 Klein 9B at 1024×1024, the README lists 4.38 seconds with NVENC qp=18 versus 7.20 seconds in raw mode.
- At 1536×1536, the same benchmark shows 4.41 seconds with NVENC qp=18 versus 9.13 seconds raw.
- The LTX node defaults to raw mode, while "Nvenc LTX (5090 optimized)" is flagged as validated on RTX 5090 hardware and untested on older NVENC generations.
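The relative gains implied by the FLUX.2 Klein 9B figures above can be worked out directly. The short Python sketch below uses only the four timings quoted from the README; the function and variable names are illustrative and not taken from the project.

```python
# Wall-clock seconds for FLUX.2 Klein 9B as quoted in the takeaways above.
# Names here (benchmarks, speedup, time_saved_pct) are illustrative only.
benchmarks = {
    "1024x1024": {"nvenc_qp18": 4.38, "raw": 7.20},
    "1536x1536": {"nvenc_qp18": 4.41, "raw": 9.13},
}


def speedup(raw_s: float, nvenc_s: float) -> float:
    """How many times faster the NVENC run is than the raw run."""
    return raw_s / nvenc_s


def time_saved_pct(raw_s: float, nvenc_s: float) -> float:
    """Percentage of raw wall-clock time removed by using NVENC."""
    return 100.0 * (raw_s - nvenc_s) / raw_s


for resolution, t in benchmarks.items():
    s = speedup(t["raw"], t["nvenc_qp18"])
    p = time_saved_pct(t["raw"], t["nvenc_qp18"])
    print(f"{resolution}: {s:.2f}x faster, {p:.0f}% less wall-clock time")
```

On these numbers NVENC mode is about 1.64x faster at 1024×1024 and about 2.07x faster at 1536×1536. The NVENC timing barely moves between the two resolutions (4.38 to 4.41 seconds), while the raw timing grows from 7.20 to 9.13 seconds.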
Why It Matters
The project enables split inference across two GPUs without NVLink, using NVENC to move compressed activations instead of raw tensors. That matters because the README attributes the drop in wall-clock time from 7.20 to 4.38 seconds on FLUX.2 Klein 9B at 1024×1024 to bandwidth savings, with an even larger gain at 1536×1536. The repo also extends the same architecture to LTX 2.3, with separate client and server nodes and explicit LoRA-handling rules. Watch the LTX path's codec mode choices: the node defaults to raw, and the NVENC profile is validated only on RTX 5090 hardware and untested on older NVENC generations.
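To see why a 3–10x compression ratio can matter on a bandwidth-limited link, here is a rough back-of-the-envelope sketch in Python. The payload size and link speed are hypothetical values chosen for illustration, not measurements from the README, and the model ignores encode and decode latency, so it is an idealized estimate rather than a description of how ComfyUI-Mesh behaves.

```python
def wire_seconds(payload_bytes: float, link_bytes_per_s: float,
                 compression_ratio: float = 1.0) -> float:
    """Idealized transfer time: compressed payload size divided by link speed.

    Ignores encode/decode time, protocol overhead, and latency.
    """
    if compression_ratio <= 0 or link_bytes_per_s <= 0:
        raise ValueError("compression_ratio and link_bytes_per_s must be positive")
    return payload_bytes / compression_ratio / link_bytes_per_s


# Hypothetical inputs, NOT from the README:
PAYLOAD_BYTES = 64 * 1024 * 1024   # assumed activation payload per transfer
LINK_BYTES_PER_S = 125_000_000     # assumed 1 Gbit/s link

# Ratio 1.0 stands for raw mode; 3 and 10 bracket the README's 3-10x range.
for ratio in (1.0, 3.0, 10.0):
    t = wire_seconds(PAYLOAD_BYTES, LINK_BYTES_PER_S, ratio)
    print(f"compression {ratio:>4.1f}x -> {t * 1000:.1f} ms per transfer")
```

Under this simple model the time spent on the wire scales inversely with the compression ratio, so the saving grows with the number of transfers per generation and with the size of each payload.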