Kijai’s work-in-progress workflow.
This is how the model works for I2V: the first latent is just noise, and it has to be replaced by the actual input image before decoding.
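A minimal sketch of that replacement step, assuming [B, C, T, H, W] video latents; the function and variable names are illustrative, not the actual node code:

```python
import torch

def replace_first_latent_with_image(latents: torch.Tensor,
                                    image_latent: torch.Tensor) -> torch.Tensor:
    """Overwrite the first (noise) latent frame with the encoded input image.

    latents:      [B, C, T, H, W] denoised video latents; frame 0 is noise
    image_latent: [B, C, 1, H, W] VAE-encoded input image
    """
    latents = latents.clone()
    latents[:, :, :1] = image_latent  # frame 0 now holds the real input
    return latents

# After sampling, before VAE decode (hypothetical names):
# latents = replace_first_latent_with_image(latents, vae.encode(input_image))
# video   = vae.decode(latents)
```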
Experiments with running the model in ComfyUI on consumer hardware are at an early stage. There are currently two work-in-progress implementations:
Experiments suggest it is possible to generate clips longer than 10 seconds (e.g. 15 seconds) without looping or obvious quality problems.
Some parts of the model (the norms, embeddings, etc.) need to be kept in fp32. This mixed-precision i2v model keeps those tensors in fp32 with the rest in bf16: https://huggingface.co/maybleMyers/kan/blob/main/diffusion_pytorch_model_i2v_pro_fp32_and_bf16.safetensors
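A generic sketch of the technique, not how that checkpoint was actually produced: cast the whole model to bf16, then restore fp32 on submodules matched by name (the name-keyword heuristic is an assumption):

```python
import torch
import torch.nn as nn

# Assumption: norm/embedding layers are identifiable by these name fragments.
FP32_KEYWORDS = ("norm", "embed")

def cast_mixed_precision(model: nn.Module) -> nn.Module:
    """Cast a model to bf16 while keeping norms and embeddings in fp32."""
    model.to(torch.bfloat16)
    for name, module in model.named_modules():
        if any(k in name.lower() for k in FP32_KEYWORDS):
            module.to(torch.float32)  # cast the sensitive layers back up
    return model
```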
kijai/ComfyUI-KJNodes contains a NABLA Attention KJ node: “only useful if you go 10s or high res”. The docs mention that NABLA dimensions must be divisible by 128.
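A trivial helper for picking dimensions that satisfy that constraint (it is not stated whether the divisible-by-128 rule applies to pixel or latent dimensions; this sketch just rounds up):

```python
def round_up_to_multiple(x: int, base: int = 128) -> int:
    """Round a dimension up to the nearest multiple of `base`."""
    return ((x + base - 1) // base) * base

# e.g. adjust a target resolution before enabling the NABLA node:
width, height = 1280, 704
print(round_up_to_multiple(width), round_up_to_multiple(height))  # 1280 768
```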
“Flex attention” is mentioned as a possible alternative (?) to NABLA.
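For reference, a minimal PyTorch FlexAttention example (requires PyTorch ≥ 2.5 and CUDA here). The sliding-window block mask is a generic stand-in for block-sparse attention, not NABLA’s actual masking scheme:

```python
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

WINDOW = 256  # assumption: arbitrary local-attention window

def window_mask(b, h, q_idx, kv_idx):
    # Attend only to keys within WINDOW positions of the query.
    return torch.abs(q_idx - kv_idx) <= WINDOW

B, H, S, D = 1, 8, 1024, 64
q = torch.randn(B, H, S, D, device="cuda", dtype=torch.bfloat16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# B=None/H=None broadcasts the mask over batch and heads.
block_mask = create_block_mask(window_mask, B=None, H=None, Q_LEN=S, KV_LEN=S)
out = flex_attention(q, k, v, block_mask=block_mask)
```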