AnimateDiff-Lightning: High-Speed Video Gen with ComfyUI


In this post I try out a high-speed video generation model. AnimateDiff is a lightweight extension of image generation models, so it runs on lower-spec hardware more easily than models built specifically for video generation, such as Stable Video Diffusion or Sora. Its successor, AnimateDiff-Lightning, can generate video in roughly one tenth as many steps, so by a simple calculation it is about 10 times faster.
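As a rough illustration of that "simple calculation", the sketch below models generation time as a fixed overhead plus a cost per sampling step. The step counts, per-step time, and overhead are hypothetical numbers chosen only for illustration, not measurements.

```python
def estimated_seconds(steps: int, sec_per_step: float, overhead_sec: float = 0.0) -> float:
    """Very rough model: fixed overhead (e.g. loading, decoding) plus a cost per sampling step."""
    return overhead_sec + steps * sec_per_step


# Hypothetical numbers, for illustration only.
baseline_steps = 40   # assumed step count for a regular AnimateDiff run
lightning_steps = 4   # one tenth of the steps
sec_per_step = 1.0    # assumed time per sampling step

# With no fixed overhead, the speedup equals the step ratio.
ideal = estimated_seconds(baseline_steps, sec_per_step) / estimated_seconds(lightning_steps, sec_per_step)

# With a fixed overhead, the end-to-end speedup is smaller than the step ratio.
with_overhead = (
    estimated_seconds(baseline_steps, sec_per_step, 4.0)
    / estimated_seconds(lightning_steps, sec_per_step, 4.0)
)

print(f"ideal speedup: {ideal:.1f}x")
print(f"speedup with 4 s of overhead: {with_overhead:.1f}x")
```

In other words, "10 times faster" holds for the sampling loop itself; the end-to-end gain depends on how much of the total time is spent outside it.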

What is AnimateDiff-Lightning?

What is AnimateDiff?

I summarized it in the following article, so please read that first.

AnimateDiff: How a Lightweight Video Generation Model That Extends Stable Diffusion Works

blog.otama-playground.com

What is AnimateDiff-Lightning?

ByteDance/AnimateDiff-Lightning · Hugging Face

huggingface.co

SDXL-Lightning: Progressive Adversarial Diffusion Distillation

We propose a diffusion distillation method that achieves new state-of-the-art in one-step/few-step 1024px text-to-image generation based on SDXL. Our method combines progressive and adversarial distillation to achieve a balance between quality and mode coverage. In this paper, we discuss the theoretical analysis, discriminator design, model formulation, and training techniques. We open-source our distilled SDXL-Lightning models both as LoRA and full UNet weights.

arxiv.org

  • The Lightning series of models is based on a paper that ByteDance published in February 2024.
  • It speeds up generation with a distillation method that combines adversarial distillation and progressive distillation.
  • The student model is trained to produce in one step what the teacher model produces in multiple steps.
  • Basic training is done with an MSE loss first, and image quality is then improved by adding an adversarial loss.
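The idea of a student matching several teacher steps in one step can be sketched with a toy example. This is not the paper's training code: the "teacher" here is a made-up function that shrinks a number by 10% per step, the "student" is a single hypothetical weight, and only the MSE stage is shown (the adversarial stage is omitted).

```python
import random


def teacher_step(x: float) -> float:
    """One toy 'denoising' step of the teacher: shrink the value by 10%."""
    return 0.9 * x


def teacher_two_steps(x: float) -> float:
    """What the teacher produces after two consecutive steps."""
    return teacher_step(teacher_step(x))


def train_student(num_iters: int = 2000, lr: float = 0.05, seed: int = 0) -> float:
    """Fit a one-parameter student (w * x) to the teacher's two-step output using an MSE loss."""
    rng = random.Random(seed)
    w = 1.0  # student starts as the identity
    for _ in range(num_iters):
        x = rng.uniform(-1.0, 1.0)
        target = teacher_two_steps(x)
        pred = w * x
        grad = 2.0 * (pred - target) * x  # derivative of (w*x - target)^2 with respect to w
        w -= lr * grad
    return w


student_w = train_student()
print(f"student weight: {student_w:.4f}")
print(f"teacher two-step factor: {0.9 * 0.9:.4f}")
```

After training, one student step reproduces two teacher steps. Progressive distillation repeats this kind of halving, with the trained student becoming the next teacher, until only a few steps remain.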

Since the model itself appears to be essentially unchanged, the required specs should be about the same as for AnimateDiff.

Trying It in ComfyUI

Installing ComfyUI

Please refer to the following article for installation instructions.

[Stable Diffusion] Let's Play with Image Generation AI Using ComfyUI [Getting Started]

blog.otama-playground.com

Preparation Steps

  1. Load the workflow into ComfyUI
  2. Install the required custom nodes in ComfyUI (installing via ComfyUI-Manager is also fine)
  3. Place the checkpoint you want to use in /models/checkpoints/
  4. Place the AnimateDiff-Lightning checkpoint in /custom_nodes/ComfyUI-AnimateDiff-Evolved/models/

1. Load the workflow into ComfyUI

Workflow: animatediff_lightning_workflow.json
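Before loading the workflow, it can help to see which node types it uses, since any that ComfyUI does not ship with must come from custom nodes. The sketch below assumes the workflow file is JSON with a top-level "nodes" list whose entries each have a "type" field; the small example dict is made up purely to demonstrate the function.

```python
import json
from collections import Counter
from pathlib import Path


def node_types(workflow: dict) -> Counter:
    """Count node types in a workflow dict (assumes a top-level "nodes" list with a "type" per node)."""
    return Counter(node.get("type", "<unknown>") for node in workflow.get("nodes", []))


def load_workflow(path: str) -> dict:
    """Read a workflow JSON file from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Tiny made-up workflow, used only to demonstrate the function.
example = {
    "nodes": [
        {"id": 1, "type": "CheckpointLoaderSimple"},
        {"id": 2, "type": "KSampler"},
        {"id": 3, "type": "KSampler"},
    ]
}

for name, count in sorted(node_types(example).items()):
    print(name, count)
```

To inspect the real file, call node_types(load_workflow("animatediff_lightning_workflow.json")) from the folder where you saved it.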

2. Install the required custom nodes in ComfyUI

Installing them with ComfyUI-Manager is easiest (cloning each one into the custom_nodes directory with git clone also works).

The git clone method gave me an error, so I installed ComfyUI-Manager and then installed the nodes through the Manager.

3. Place the checkpoint you want to use in /models/checkpoints/

The officially recommended base models are listed on the AnimateDiff-Lightning model card on Hugging Face.

4. Place the AnimateDiff-Lightning checkpoint

ByteDance/AnimateDiff-Lightning at main

huggingface.co
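To confirm that the files from steps 3 and 4 ended up in the right folders, a small check like the following may help. It is only a sketch: "ComfyUI" is a hypothetical install location, and treating .safetensors and .ckpt as the model file extensions is an assumption.

```python
from pathlib import Path

# Assumed file extensions for model files.
MODEL_SUFFIXES = {".safetensors", ".ckpt"}


def find_model_files(directory: Path) -> list[str]:
    """Return sorted names of model-like files in a directory (empty list if it does not exist)."""
    if not directory.is_dir():
        return []
    return sorted(
        p.name for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in MODEL_SUFFIXES
    )


def check_layout(comfy_root: Path) -> dict[str, list[str]]:
    """Check the two folders from steps 3 and 4, relative to the ComfyUI install directory."""
    folders = {
        "base checkpoint": comfy_root / "models" / "checkpoints",
        "AnimateDiff-Lightning": comfy_root / "custom_nodes" / "ComfyUI-AnimateDiff-Evolved" / "models",
    }
    return {label: find_model_files(path) for label, path in folders.items()}


# "ComfyUI" is a hypothetical install location; change it to your own.
for label, files in check_layout(Path("ComfyUI")).items():
    print(f"{label}: {files if files else 'no model files found'}")
```

If either line reports no model files, the corresponding node in the workflow will have nothing to select.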

Running It

Press Queue Prompt and wait a little, and the video is output to the Video Combine node at the bottom right. The time depends on the parameters and your PC specs, but in my case it finished in under 10 seconds with the default settings.

Execution Screen

Here is the result.

Generation Result

I haven't tuned the parameters, so the result is blurry, but it managed to generate something plausible. (Considering that this took only 4 steps, the quality is quite impressive compared to AnimateDiff.)

Wrapping Up

I'm satisfied with getting it to run, so I'll stop here.

If you want to try other video generation methods, please make use of the link collection below.

Stable Diffusion Guide: A Collection of Useful Links for Video Generation

blog.otama-playground.com