Create butter-smooth transitions between prompts, powered by stable diffusion
Go to file
Johannes Stelzer 28f5a1b8be docstrings 2023-01-09 09:58:18 +01:00
configs stable diffusion v2 configs 2022-12-31 13:14:25 +01:00
ldm sd 2.1 2022-12-09 11:06:44 +00:00
.gitignore mp4 files 2022-11-28 08:45:40 +01:00
LICENSE Initial commit 2022-11-19 19:40:58 +01:00
README.md Update README.md 2023-01-09 09:50:15 +01:00
animation.gif calmer animation 2022-12-01 09:23:31 +00:00
example1.jpg example imgs 2023-01-09 08:17:07 +01:00
example1_standard.py example1 upd 2023-01-09 08:52:53 +01:00
example2.jpg example imgs 2023-01-09 08:17:07 +01:00
example2_inpaint.py new branching setup 2023-01-08 10:33:11 +01:00
example3_multitrans.py new branching setup 2023-01-08 10:33:11 +01:00
gradio_ui.py negative prompts 2023-01-08 11:48:44 +01:00
latent_blending.py branching bug 2023-01-08 12:18:54 +01:00
movie_util.py extension 2023-01-04 17:38:06 +01:00
requirements.txt cleaned 2023-01-02 09:55:35 +01:00
stable_diffusion_holder.py docstrings 2023-01-09 09:58:18 +01:00

README.md

Latent blending enables the creation of super-smooth video transitions between prompts. Powered by stable diffusion 2.1, this method involves specific mixing of intermediate latent representations to create a seamless transition with users having the option to choose full customization or preset options.

Quickstart

fp_ckpt = 'path_to_SD2.ckpt'
fp_config = 'path_to_config.yaml'

sdh = StableDiffusionHolder(fp_ckpt, fp_config, 'cuda')
lb = LatentBlending(sdh)

lb.load_branching_profile(quality='medium', depth_strength=0.4)
lb.set_prompt1('photo of my first prompt1')
lb.set_prompt2('photo of my second prompt')

imgs_transition = lb.run_transition()

Gradio UI

To run the UI on your local machine, run gradio_ui.py

Example 1: Simple transition

To run a simple transition between two prompts, run example1_standard.py

Example 2: Inpainting transition

To run a transition between two prompts where you want some part of the image to remain static, run example2_inpaint.py

Example 3: Multi transition

To run multiple transition between K prompts, resulting in a stitched video, run example3_multitrans.py

Customization

Most relevant parameters

Change the height/width

lb.set_height(512)
lb.set_width(1024)

Change guidance scale

lb.set_guidance_scale(5.0)

depth_strength / list_injection_strength

The strength dictates how early the blending process starts. The closer its value is to zero, the more inventive the results will be; whereas, a value closer to one indicates a more simple alpha blending.

Set up the branching structure

There are three ways to change the branching structure.

Presets

quality = 'medium' #choose from lowest, low, medium, high, ultra
depth_strength = 0.5 # see above (Most relevant parameters)

lb.load_branching_profile(quality, depth_strength)

Autosetup tree setup

num_inference_steps = 30 # the number of diffusion steps
list_nmb_branches = [2, 4, 8, 20]
list_injection_strength = [0.0, 0.3, 0.5, 0.9]

lb.autosetup_branching(num_inference_steps, list_nmb_branches, list_injection_strength)

Fully manual

depth_strength = 0.5 # see above (Most relevant parameters)
num_inference_steps = 30 # the number of diffusion steps
nmb_branches_final = 20 # how many diffusion images will be generated for the transition

lb.setup_branching(depth_strength, num_inference_steps, nmb_branches_final)

Installation

Packages

pip install -r requirements.txt

Download Models from Huggingface

Download the Stable Diffusion v2-1_768 Model

Download the Stable Diffusion Inpainting Model

Download the Stable Diffusion x4 Upscaler

With xformers, stable diffusion will run faster with smaller memory inprint. Necessary for higher resolutions / upscaling model.

conda install xformers -c xformers/label/dev

Alternatively, you can build it from source:

# (Optional) Makes the build much faster
pip install ninja
# Set TORCH_CUDA_ARCH_LIST if running and building on different GPU types
pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
# (this can take dozens of minutes)

How does it work

what makes a transition a good transition?

  • absence of movement
  • every frame looks like a credible photo