Latent blending allows you to generate smooth video transitions between two prompts. It is based on [Stable Diffusion 2.0](https://stability.ai/blog/stable-diffusion-v2-release) and remixes the latent representation using spherical linear interpolation. This produces nearly imperceptible frame-to-frame changes, so that one image slowly turns into another.
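As a rough illustration of spherical linear interpolation (slerp), here is a minimal sketch in plain Python. It is not the code used in this repository: the function name `slerp`, the `eps` threshold, and the use of flat lists of floats instead of latent tensors are all assumptions made to keep the example self-contained.

```python
import math
from typing import List, Sequence


def slerp(v0: Sequence[float], v1: Sequence[float], t: float,
          eps: float = 1e-6) -> List[float]:
    """Spherical linear interpolation between two flattened vectors.

    Hypothetical helper for illustration only; real latents would be
    tensors, and the repository's own implementation may differ.
    """
    linear = [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        # The angle is undefined for a zero vector: fall back to linear mixing.
        return linear

    # Cosine of the angle between the vectors, clamped for numerical safety.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel or antiparallel vectors: fall back to linear mixing.
        return linear

    # Weights follow the arc between v0 and v1 rather than the straight chord.
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Usage: five evenly spaced mixing steps between two toy "latents".
start, end = [1.0, 0.0], [0.0, 1.0]
steps = [slerp(start, end, i / 4) for i in range(5)]
```

Unlike a straight linear mix, slerp keeps the norm of the interpolated vector constant when both endpoints have the same norm, which is the usual motivation for using it on Gaussian-like latents.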

Example 1: simple transition

(mp4), code

Example 2: inpainting transition

(mp4), code

Example 3: concatenated transition

(mp4), code

Relevant parameters

Installation

Packages

 pip install -r requirements.txt

Models

Download the Stable Diffusion 2.0 Standard Model

Download the Stable Diffusion 2.0 Inpainting Model (optional)

xformers efficient attention (copied from stability)

For more efficiency and speed on GPUs, we highly recommend installing the xformers library.

Tested on A100 with CUDA 11.4. Installation needs a somewhat recent version of nvcc and gcc/g++, which you can obtain, e.g., via

export CUDA_HOME=/usr/local/cuda-11.4
conda install -c nvidia/label/cuda-11.4.0 cuda-nvcc
conda install -c conda-forge gcc
conda install -c conda-forge gxx_linux-64=9.5.0

Then, run the following (compiling takes up to 30 min).

cd ..
git clone https://github.com/facebookresearch/xformers.git
cd xformers
git submodule update --init --recursive
pip install -r requirements.txt
pip install -e .
cd ../stable-diffusion

Upon successful installation, the code will automatically default to memory efficient attention for the self- and cross-attention layers in the U-Net and autoencoder.
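To confirm that the install succeeded, a quick check like the one below may help. It is only a sketch: the helper name `xformers_available` is hypothetical, and the check tells you whether the `xformers` package can be found by Python, not whether the model code actually uses it.

```python
import importlib.util


def xformers_available() -> bool:
    """Return True if the xformers package can be located by Python.

    Hypothetical helper for illustration; it does not import xformers
    and does not verify that the compiled CUDA kernels work.
    """
    return importlib.util.find_spec("xformers") is not None


if xformers_available():
    print("xformers found")
else:
    print("xformers not found, memory efficient attention will not be available")
```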

How does it work

What makes a transition a good transition?

absence of movement
every frame looks like a credible photo