# What is latent blending?
Latent blending allows you to generate smooth video transitions between two prompts. It is based on [Stable Diffusion 2.0](https://stability.ai/blog/stable-diffusion-v2-release) and remixes the latent representation using spherical linear interpolation. This results in seamless transitions, where one image slowly turns into another.
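
To illustrate the interpolation step, here is a minimal sketch of spherical linear interpolation (slerp) between two vectors. It is not the code used in this repository: the function name `slerp`, the `eps` threshold, and the use of plain Python lists instead of latent tensors are assumptions made for illustration, and both inputs are assumed to be non-zero.

```python
import math


def slerp(v0, v1, t, eps=1e-6):
    """Spherically interpolate between two equal-length, non-zero vectors.

    t = 0 returns v0, t = 1 returns v1. Hypothetical helper for illustration.
    """
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))

    # Cosine of the angle between the two vectors, clamped for numerical safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    # For (anti-)parallel vectors the slerp weights are ill-defined,
    # so fall back to plain linear interpolation.
    if sin_theta < eps:
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors: the result stays on the unit circle.
midpoint = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
print(midpoint)
```

Unlike linear interpolation, slerp keeps the norm of the interpolated vector close to that of the endpoints, which is commonly cited as the reason to prefer it when mixing Gaussian-like latents.
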
# Example 1: simple transition
(mp4), code
# Example 2: inpainting transition
(mp4), code
# Example 3: concatenated transition
(mp4), code
# Relevant parameters
# Installation
#### Packages
```commandline
pip install -r requirements.txt
```
#### Models
* [Download the Stable Diffusion 2.0 Standard Model](https://huggingface.co/stabilityai/stable-diffusion-2)
* [Download the Stable Diffusion 2.0 Inpainting Model (optional)](https://huggingface.co/stabilityai/stable-diffusion-2-inpainting)
#### xformers efficient attention [(copied from stability)](https://github.com/Stability-AI/stablediffusion)
For more efficiency and speed on GPUs,
we highly recommend installing the [xformers](https://github.com/facebookresearch/xformers)
library.
Tested on A100 with CUDA 11.4.
Installation needs a somewhat recent version of nvcc and gcc/g++. You can obtain those, e.g., via
```commandline
export CUDA_HOME=/usr/local/cuda-11.4
conda install -c nvidia/label/cuda-11.4.0 cuda-nvcc
conda install -c conda-forge gcc
conda install -c conda-forge gxx_linux-64=9.5.0
```
Then, run the following (compiling takes up to 30 min).
```commandline
cd ..
git clone https://github.com/facebookresearch/xformers.git
cd xformers
git submodule update --init --recursive
pip install -r requirements.txt
pip install -e .
cd ../stable-diffusion
```
Upon successful installation, the code will automatically default to [memory efficient attention](https://github.com/facebookresearch/xformers)
for the self- and cross-attention layers in the U-Net and autoencoder.
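
As a quick sanity check after the steps above, the following sketch tests whether the `xformers` package is visible to the current Python environment. It uses only the standard library; the helper name `xformers_available` is hypothetical, and the check only confirms that the package can be found, not that the model code actually switched to memory efficient attention.

```python
import importlib.util


def xformers_available() -> bool:
    """Return True if the xformers package can be found by this interpreter."""
    # find_spec returns None for a top-level package that is not installed.
    return importlib.util.find_spec("xformers") is not None


if xformers_available():
    print("xformers found")
else:
    print("xformers not found; install it as described above")
```

Run it with the same interpreter (for example, the same conda environment) that you use for generating transitions.
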
# How does it work?
![](animation.gif)
What makes a transition a good transition?
* Absence of movement
* Every frame looks like a credible photo