Diffusers documentation

Load pipelines, models, and schedulers

You are viewing v0.18.2 version. A newer version v0.32.2 is available.
Hugging Face's logo
Join the Hugging Face community

and get access to the augmented documentation experience

to get started

Load pipelines, models, and schedulers

Having an easy way to use a diffusion system for inference is essential to 🧨 Diffusers. Diffusion systems often consist of multiple components like parameterized models, tokenizers, and schedulers that interact in complex ways. That is why we designed the DiffusionPipeline to wrap the complexity of the entire diffusion system into an easy-to-use API, while remaining flexible enough to be adapted for other use cases, such as loading each component individually as building blocks to assemble your own diffusion system.

Everything you need for inference or training is accessible with the from_pretrained() method.

This guide will show you how to load:

  • pipelines from the Hub and locally
  • different components into a pipeline
  • checkpoint variants such as different floating point types or non-exponential mean averaged (EMA) weights
  • models and schedulers

Diffusion Pipeline

πŸ’‘ Skip to the DiffusionPipeline explained section if you interested in learning in more detail about how the DiffusionPipeline class works.

The DiffusionPipeline class is the simplest and most generic way to load any diffusion model from the Hub. The DiffusionPipeline.from_pretrained() method automatically detects the correct pipeline class from the checkpoint, downloads and caches all the required configuration and weight files, and returns a pipeline instance ready for inference.

from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
pipe = DiffusionPipeline.from_pretrained(repo_id)

You can also load a checkpoint with it’s specific pipeline class. The example above loaded a Stable Diffusion model; to get the same result, use the StableDiffusionPipeline class:

from diffusers import StableDiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionPipeline.from_pretrained(repo_id)

A checkpoint (such as CompVis/stable-diffusion-v1-4 or runwayml/stable-diffusion-v1-5) may also be used for more than one task, like text-to-image or image-to-image. To differentiate what task you want to use the checkpoint for, you have to load it directly with it’s corresponding task-specific pipeline class:

from diffusers import StableDiffusionImg2ImgPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(repo_id)

Local pipeline

To load a diffusion pipeline locally, use git-lfs to manually download the checkpoint (in this case, runwayml/stable-diffusion-v1-5) to your local disk. This creates a local folder, ./stable-diffusion-v1-5, on your disk:

git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5

Then pass the local path to from_pretrained():

from diffusers import DiffusionPipeline

repo_id = "./stable-diffusion-v1-5"
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id)

The from_pretrained() method won’t download any files from the Hub when it detects a local path, but this also means it won’t download and cache the latest changes to a checkpoint.

Swap components in a pipeline

You can customize the default components of any pipeline with another compatible component. Customization is important because:

  • Changing the scheduler is important for exploring the trade-off between generation speed and quality.
  • Different components of a model are typically trained independently and you can swap out a component with a better-performing one.
  • During finetuning, usually only some components - like the UNet or text encoder - are trained.

To find out which schedulers are compatible for customization, you can use the compatibles method:

from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id)
stable_diffusion.scheduler.compatibles

Let’s use the SchedulerMixin.from_pretrained() method to replace the default PNDMScheduler with a more performant scheduler, EulerDiscreteScheduler. The subfolder="scheduler" argument is required to load the scheduler configuration from the correct subfolder of the pipeline repository.

Then you can pass the new EulerDiscreteScheduler instance to the scheduler argument in DiffusionPipeline:

from diffusers import DiffusionPipeline, EulerDiscreteScheduler, DPMSolverMultistepScheduler

repo_id = "runwayml/stable-diffusion-v1-5"

scheduler = EulerDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")

stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, scheduler=scheduler)

Safety checker

Diffusion models like Stable Diffusion can generate harmful content, which is why 🧨 Diffusers has a safety checker to check generated outputs against known hardcoded NSFW content. If you’d like to disable the safety checker for whatever reason, pass None to the safety_checker argument:

from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, safety_checker=None)

Reuse components across pipelines

You can also reuse the same components in multiple pipelines to avoid loading the weights into RAM twice. Use the components method to save the components:

from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

model_id = "runwayml/stable-diffusion-v1-5"
stable_diffusion_txt2img = StableDiffusionPipeline.from_pretrained(model_id)

components = stable_diffusion_txt2img.components

Then you can pass the components to another pipeline without reloading the weights into RAM:

stable_diffusion_img2img = StableDiffusionImg2ImgPipeline(**components)

You can also pass the components individually to the pipeline if you want more flexibility over which components to reuse or disable. For example, to reuse the same components in the text-to-image pipeline, except for the safety checker and feature extractor, in the image-to-image pipeline:

from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

model_id = "runwayml/stable-diffusion-v1-5"
stable_diffusion_txt2img = StableDiffusionPipeline.from_pretrained(model_id)
stable_diffusion_img2img = StableDiffusionImg2ImgPipeline(
    vae=stable_diffusion_txt2img.vae,
    text_encoder=stable_diffusion_txt2img.text_encoder,
    tokenizer=stable_diffusion_txt2img.tokenizer,
    unet=stable_diffusion_txt2img.unet,
    scheduler=stable_diffusion_txt2img.scheduler,
    safety_checker=None,
    feature_extractor=None,
    requires_safety_checker=False,
)

Checkpoint variants

A checkpoint variant is usually a checkpoint where it’s weights are:

  • Stored in a different floating point type for lower precision and lower storage, such as torch.float16, because it only requires half the bandwidth and storage to download. You can’t use this variant if you’re continuing training or using a CPU.
  • Non-exponential mean averaged (EMA) weights which shouldn’t be used for inference. You should use these to continue finetuning a model.

πŸ’‘ When the checkpoints have identical model structures, but they were trained on different datasets and with a different training setup, they should be stored in separate repositories instead of variations (for example, stable-diffusion-v1-4 and stable-diffusion-v1-5).

Otherwise, a variant is identical to the original checkpoint. They have exactly the same serialization format (like Safetensors), model structure, and weights have identical tensor shapes.

checkpoint type weight name argument for loading weights
original diffusion_pytorch_model.bin
floating point diffusion_pytorch_model.fp16.bin variant, torch_dtype
non-EMA diffusion_pytorch_model.non_ema.bin variant

There are two important arguments to know for loading variants:

  • torch_dtype defines the floating point precision of the loaded checkpoints. For example, if you want to save bandwidth by loading a fp16 variant, you should specify torch_dtype=torch.float16 to convert the weights to fp16. Otherwise, the fp16 weights are converted to the default fp32 precision. You can also load the original checkpoint without defining the variant argument, and convert it to fp16 with torch_dtype=torch.float16. In this case, the default fp32 weights are downloaded first, and then they’re converted to fp16 after loading.

  • variant defines which files should be loaded from the repository. For example, if you want to load a non_ema variant from the diffusers/stable-diffusion-variants repository, you should specify variant="non_ema" to download the non_ema files.

from diffusers import DiffusionPipeline
import torch

# load fp16 variant
stable_diffusion = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", variant="fp16", torch_dtype=torch.float16
)
# load non_ema variant
stable_diffusion = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", variant="non_ema")

To save a checkpoint stored in a different floating point type or as a non-EMA variant, use the DiffusionPipeline.save_pretrained() method and specify the variant argument. You should try and save a variant to the same folder as the original checkpoint, so you can load both from the same folder:

from diffusers import DiffusionPipeline

# save as fp16 variant
stable_diffusion.save_pretrained("runwayml/stable-diffusion-v1-5", variant="fp16")
# save as non-ema variant
stable_diffusion.save_pretrained("runwayml/stable-diffusion-v1-5", variant="non_ema")

If you don’t save the variant to an existing folder, you must specify the variant argument otherwise it’ll throw an Exception because it can’t find the original checkpoint:

# πŸ‘Ž this won't work
stable_diffusion = DiffusionPipeline.from_pretrained("./stable-diffusion-v1-5", torch_dtype=torch.float16)
# πŸ‘ this works
stable_diffusion = DiffusionPipeline.from_pretrained(
    "./stable-diffusion-v1-5", variant="fp16", torch_dtype=torch.float16
)

Models

Models are loaded from the ModelMixin.from_pretrained() method, which downloads and caches the latest version of the model weights and configurations. If the latest files are available in the local cache, from_pretrained() reuses files in the cache instead of redownloading them.

Models can be loaded from a subfolder with the subfolder argument. For example, the model weights for runwayml/stable-diffusion-v1-5 are stored in the unet subfolder:

from diffusers import UNet2DConditionModel

repo_id = "runwayml/stable-diffusion-v1-5"
model = UNet2DConditionModel.from_pretrained(repo_id, subfolder="unet")

Or directly from a repository’s directory:

from diffusers import UNet2DModel

repo_id = "google/ddpm-cifar10-32"
model = UNet2DModel.from_pretrained(repo_id)

You can also load and save model variants by specifying the variant argument in ModelMixin.from_pretrained() and ModelMixin.save_pretrained():

from diffusers import UNet2DConditionModel

model = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet", variant="non-ema")
model.save_pretrained("./local-unet", variant="non-ema")

Schedulers

Schedulers are loaded from the SchedulerMixin.from_pretrained() method, and unlike models, schedulers are not parameterized or trained; they are defined by a configuration file.

Loading schedulers does not consume any significant amount of memory and the same configuration file can be used for a variety of different schedulers. For example, the following schedulers are compatible with StableDiffusionPipeline which means you can load the same scheduler configuration file in any of these classes:

from diffusers import StableDiffusionPipeline
from diffusers import (
    DDPMScheduler,
    DDIMScheduler,
    PNDMScheduler,
    LMSDiscreteScheduler,
    EulerDiscreteScheduler,
    EulerAncestralDiscreteScheduler,
    DPMSolverMultistepScheduler,
)

repo_id = "runwayml/stable-diffusion-v1-5"

ddpm = DDPMScheduler.from_pretrained(repo_id, subfolder="scheduler")
ddim = DDIMScheduler.from_pretrained(repo_id, subfolder="scheduler")
pndm = PNDMScheduler.from_pretrained(repo_id, subfolder="scheduler")
lms = LMSDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
euler_anc = EulerAncestralDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
euler = EulerDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
dpm = DPMSolverMultistepScheduler.from_pretrained(repo_id, subfolder="scheduler")

# replace `dpm` with any of `ddpm`, `ddim`, `pndm`, `lms`, `euler_anc`, `euler`
pipeline = StableDiffusionPipeline.from_pretrained(repo_id, scheduler=dpm)

DiffusionPipeline explained

As a class method, DiffusionPipeline.from_pretrained() is responsible for two things:

  • Download the latest version of the folder structure required for inference and cache it. If the latest folder structure is available in the local cache, DiffusionPipeline.from_pretrained() reuses the cache and won’t redownload the files.
  • Load the cached weights into the correct pipeline class - retrieved from the model_index.json file - and return an instance of it.

The pipelines underlying folder structure corresponds directly with their class instances. For example, the StableDiffusionPipeline corresponds to the folder structure in runwayml/stable-diffusion-v1-5.

from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
pipeline = DiffusionPipeline.from_pretrained(repo_id)
print(pipeline)

You’ll see pipeline is an instance of StableDiffusionPipeline, which consists of seven components:

StableDiffusionPipeline {
  "feature_extractor": [
    "transformers",
    "CLIPImageProcessor"
  ],
  "safety_checker": [
    "stable_diffusion",
    "StableDiffusionSafetyChecker"
  ],
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}

Compare the components of the pipeline instance to the runwayml/stable-diffusion-v1-5 folder structure, and you’ll see there is a separate folder for each of the components in the repository:

.
β”œβ”€β”€ feature_extractor
β”‚Β Β  └── preprocessor_config.json
β”œβ”€β”€ model_index.json
β”œβ”€β”€ safety_checker
β”‚Β Β  β”œβ”€β”€ config.json
β”‚Β Β  └── pytorch_model.bin
β”œβ”€β”€ scheduler
β”‚Β Β  └── scheduler_config.json
β”œβ”€β”€ text_encoder
β”‚Β Β  β”œβ”€β”€ config.json
β”‚Β Β  └── pytorch_model.bin
β”œβ”€β”€ tokenizer
β”‚Β Β  β”œβ”€β”€ merges.txt
β”‚Β Β  β”œβ”€β”€ special_tokens_map.json
β”‚Β Β  β”œβ”€β”€ tokenizer_config.json
β”‚Β Β  └── vocab.json
β”œβ”€β”€ unet
β”‚Β Β  β”œβ”€β”€ config.json
β”‚Β Β  β”œβ”€β”€ diffusion_pytorch_model.bin
└── vae
    β”œβ”€β”€ config.json
    β”œβ”€β”€ diffusion_pytorch_model.bin

You can access each of the components of the pipeline as an attribute to view its configuration:

pipeline.tokenizer
CLIPTokenizer(
    name_or_path="/root/.cache/huggingface/hub/models--runwayml--stable-diffusion-v1-5/snapshots/39593d5650112b4cc580433f6b0435385882d819/tokenizer",
    vocab_size=49408,
    model_max_length=77,
    is_fast=False,
    padding_side="right",
    truncation_side="right",
    special_tokens={
        "bos_token": AddedToken("<|startoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "eos_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "unk_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "pad_token": "<|endoftext|>",
    },
)

Every pipeline expects a model_index.json file that tells the DiffusionPipeline:

  • which pipeline class to load from _class_name
  • which version of 🧨 Diffusers was used to create the model in _diffusers_version
  • what components from which library are stored in the subfolders (name corresponds to the component and subfolder name, library corresponds to the name of the library to load the class from, and class corresponds to the class name)
{
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.6.0",
  "feature_extractor": [
    "transformers",
    "CLIPImageProcessor"
  ],
  "safety_checker": [
    "stable_diffusion",
    "StableDiffusionSafetyChecker"
  ],
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}