README.md · dattrong/pixart-sigma-256 at main

metadata

license: openrail++
tags:
  - text-to-image
  - PixArt-Σ

THIS IS A REDISTRIBUTION OF PIXART-Σ-XL-256x256

🧨 Diffusers

Make sure to upgrade diffusers to >= 0.28.0:
pip install -U diffusers --upgrade
In addition make sure to install transformers, safetensors, sentencepiece, and accelerate:
pip install transformers accelerate safetensors sentencepiece
For diffusers<0.28.0, check this script for help.

To just use the base model, you can run:

import torch
from diffusers import Transformer2DModel, PixArtSigmaPipeline

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
weight_dtype = torch.float16

pipe = PixArtSigmaPipeline.from_pretrained(
    "dattrong/pixart-sigma-512", 
    torch_dtype=weight_dtype,
    use_safetensors=True,
)
pipe.to(device)

# Enable memory optimizations.
# pipe.enable_model_cpu_offload()

prompt = "A small cactus with a happy face in the Sahara desert."
image = pipe(prompt).images[0]
image.save("./catcus.png")

When using torch >= 2.0, you can improve the inference speed by 20-30% with torch.compile. Simple wrap the unet with torch compile before running the pipeline:

pipe.transformer = torch.compile(pipe.transformer, mode="reduce-overhead", fullgraph=True)

If you are limited by GPU VRAM, you can enable cpu offloading by calling pipe.enable_model_cpu_offload instead of .to("cuda"):

- pipe.to("cuda")
+ pipe.enable_model_cpu_offload()