# Aligned Diffusion Model via DPO
This diffusion model was aligned with the DPO algorithm, using the following reward models:
```
closed-source VLMs: claude3-opus gemini-1.5 gpt-4o gpt-4v
open-source VLMs: internvl-1.5
score model: hps-2.1
```
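This card does not say which DPO variant was used for training. As a rough illustration only, the sketch below assumes a Diffusion-DPO-style objective, where a reward model picks a preferred and a rejected image for the same prompt, and the trained model is pushed to denoise the preferred image better than a frozen reference model does. The function name, argument names, and the `beta` default are hypothetical and are not taken from this repository.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def diffusion_dpo_loss(err_w, err_l, ref_err_w, ref_err_l, beta=1.0):
    """Per-pair DPO-style loss from denoising errors (illustrative sketch).

    err_w, err_l:         noise-prediction MSE of the model being trained on
                          the preferred (w) and rejected (l) image.
    ref_err_w, ref_err_l: the same errors under the frozen reference model.
    beta:                 preference strength (assumed value, not from the card).
    """
    model_diff = err_w - err_l
    ref_diff = ref_err_w - ref_err_l
    # Positive logit: the trained model favors the preferred image
    # more strongly than the reference model does.
    logit = -beta * (model_diff - ref_diff)
    # -log(sigmoid(logit)) == softplus(-logit)
    return softplus(-logit)


# Identical to the reference model: loss equals log(2).
print(diffusion_dpo_loss(0.5, 0.5, 0.5, 0.5))
# Lower error on the preferred image than on the rejected one: loss drops.
print(diffusion_dpo_loss(0.2, 0.8, 0.5, 0.5))
```

In an actual training loop the four errors would be per-sample tensors computed at a shared noise level and timestep, and the loss would be averaged over the batch.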
## How to Use
You can load the model and perform inference as follows:
```python
import torch
from diffusers import StableDiffusionPipeline, UNet2DConditionModel

pretrained_model_name = "runwayml/stable-diffusion-v1-5"

# Load the DPO-aligned UNet weights from the checkpoint directory.
dpo_unet = UNet2DConditionModel.from_pretrained(
    "path/to/checkpoint",
    subfolder='unet',
    torch_dtype=torch.float16
).to('cuda')

# Load the base pipeline, then swap in the aligned UNet.
pipeline = StableDiffusionPipeline.from_pretrained(pretrained_model_name, torch_dtype=torch.float16)
pipeline = pipeline.to('cuda')
pipeline.safety_checker = None
pipeline.unet = dpo_unet

# Fix the seed so results are reproducible.
generator = torch.Generator(device='cuda')
generator = generator.manual_seed(1)

prompt = "a pink flower"
gs = 7.5  # guidance scale; not specified in the original, 7.5 is the diffusers default
image = pipeline(prompt=prompt, generator=generator, guidance_scale=gs).images[0]
```