license: other
license_name: flux-1-dev-non-commercial-license
license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
language:
  - en
base_model: black-forest-labs/FLUX.1-dev
library_name: diffusers
tags:
  - Text-to-Image
  - FLUX
  - Stable Diffusion
pipeline_tag: text-to-image
alibaba alimama

This repository contains an 8-step distilled version of the FLUX.1-dev model, developed by the Alimama Creative Team.

Introduction

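This checkpoint is an 8-step distilled LoRA for the FLUX.1-dev model. We use a specially designed discriminator to improve distillation quality. The model can be used for T2I, Inpainting ControlNet, and other FLUX-related models. We recommend guidance_scale=3.5 and lora_scale=1. Versions with fewer steps will be released later.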

  • Text-to-Image.

Usage

diffusers

The model can be used directly with diffusers.

import torch
from diffusers import FluxPipeline

model_id = "black-forest-labs/FLUX.1-dev"
adapter_id = "alimama-creative/FLUX.1-Turbo-Alpha"

# Load the base FLUX.1-dev pipeline in bf16 and move it to the GPU.
pipe = FluxPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Load the 8-step distilled LoRA and merge it into the base weights.
pipe.load_lora_weights(adapter_id)
pipe.fuse_lora()

prompt = "A DSLR photo of a shiny VW van that has a cityscape painted on it. A smiling sloth stands on grass in front of the van and is wearing a leather jacket, a cowboy hat, a kilt and a bowtie. The sloth is holding a quarterstaff and a big book."

# Generate with the recommended guidance scale and 8 inference steps.
image = pipe(
    prompt=prompt,
    guidance_scale=3.5,
    height=1024,
    width=1024,
    num_inference_steps=8,
    max_sequence_length=512,
).images[0]

# The output filename is an arbitrary example.
image.save("flux_turbo_alpha.png")
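To make the recommended lora_scale=1 more concrete, here is a minimal sketch of what fusing a LoRA into a weight matrix amounts to conceptually. It is not the diffusers implementation: the helper names `matmul` and `fuse_lora_weight` are hypothetical, the matrices are toy values, and the usual alpha/rank factor of PEFT-style LoRA is assumed to be folded into `lora_scale`.

```python
# Conceptual sketch only: fused weight = W + lora_scale * (B @ A).
# Helper names and toy values are illustrative, not part of diffusers.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fuse_lora_weight(weight, lora_a, lora_b, lora_scale=1.0):
    """Return the base weight plus the scaled low-rank update B @ A."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + lora_scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]   # toy 2x2 base weight
    lora_a = [[1.0, 2.0]]             # rank-1 "down" matrix, shape 1x2
    lora_b = [[0.5], [-1.0]]          # rank-1 "up" matrix, shape 2x1
    print(fuse_lora_weight(base, lora_a, lora_b, lora_scale=1.0))
```

With `lora_scale=0` the base weight comes back unchanged, and with `lora_scale=1` the full distilled update is applied, which is the recommended setting for this model.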

comfyui

Training Details

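The model was trained on 1M images from public datasets and internal sources, all with an aesthetic score of 6.3+ and a resolution greater than 800. We use adversarial training to improve quality: our method freezes the original FLUX.1-dev transformer as the feature extractor of the discriminator and adds a discriminator head to every transformer layer. During training, we fix the guidance scale at 3.5 and use a time shift of 3.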
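The training description does not give the formula behind the time shift of 3. As a rough illustration, the sketch below assumes the static shift commonly used for rectified-flow models, sigma' = shift * sigma / (1 + (shift - 1) * sigma). The function names `shift_sigma` and `shifted_schedule` are hypothetical, and the linearly spaced 8-step base schedule is an assumption for illustration.

```python
# Sketch under assumptions: a static timestep shift of the form
#   sigma' = shift * sigma / (1 + (shift - 1) * sigma)
# The exact formula used in training is not stated in this README.
from __future__ import annotations


def shift_sigma(sigma: float, shift: float = 3.0) -> float:
    """Remap a noise level in [0, 1]; shift > 1 pushes values toward 1."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(num_steps: int = 8, shift: float = 3.0) -> list[float]:
    """Linearly spaced sigmas from 1 down to 1/num_steps, then shifted."""
    base = [1.0 - i / num_steps for i in range(num_steps)]
    return [shift_sigma(s, shift) for s in base]


if __name__ == "__main__":
    for step, sigma in enumerate(shifted_schedule()):
        print(f"step {step}: sigma = {sigma:.4f}")
```

Under this assumed form, a shift above 1 leaves the endpoints 0 and 1 fixed but spends more of the schedule at high noise levels; for example, a midpoint of 0.5 maps to 0.75.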

Mixed precision: bf16

Learning rate: 2e-5

Batch size: 64

Training resolution: 1024x1024