# LibreFLUX: A free, de-distilled FLUX model

LibreFLUX is an Apache 2.0 version of [FLUX.1-schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell) that provides the full T5 context length, uses attention masking, has classifier-free guidance restored, and has had most of the FLUX aesthetic fine-tuning/DPO fully removed. That means it's a lot uglier than base FLUX, but it has the potential to be more easily finetuned to any new distribution. It keeps in mind the core tenets of open source software: it should be difficult to use, slower and clunkier than a proprietary solution, and have an aesthetic trapped somewhere inside the early 2000s.

<img src="https://huggingface.co/jimmycarter/LibreFLUX/resolve/main/assets/splash.jpg" style="max-width: 100%;">

- [LibreFLUX: A free, de-distilled FLUX model](#libreflux-a-free-de-distilled-flux-model)
  - [Usage](#usage)
    - [Inference](#inference)
    - [Fine-tuning](#fine-tuning)
  - [Non-technical Report on Schnell De-distillation](#non-technical-report-on-schnell-de-distillation)
    - [Why](#why)
    - [Restoring the Original Training Objective](#restoring-the-original-training-objective)
# Usage

## Inference

To use the model, just call the custom pipeline using [diffusers](https://github.com/huggingface/diffusers).
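A minimal sketch of what that looks like is below, assuming the custom pipeline is loaded straight from this repository with `trust_remote_code=True`; the prompt and the generation settings are illustrative placeholders.

```py
import torch
from diffusers import DiffusionPipeline

# Load the weights together with the custom pipeline code shipped in this repo.
# Pointing custom_pipeline at the repo id is an assumption of this sketch.
pipe = DiffusionPipeline.from_pretrained(
    "jimmycarter/LibreFLUX",
    custom_pipeline="jimmycarter/LibreFLUX",
    use_safetensors=True,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
pipe.to("cuda")

# Placeholder prompt and settings.
prompt = "Photograph of a chalkboard covered in hand-written equations"
images = pipe(
    prompt=prompt,
    negative_prompt="blurry, low quality",
    guidance_scale=3.5,
    num_inference_steps=28,
    return_dict=False,
)
images[0][0].save('chalkboard.png')
```

Because classifier-free guidance is restored, a negative prompt and a `guidance_scale` greater than 1 are meaningful here; the exact set of supported arguments is defined by the pipeline code shipped in this repository.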
For usage in ComfyUI, [a single transformer file is provided](https://huggingface.co/jimmycarter/LibreFLUX/blob/main/transformer_legacy.safetensors), but note that ComfyUI does not presently support attention masks, so your images may be degraded.

## Fine-tuning

The model can be easily finetuned using [SimpleTuner](https://github.com/bghira/SimpleTuner) and the `--flux_attention_masked_training` training option. SimpleTuner has extensive support for parameter-efficient fine-tuning via [LyCORIS](https://github.com/KohakuBlueleaf/LyCORIS), in addition to full-rank fine-tuning.

# Non-technical Report on Schnell De-distillation

Welcome to my non-technical report on de-distilling FLUX.1-schnell in the most un-scientific way possible with extremely limited resources. I'm not going to claim I made a good model, but I did make a model. It was trained on about 1,500 H100-hour equivalents.
## Make de-distillation go fast and fit in small GPUs

I avoided doing any full-rank (normal, all-parameter) fine-tuning at all, since FLUX is big. I trained initially with the model in int8 precision using [quanto](https://github.com/huggingface/optimum-quanto). I started with a 600 million parameter [LoKr](https://arxiv.org/abs/2309.14859), since LoKr tends to approximate full-rank fine-tuning better than LoRA. The loss went down very slowly at first, so after poking around the code that initializes the LoKr matrices, I settled on this function, which injects noise at a fraction of the magnitude of the layers they apply to.

```py
def approximate_normal_tensor(inp, target, scale=1.0):