GGUF Version?

#1
by dasilva333 - opened

Hey guys, I tried converting the safetensors file to GGUF format with this project so I can use it in my ComfyUI workflow:

https://github.com/ruSauron/to-gguf-bat

Unfortunately it's not working. I'm getting this error in ComfyUI with the UNET GGUF LOADER node:

RuntimeError: The expanded size of the tensor (3072) must match the existing size (3264) at non-singleton dimension 1.

Can anyone give me any insight or advice on converting it properly? I've converted about 4 other models and never had problems with any of them.

Terminus Research Group org

did you convert it to comfyUI type first? the bfl formatted weights fuse the qkv and diffusers does not. but i'm not sure if that's where that shape error comes from.

i don't use ComfyUI or GGUF, so i can't really provide much useful feedback there
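To illustrate the fused versus separate QKV point above, here is a minimal sketch with toy sizes. It uses plain Python lists in place of real tensors, and the function name `fuse_qkv` is hypothetical. Real conversion scripts also rename keys and handle biases, which this does not show.

```python
def fuse_qkv(q_rows, k_rows, v_rows):
    """Stack three [out, in] weight matrices into one [3 * out, in] matrix.

    Matrices are plain lists of rows; this mirrors concatenating along dim 0.
    """
    width = len(q_rows[0])
    for name, rows in (("q", q_rows), ("k", k_rows), ("v", v_rows)):
        if any(len(row) != width for row in rows):
            raise ValueError(f"{name} has inconsistent input width")
    return q_rows + k_rows + v_rows


def shape(rows):
    """Return (rows, columns) of a list-of-rows matrix."""
    return (len(rows), len(rows[0]))


# Toy hidden size of 4 instead of the 3072 seen in the error message.
hidden = 4
q = [[1.0] * hidden for _ in range(hidden)]
k = [[2.0] * hidden for _ in range(hidden)]
v = [[3.0] * hidden for _ in range(hidden)]

fused = fuse_qkv(q, k, v)
print(shape(q), "->", shape(fused))  # (4, 4) -> (12, 4)
```

A loader that expects the fused layout looks for one tall matrix per attention block, so a file holding three separate matrices under different names will not line up with it.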

Terminus Research Group org

it's worth noting that this is a transformer model and not a u-net, so maybe that is the wrong loader to use.

Just want to reiterate that I've previously converted the .safetensors versions of 4 other models to .gguf with that script without an issue. I'm only having trouble with this one.

It would help if you could tell me about anything non-standard that was done to this model, so I can figure out how to work around it in the script.

Terminus Research Group org

nothing - it is the same exact Flux architecture

Just wanted to provide some additional info: I tried converting the model to Q5_1 and now I get this error message:

The expanded size of the tensor (3072) must match the existing size (2304) at non-singleton dimension 1. Target sizes: [3072, 3072]. Tensor sizes: [3072, 2304]

So with Q8_0 the tensor's second dimension was 3264, and with Q5_1 it's 2304. Q8_0 is off by 192 (3264 vs 3072). Maybe there's a way to adjust this tool to handle this shape?

https://github.com/leejet/stable-diffusion.cpp/releases
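One possible reading of those two numbers, offered as a hedged sketch rather than a diagnosis: they match the number of raw bytes a 3072-element row occupies in each quantized type, assuming the standard ggml block layouts (32 weights per block, 34 bytes per Q8_0 block, 24 bytes per Q5_1 block). If that is what is happening, the loader is seeing packed quantized bytes where it expected a 3072-wide tensor.

```python
# Assumed ggml block layouts: (weights per block, bytes per block).
BLOCK_LAYOUTS = {
    "Q8_0": (32, 2 + 32),          # fp16 scale + 32 int8 values
    "Q5_1": (32, 2 + 2 + 4 + 16),  # fp16 scale + fp16 min + high bits + low nibbles
}


def packed_row_bytes(n_weights, quant_type):
    """Bytes needed to store one row of n_weights in the given quantized type."""
    weights_per_block, bytes_per_block = BLOCK_LAYOUTS[quant_type]
    if n_weights % weights_per_block:
        raise ValueError("row length must be a multiple of the block size")
    return n_weights // weights_per_block * bytes_per_block


for quant_type in BLOCK_LAYOUTS:
    print(quant_type, packed_row_bytes(3072, quant_type))
# Q8_0 3264
# Q5_1 2304
```

Under that assumption the shape itself is not wrong, so patching the converter to "handle this shape" would probably not help. The mismatch would more likely come from how the tensors are named or laid out in the source file.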

Sorry to interrupt.
If you want a file in ComfyUI format, is this one in Civitai different?

FluxBooru
https://civitai.com/models/859032/fluxbooru?modelVersionId=961597

You're not interrupting, I appreciate the suggestion. I actually found this model thanks to CivitAI. You can tell from the hashes that the linked model is just a repost of this one in the same format.

My goal is to quantize it so it runs faster on my hardware.

@bghira I wish you would elaborate on what you meant by converting to ComfyUI format first

Let me explain on behalf of bghira: the Diffusers format (this repo) and the ComfyUI format both use safetensors files, but the contents are quite different. Specifically, the names and structure of the keys differ, a bit like zip and rar, so simply loading one as the other will not work. (There are ComfyUI nodes that read the Diffusers format as is. I'm not familiar with the ComfyUI tool itself...)
https://github.com/maepopi/Diffusers-in-ComfyUI
So unless you convert it properly first, you won't get the format you want, and you won't get the quantized file.
I saw a conversion script somewhere, I'll see if I can find it.
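A rough way to tell the two layouts apart is to look at the tensor names. The sketch below is only illustrative: the function name `guess_flux_layout` is hypothetical, and the substrings it checks (`double_blocks.` / `single_blocks.` for the original Flux layout that ComfyUI expects, `transformer_blocks.` for Diffusers) are assumptions about the usual key naming, not something confirmed in this thread.

```python
def guess_flux_layout(keys):
    """Guess which naming scheme a Flux state dict uses from its key names."""
    keys = list(keys)
    # Assumed markers of the original (BFL / ComfyUI) layout.
    if any("double_blocks." in k or "single_blocks." in k for k in keys):
        return "original"
    # Assumed markers of the Diffusers layout.
    if any("transformer_blocks." in k for k in keys):
        return "diffusers"
    return "unknown"


# Example key names, made up for illustration.
diffusers_keys = [
    "transformer_blocks.0.attn.to_q.weight",
    "single_transformer_blocks.0.attn.to_k.weight",
]
original_keys = [
    "double_blocks.0.img_attn.qkv.weight",
    "time_in.in_layer.weight",
]

print(guess_flux_layout(diffusers_keys))  # diffusers
print(guess_flux_layout(original_keys))   # original
```

With a real file, the key names could be taken from the dict returned by `load_file` in `safetensors.torch` and passed to the function as is.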

Edit:
I tracked down the location of the repo with the conversion code, but it's gone...
I downloaded the conversion code for myself, but it would be bad if I re-uploaded it without permission...
https://huggingface.co/twodgirl/flux-devpro-schnell-merge-fp8-e4m3fn-diffusers

Edit:
Can't you just download Civitai's and quantize it?

I think I found it, I asked o1 preview about what you said and it linked me to this

https://github.com/huggingface/diffusers/blob/main/scripts/convert_diffusers_to_original_stable_diffusion.py

It's the official conversion script, so I'll try it when I get home and then try converting the new safetensors file to GGUF. Thanks for your help, I appreciate the insight.

That's a script specifically for SD 1.5. There are two others, one dedicated to SDXL and one dedicated to the SDXL LoRA proprietary format, and those three are the only official ones...
It's easier to do the work for ComfyUI with ComfyUI, unless you want to do it with CLI.
It seems that even with just the nodes built into the official ComfyUI, the Diffusers format itself can be recognized, and image generation itself seems to be possible if you simply merge them. However, it will probably go wrong somewhere, so it is better to use a node like the one above that directly supports Flux.
I don't know how the Civitai people did the conversion, but perhaps they used this method?
https://github.com/Limitex/ComfyUI-Diffusers
https://github.com/maepopi/Diffusers-in-ComfyUI

So to be clear, you don't think that script will work, and I need to find a script dedicated to Flux that converts the Diffusers format to the ComfyUI (aka StableDiffusion) format?

Just to be clear, the issue isn't getting the safetensors file to work in Comfy. It's reducing the size of the safetensors file so it runs faster on my Comfy setup.

As for Civitai, it's just a repository of files, nothing special. They're hosting the same file found here.

Any chance you can send me that mythical script you have via DM or email?

It's a pain in the ass, so I uploaded it.
https://huggingface.co/John6666/safetensors_converting_test/tree/main/twodgirl-script

This script only converts the transformer part, but the rest is in the same format as ComfyUI, so don't worry about it.

Awesome. I read your README and it seems the script I'm looking for is flux_devpro_ckpt.py. I'll follow up here on whether I got it to work.

Thanks John. I also appreciate your work on JoyCaption, it's much loved in the community. Thanks again.

Thank you. By the way, I think this script would be useful for the conversion to GGUF. The author is Japanese like me, so the README is in Japanese, but it's easy to use, so I don't think you'll get lost in the automatic translation.
https://github.com/Zuntan03/EasyForge/tree/main/flux_tool
https://huggingface.co/Zuntan

As for NF4, a general script to convert safetensors files to NF4 would be sufficient. If you don't have a good one, ask lllyasviel to make sure.

@John6666 That worked great, thanks man!

To be clear, your project needed a few adjustments. I'll outline them below:

import os
import sys
import gc

# Make the script's own directory importable before the local import below.
# (I added this, and I also created an empty __init__.py file in the same directory.)
sys.path.append(os.path.dirname(os.path.abspath(__file__)))

# Hyphens in the filenames were replaced with underscores to fix module resolution issues.
from twodgirl_script_map_from_diffusers import convert_diffusers_to_flux_checkpoint
from safetensors.torch import load_file, save_file
import torch

###
# Code from huggingface/twodgirl
# License: apache-2.0

if __name__ == '__main__':
    sd = convert_diffusers_to_flux_checkpoint(load_file(sys.argv[1]))
    # Assertion disabled: I don't know what dtype FluxBooru's weights use,
    # and it is only a sanity check, so not a big deal.
    # assert sd['time_in.in_layer.weight'].dtype == torch.float8_e4m3fn
    print(len(sd))
    gc.collect()
    save_file(sd, sys.argv[2])

ran the command like this:

(llm_env) E:\gguf-converter\john666>python twodgirl_script_flux_devpro_ckpt.py fluxbooru_v02CFGFullcheckpoint.safetensors fluxbooru_v02CFGFullcheckpoint_converted.safetensors

That created the _converted file, of similar size. I then moved it over to the gguf-converter project I mentioned in the first post and converted it to Q8_0, and I can confirm it works without any issue.
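As an alternative to renaming the files, a module whose filename contains hyphens can be loaded by path with the standard library's `importlib`. This is a minimal sketch and not part of the original script; the helper name `load_module_from_path` and the demo file name are hypothetical.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(module_name, file_path):
    """Load a Python source file as a module, whatever its filename looks like."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {file_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demo with a throwaway file whose name contains hyphens.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo-with-hyphens.py")
    with open(path, "w", encoding="utf-8") as f:
        f.write("def answer():\n    return 42\n")
    demo = load_module_from_path("demo_with_hyphens", path)
    print(demo.answer())  # 42
```

Renaming is simpler for a one-off conversion. Loading by path mainly helps when the downloaded filenames should stay untouched.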

Thanks for sharing your know-how. Glad it worked out.
I'll fix that before I forget.
That assertion is a mystery, but I thought it might have prevented it from going OOM when working with files of any size other than fp8. I'm thinking that if you have plenty of memory, you don't need it.
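If the dtype is the open question, it can be checked without loading any weights, because a safetensors file starts with an 8-byte little-endian header length followed by a JSON header listing each tensor's dtype and shape. A minimal stdlib-only sketch follows; the function names and the demo file are hypothetical.

```python
import json
import os
import struct
import tempfile
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return header


def dtype_counts(path):
    """Count tensors per dtype string as recorded in the header."""
    return Counter(info["dtype"] for info in read_safetensors_header(path).values())


# Demo: write a tiny one-tensor file by hand, then inspect it.
header = {"time_in.in_layer.weight": {"dtype": "F16", "shape": [2], "data_offsets": [0, 4]}}
blob = json.dumps(header).encode("utf-8")
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.safetensors")
    with open(demo_path, "wb") as f:
        f.write(struct.pack("<Q", len(blob)) + blob + b"\x00" * 4)
    counts = dtype_counts(demo_path)
    print(dict(counts))  # {'F16': 1}
```

Running `dtype_counts` on the converted checkpoint would show whether it is fp8 or something larger before deciding whether the assertion matters.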

Edit:
I just re-uploaded it...I forgot that using subdirectories in HF makes the filenames look weird when downloading in a browser...
https://huggingface.co/datasets/John6666/twodgirl_diffusers_to_flux_script

very good that'll help others get started more easily thanks for that

Just wanted to correct my earlier statements. I thought ptx0 had just reposted the CFG 3.5 model, but it turns out they're completely different; I'm not sure how I got confused by the hashes. Anyway, I will be posting 3 versions on my profile here. I'd also like to understand the difference between the CFG 3.5 model and the v0.3 models published under the same name. My guess is that v0.3 is a derivative of the CFG 3.5 model, but if anyone can provide any insight it would be appreciated, and I would use it to update the model's page to explain that to others.

https://civitai.com/models/867689?modelVersionId=971049

the 3 versions will be
ptx0's v0.2 and v0.3 model
terminusresearch's CFG 3.5 model
I will be posting all 3 in GGUF Q8_0 format

I'm not talking about good or bad, I'm simply surprised that a forked version that inherits the version number can be uploaded elsewhere.🤔
I think it is unusual.

By the way, the reason why NF4 and GGUF quantized versions are or are not on Civitai was probably because quantization tools are not as popular as I thought.
I had assumed that someone had created them and made them popular.

Terminus Research Group org

terminusresearch = ptx0

I see. Then it's normal.

@bghira I also didn't know they were the same person, I assumed they were different people. As for version numbers, it's confusing that the Hugging Face version is called CFG 3.5 while the Civitai ones start at 0.1 and go up to 0.3.

I also generated 3 images with all 3 GGUF models and shared them here: https://civitai.com/posts/8078093

Terminus Research Group org

the v0.3 model here is a merge of v0.2 and v0.2-LoKr-1.6B

it was done incorrectly and last night was reuploaded here and to CivitAI

the rest are correct images
