How much VRAM?

by tmanzz - opened

How much VRAM is meant to be used? I am seeing close to 10GB!
Any optimization recommendations?

I had a look into this, but I can't make it work. My GPU is a T4 (16GB).

This is what my pipeline looks like:

            quantization_config = BitsAndBytesConfig(load_in_8bit=True)
            text_encoder = T5EncoderModel.from_pretrained(
            self.pipeline = StableDiffusion3Pipeline.from_pretrained(model_name, cache_dir="./cache", 

Any suggestions?

The error I get is, for example:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 252.00 MiB. GPU

Works fine on my old GTX 1080 with 8 GB VRAM.

pipeline = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3-medium-diffusers", text_encoder_3=None, tokenizer_3=None,  torch_dtype=torch.float16).to('cuda')

@KernelDebugger - that works! Genius! I wonder what the text_encoder and tokenizer do if they can be set to 'None'?

@KernelDebugger "Dropping the T5 Text Encoder" on its own didn't work for me. Combining it with "Model Offloading" works on my 8GB GTX 1070 Ti.

pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3-medium-diffusers", text_encoder_3=None, tokenizer_3=None, torch_dtype=torch.float16)
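
For anyone trying the same combination, here is a minimal sketch of dropping the T5 encoder and enabling model offloading together. It assumes the `enable_model_cpu_offload()` method from `diffusers` (which needs `accelerate` installed); the helper names `build_load_kwargs` and `load_sd3_low_vram`, the prompt, and the output filename are made up for illustration and are not from this thread.

```python
MODEL_ID = "stabilityai/stable-diffusion-3-medium-diffusers"


def build_load_kwargs(drop_t5=True):
    """Extra keyword arguments for StableDiffusion3Pipeline.from_pretrained.

    Hypothetical helper: with drop_t5=True the third (T5) text encoder and
    its tokenizer are not loaded at all.
    """
    kwargs = {}
    if drop_t5:
        kwargs["text_encoder_3"] = None
        kwargs["tokenizer_3"] = None
    return kwargs


def load_sd3_low_vram(model_id=MODEL_ID, drop_t5=True):
    """Load the pipeline in float16 and enable model CPU offloading."""
    # Imported here so the helper above can be used without a GPU stack.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        model_id, torch_dtype=torch.float16, **build_load_kwargs(drop_t5)
    )
    # No .to("cuda") here: with offloading enabled, each component is moved
    # to the GPU only while it is needed and back to the CPU afterwards.
    pipe.enable_model_cpu_offload()
    return pipe


# Example usage (downloads the model and needs a CUDA GPU):
#   pipe = load_sd3_low_vram()
#   image = pipe("a photo of a cat").images[0]
#   image.save("sd3_offload.png")
```

The trade-off is speed: offloading lowers peak VRAM at the cost of extra transfers between CPU and GPU, so generation is slower than with the whole pipeline kept on the GPU.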

text_encoder_3 and tokenizer_3 are optional. The model also includes lighter-weight text encoders, and the pipeline falls back to using those when text_encoder_3=None.
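
To get a feel for why dropping that one encoder frees so much memory, a rough back-of-the-envelope estimate can help. The parameter counts below are approximate figures assumed for illustration, not numbers from this thread, and the estimate covers weights only (no activations, CUDA context, or allocator overhead), so actual usage will be higher.

```python
# Approximate parameter counts in billions, keyed by pipeline component name.
# These are assumed, rounded figures for illustration only.
APPROX_PARAMS_BILLIONS = {
    "transformer": 2.0,      # the diffusion model itself
    "text_encoder": 0.12,    # smaller CLIP text encoder
    "text_encoder_2": 0.69,  # larger CLIP text encoder
    "text_encoder_3": 4.7,   # T5 encoder, the one set to None above
    "vae": 0.08,
}


def weight_gib(params_billions, bytes_per_param=2):
    """Memory for the weights alone, in GiB (2 bytes per parameter = float16)."""
    return params_billions * 1e9 * bytes_per_param / 2**30


def total_weight_gib(components, skip=(), bytes_per_param=2):
    """Sum the weight memory of all components whose name is not in `skip`."""
    return sum(
        weight_gib(params, bytes_per_param)
        for name, params in components.items()
        if name not in skip
    )


if __name__ == "__main__":
    full = total_weight_gib(APPROX_PARAMS_BILLIONS)
    no_t5 = total_weight_gib(APPROX_PARAMS_BILLIONS, skip=("text_encoder_3",))
    print(f"float16 weights, all encoders: {full:.1f} GiB")
    print(f"float16 weights, without T5:   {no_t5:.1f} GiB")
```

Under these assumptions the T5 encoder accounts for well over half of the weight memory, which would be consistent with the pipeline fitting on 8 GB cards once it is dropped.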
