Huge memory consumption with SD3.5-medium
#18 · opened by oddball516
According to the chart here, SD3.5-medium should run fine on 10 GB of VRAM:
https://stability.ai/news/introducing-stable-diffusion-3-5
However, my test program fails on a g4dn.xlarge AWS instance (4 vCPUs / 16 GB RAM + 48 GB swap, with a Tesla T4 GPU that has 16 GB of VRAM). It runs out of memory because CUDA cannot allocate any more; nvidia-smi shows ~15 GB already in use, and it can't complete even one picture.
I'm wondering what's wrong here?
Attached the full source code.
import os
import json
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("./stable-diffusion-3.5-medium/")
if torch.cuda.is_available():
    print('use cuda')
    pipe = pipe.to("cuda")
elif torch.backends.mps.is_available():
    print('use mps')
    pipe = pipe.to('mps')
else:
    print('use cpu')

with open('data.json', 'r') as f:
    data = json.load(f)

os.makedirs('output', exist_ok=True)
for row in data:
    prompt = '%s, style is %s, light is %s' % (row['prompt'], row['style'], row['light'])
    filename = 'output/%s.png' % (row['uuid'])
    # Default to a 1280x1280 square; adjust for landscape/portrait.
    height = 1280
    width = 1280
    if row['aspect_ratio'] == '16:9':
        height = 720
    elif row['aspect_ratio'] == '9:16':
        width = 720
    print('saving', filename)
    image = pipe(prompt, height=height, width=width).images[0]
    image.save(filename)
Did it resolve for you?
@yue32000
@oddball516
The reason is the T5 text encoder; you can resolve it with
pipe.enable_model_cpu_offload()
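For reference, a minimal sketch of how the two usual memory fixes fit together, based on the original script's local model path (loading in fp16 is an additional assumption here, not something from the thread; it halves the weight footprint on top of the offloading):

```python
import torch


def load_low_vram_pipe(model_path="./stable-diffusion-3.5-medium/"):
    """Load SD3.5-medium so it fits on a ~16 GB GPU like the T4."""
    # Imported inside the function: requires the diffusers and accelerate packages.
    from diffusers import DiffusionPipeline

    # fp16 weights take half the memory of the default fp32 load.
    pipe = DiffusionPipeline.from_pretrained(model_path, torch_dtype=torch.float16)

    # Keep each submodule (notably the large T5 text encoder) in system RAM
    # and move it to the GPU only for the duration of its forward pass.
    # Note: do NOT also call pipe.to("cuda") -- offloading manages placement.
    pipe.enable_model_cpu_offload()
    return pipe


if __name__ == "__main__":
    pipe = load_low_vram_pipe()
    image = pipe("a photo of a cat", height=1280, width=720).images[0]
    image.save("cat.png")
```

With offloading enabled, the GPU only ever holds one submodule at a time, at the cost of some speed from the CPU-to-GPU transfers on each step.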