Does the model support Float16 and Bfloat16 precision?

#12
by AoZhang - opened

Thanks for the excellent work!
The model works well in float32. When I convert the model weights to float16/bfloat16, however, the output becomes weird. Is this expected?

The base model (olmoe) was trained in bf16, so I think this one should also work in bf16. Maybe @chrisc36 knows more.

I've successfully used it with load_in_4bit=True. I didn't measure the performance drop, though.

I'm having the same problem: I get nonsense outputs when I load the model in float16 or bfloat16.

Hey @etoml, I have reproduced this issue, and yes, the model gives wrong outputs when it is loaded in float16/bf16.
My investigation showed significant precision loss affecting the model's features. After debugging this extensively, my hypothesis is that the cast to bfloat16 blurs the distinction between different visual elements and loses some fine-grained details entirely.

Input dtypes before processing:
input_ids: torch.int64
images: torch.float32
image_input_idx: torch.int32
image_masks: torch.float32
Image tensor range: [-1.792, 2.066]

Input dtypes after processing:
input_ids: torch.int64
images: torch.bfloat16
Image tensor range: [-1.789, 2.062]
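
For reference, the shift in the image tensor range above looks consistent with ordinary bfloat16 rounding. bfloat16 keeps only 8 significant bits, so values between 1 and 2 land on a grid of 2^-7 (about 0.0078) and values between 2 and 4 on a grid of 2^-6 (about 0.0156). Below is a minimal sketch in pure Python that emulates this rounding. Note that round_to_bfloat16 is a hypothetical helper written only for illustration (it is not part of torch or the model code), it assumes round-to-nearest-even, and it ignores NaN/inf. It only shows the size of the per-value rounding error; it does not by itself explain the nonsense outputs.

```python
import struct


def round_to_bfloat16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even).

    Hypothetical illustration helper: bfloat16 is the upper 16 bits of
    an IEEE float32, so we round away the lower 16 bits.
    NaN and infinity are not handled.
    """
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add a rounding bias: 0x7FFF, plus 1 if the lowest kept bit is set,
    # which gives round-to-nearest with ties going to even.
    bits += 0x7FFF + ((bits >> 16) & 1)
    # Drop the 16 low bits that bfloat16 cannot store.
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# The two range endpoints from the float32 log above.
for value in (-1.792, 2.066):
    rounded = round_to_bfloat16(value)
    print(f"{value} -> {rounded} (error {abs(value - rounded):.4f})")
```

The rounded endpoints come out as -1.7890625 and 2.0625, which match the [-1.789, 2.062] range printed after processing.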
