deepspeed / accelerate
Do you mind sharing the deepspeed/accelerate config? I tried to replicate this training on 8x H100s but hit OOM :D
This model was trained on a single H100 using QLoRA. I am currently doing a LoRA run on 5x A100s to fix the prompting issues (this one is a mix of ChatML and Mistral formats), so be careful with that prompt setup if finetuning a similar model.
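For context, a single-GPU QLoRA run of this kind can be expressed in an axolotl config along these lines. This is only a minimal sketch: the base model name, dataset path, and hyperparameters are illustrative assumptions, not the actual values used for this model.

```yaml
# Minimal QLoRA sketch for a single H100 (illustrative values, not the real training config)
base_model: mistralai/Mistral-7B-v0.1  # assumption: swap in your own base model
load_in_4bit: true          # 4-bit base weights, the "Q" in QLoRA
adapter: qlora
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true    # attach adapters to all linear layers
datasets:
  - path: your_dataset.jsonl  # assumption: placeholder path
    type: sharegpt
sequence_len: 4096
micro_batch_size: 1
gradient_accumulation_steps: 4
gradient_checkpointing: true
optimizer: paged_adamw_8bit
bf16: true
flash_attention: true
```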
To get the 5x A100s working I used https://github.com/OpenAccess-AI-Collective/axolotl/blob/main/deepspeed_configs/zero3_bf16.json and enabled ZeRO-3 init (the `accelerate config` dialog asks about it); nothing else special.
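If you want to skip the dialog, its answers end up in an accelerate YAML roughly like the sketch below. The paths and process count are assumptions for a single machine with 5 GPUs; precision settings come from the linked zero3_bf16.json rather than the accelerate file.

```yaml
# Sketch of the accelerate default_config.yaml for this setup (assumed values)
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  deepspeed_config_file: deepspeed_configs/zero3_bf16.json  # the axolotl config linked above
  zero3_init_flag: true  # the "zero init" question in the `accelerate config` dialog
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
num_machines: 1
num_processes: 5  # one process per A100
rdzv_backend: static
same_network: true
use_cpu: false
```

With that in place you launch training through `accelerate launch` as usual.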
Out of curiosity I also tried it on 8x 40GB A100s and could not get any configuration to work.
I also posted a bit about it here: https://twitter.com/Alice_comfy/status/1756675929181467098
Confirmed. Works on 0.4.0.