I guess there are issues with the token_type_ids.
I had to edit grpo_trainer at line 421 to make it work:
if "token_type_ids" in prompt_inputs: del prompt_inputs["token_type_ids"]
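For anyone who wants to try the same workaround without relying on the exact line number, here is a minimal sketch of the idea. It assumes `prompt_inputs` behaves like a dict (a plain dict stands in for the tokenizer output here), and `drop_token_type_ids` is a hypothetical helper name, not something from the trainer itself.

```python
def drop_token_type_ids(prompt_inputs):
    """Remove "token_type_ids" in place if present (hypothetical helper).

    Assumes prompt_inputs is dict-like; does nothing when the key is absent.
    """
    prompt_inputs.pop("token_type_ids", None)
    return prompt_inputs


# Stand-in for tokenizer output; in the trainer these would be tensors.
prompt_inputs = {
    "input_ids": [[101, 2023, 102]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],
}
cleaned = drop_token_type_ids(prompt_inputs)
print(sorted(cleaned.keys()))
```

Using `pop` with a default does the same thing as the `if ... del` one-liner, it just avoids the separate membership check.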