Checkpoint pre-training data

#7
by cobquintero - opened

I am aware that for the checkpoints at the end of stage 1 and stage 2, all the data seen during training is known. Is there a way to find out, for each intermediate checkpoint, which specific data files the model has seen? For example, for stage1-step286000-tokens1200B, is there a way to see the data files the model has seen up to that point?
