Difference between SFT and init models

by HyeongSoo - opened 7 days ago

7 days ago

I was deeply impressed by your MiniLLM paper, and I'm reproducing it using the models you released.
Could you tell me the difference between the SFT and init models? Both appear to be supervised fine-tuned models, so I'm wondering how they differ.

t1101675

MiniLLM org 5 days ago

Hi! Thanks for your interest in our work!

The SFT and init models are both supervised fine-tuned models. The difference is in how the checkpoint is selected: we evaluate on the validation set at the end of each epoch, and the SFT models are the checkpoints with the highest ROUGE-L scores, while the init models are the checkpoints with the lowest validation loss. We find that selecting the init models by validation loss works better for the subsequent RL-like MiniLLM training.
More details can be found in Appendix B.1 in our paper.
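
For anyone reproducing this, the two selection rules can be sketched roughly as below. This is only an illustration of the idea, not the actual MiniLLM training code: the `EpochEval` record, the `select_checkpoints` helper, and all the numbers are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class EpochEval:
    """Validation results recorded at the end of one SFT epoch (hypothetical record)."""
    epoch: int
    val_loss: float
    rouge_l: float


def select_checkpoints(evals):
    """Pick two checkpoints from the same SFT run using different criteria."""
    # "SFT" model: the epoch with the highest validation ROUGE-L.
    sft = max(evals, key=lambda e: e.rouge_l)
    # "init" model: the epoch with the lowest validation loss.
    init = min(evals, key=lambda e: e.val_loss)
    return sft, init


# Made-up numbers, chosen so that the two criteria disagree.
history = [
    EpochEval(epoch=1, val_loss=2.10, rouge_l=24.0),
    EpochEval(epoch=2, val_loss=1.95, rouge_l=26.5),
    EpochEval(epoch=3, val_loss=2.02, rouge_l=27.8),
]

sft_ckpt, init_ckpt = select_checkpoints(history)
print("SFT checkpoint epoch:", sft_ckpt.epoch)
print("init checkpoint epoch:", init_ckpt.epoch)
```

With these illustrative values the two rules pick different epochs, which is why the SFT and init checkpoints can differ even though they come from the same kind of supervised fine-tuning.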

HyeongSoo

5 days ago

Thank you for your kind answer.
