Difference between SFT and init models

#1
by HyeongSoo - opened

I was deeply impressed by your MiniLLM paper, and I’m reproducing it using the models you released.
Could you tell me the difference between the sft and init models? They both appear to be supervised fine-tuned models, so I’m wondering if there’s any difference.

MiniLLM org

Hi! Thanks for your interest in our work!

The SFT and init models are both supervised fine-tuned models. The difference lies in how the checkpoint is selected: among the checkpoints evaluated on the validation set at the end of each epoch, the SFT models are those with the highest ROUGE-L scores, while the init models are those with the lowest validation loss. We find that selecting the init models by validation loss works better for the subsequent RL-like MiniLLM training.
More details can be found in Appendix B.1 in our paper.
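
To make the two selection rules concrete, here is a minimal Python sketch. It is not the code from the MiniLLM repository: the `EpochEval` record, the `select_checkpoints` helper, and all the numbers are hypothetical, made up only to show how the same set of per-epoch evaluations can yield two different checkpoints.

```python
from dataclasses import dataclass


@dataclass
class EpochEval:
    """Hypothetical record of one end-of-epoch validation run."""
    epoch: int
    val_loss: float   # validation loss (lower is better)
    rouge_l: float    # validation ROUGE-L (higher is better)


def select_checkpoints(evals):
    """Return (sft, init) according to the two criteria described above."""
    # "SFT" checkpoint: highest validation ROUGE-L.
    sft = max(evals, key=lambda e: e.rouge_l)
    # "init" checkpoint: lowest validation loss.
    init = min(evals, key=lambda e: e.val_loss)
    return sft, init


# Made-up numbers for illustration only; not results from the paper.
evals = [
    EpochEval(epoch=1, val_loss=2.10, rouge_l=24.0),
    EpochEval(epoch=2, val_loss=1.95, rouge_l=26.5),
    EpochEval(epoch=3, val_loss=2.02, rouge_l=27.8),
]

sft, init = select_checkpoints(evals)
print(f"SFT checkpoint: epoch {sft.epoch}, init checkpoint: epoch {init.epoch}")
```

In this toy example the two criteria disagree: ROUGE-L keeps improving after the validation loss has started to rise, so the SFT and init checkpoints come from different epochs.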

Thank you for your kind answer.
