Why might an instruct model not be ideal for further fine-tuning?
#8 by LucienShui - opened
Is there any reference?
Unfortunately, I don't have a direct reference off the top of my head. But let's consider three different methods for fine-tuning:
a: Finetune(new_data)
b: Finetune(instructions and new_data)
c: Finetune(instructions then new_data) <--- starting from instruct model
I would expect either a or b to yield the best results, depending on the task, although there is no guarantee. So starting from the base model gives you the most flexibility to try all of the different approaches and see which works best for you.
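To make the difference between the three options concrete, here is a rough sketch of the order in which each one would see its training examples. Everything in it is hypothetical and for illustration only: `build_schedules`, the placeholder example names, and the idea of representing a run as a plain list are assumptions, not part of any library or of the model's actual training recipe.

```python
import random


def build_schedules(instructions, new_data, seed=0):
    """Return the order of training examples for each strategy.

    Hypothetical helper: a real run would pass these examples to a trainer
    instead of just returning lists.
    """
    rng = random.Random(seed)

    # a: fine-tune the base model on the new data only.
    schedule_a = list(new_data)

    # b: fine-tune the base model on instructions and new data mixed together.
    schedule_b = list(instructions) + list(new_data)
    rng.shuffle(schedule_b)

    # c: instructions first, then new data. This is the order you get
    # when you start from a model that was already instruction-tuned.
    schedule_c = list(instructions) + list(new_data)

    return {"a": schedule_a, "b": schedule_b, "c": schedule_c}


# Placeholder examples standing in for real datasets.
instructions = [f"instr_{i}" for i in range(3)]
new_data = [f"new_{i}" for i in range(3)]

schedules = build_schedules(instructions, new_data)
for name, schedule in schedules.items():
    print(name, schedule)
```

With the base model you can build any of the three schedules. With the instruct model the instruction phase has already happened, so only c is available.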
I originally thought that less data meant less computing resources. Your point of view is also a valuable perspective. Thanks a lot. :)
LucienShui changed discussion status to closed