CARL Fine-tuning
Dear community,
What structure should the dataset have for fine-tuning?
For example, is plain text in a .txt file with the following structure suitable?
User:
My wife and mother are at loggerheads ...
CARL:
What you're describing is what psychologists call triangulation...
.....
Or is the following structure in a .csv file better?
ID,Type,Utterance,Dialog_Act
27_0,T,"Okay, I want to thank you for your participation so far in this intake...
27_1,P,yeah.,gc
27_2,T,Are you adopted?,ynq
27_3,P,No.,yna
Or do you have any other recommendations?
@mehdi1964 If you are planning to fine-tune the existing Carl model, it is better to stick with the format { "from": "human", "value": "xxx...." }, { "from": "gpt", "value": "xxx..." }
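For anyone starting from the User:/CARL: text layout in the question, here is a minimal sketch of how it could be converted into the "from"/"value" turns suggested above. The function name parse_transcript and the mapping of "User:" to "human" and "CARL:" to "gpt" are my own assumptions for illustration, not something the Carl training code is confirmed to require.

```python
import json

# Assumed mapping from the speaker labels in the .txt example to the
# "from" values suggested in the reply above.
ROLE_BY_LABEL = {"User:": "human", "CARL:": "gpt"}


def parse_transcript(text):
    """Turn 'User:' / 'CARL:' labelled text into a list of turn dicts.

    Hypothetical helper: a label must sit alone on its own line, and
    everything up to the next label belongs to that speaker.
    """
    turns = []
    role, buffer = None, []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in ROLE_BY_LABEL:
            # A new speaker label closes the previous turn, if any.
            if role is not None and buffer:
                turns.append({"from": role, "value": "\n".join(buffer).strip()})
            role, buffer = ROLE_BY_LABEL[stripped], []
        elif role is not None:
            buffer.append(line)
    # Flush the last turn once the text ends.
    if role is not None and buffer:
        turns.append({"from": role, "value": "\n".join(buffer).strip()})
    return turns


sample = """User:
My wife and mother are at loggerheads ...
CARL:
What you're describing is what psychologists call triangulation...
"""

turns = parse_transcript(sample)
print(json.dumps(turns, indent=2))
```

Lines that appear before the first speaker label are ignored, so a header or stray note at the top of the file would not end up in the dataset.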
@ajibawa-2023
Which file extension should I use to save the text (*.txt or *.csv)?
Does it matter at all?
.json or .jsonl are preferred.
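As a rough illustration of the .jsonl option, the sketch below writes one JSON object per line and reads the file back. The top-level "conversations" key, the helper names write_jsonl and read_jsonl, and the output file name are assumptions on my part; check what your training script actually expects before relying on them.

```python
import json
import os
import tempfile


def write_jsonl(records, path):
    """Write each record as one JSON object per line (the JSON Lines layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# One training example per line; the "conversations" wrapper key is assumed.
records = [
    {
        "conversations": [
            {"from": "human", "value": "xxx...."},
            {"from": "gpt", "value": "xxx..."},
        ]
    }
]

# Hypothetical output location in the system temp directory.
path = os.path.join(tempfile.gettempdir(), "carl_finetune_sample.jsonl")
write_jsonl(records, path)
loaded = read_jsonl(path)
print(loaded == records)
```

A plain .json file would instead hold the whole list as a single array; .jsonl is convenient for large datasets because it can be read one line at a time.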