Fine-tuning question

#1
by Isotonic - opened

Should I use DataCollatorForPermutationLanguageModeling when trying to fine-tune this model?
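
For reference, this is roughly the collator setup I would otherwise use for XLNet-style permutation LM (a minimal sketch; the checkpoint name is a placeholder, and I'm not sure this model's tokenizer exposes the mask token the collator expects):

```python
from transformers import AutoTokenizer, DataCollatorForPermutationLanguageModeling

# Placeholder checkpoint; swap in the actual model from this repo.
tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-10b", trust_remote_code=True)

# Standard permutation-LM collator settings (transformers defaults).
collator = DataCollatorForPermutationLanguageModeling(
    tokenizer=tokenizer,
    plm_probability=1 / 6,
    max_span_length=5,
)
```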

Is the fine-tuning process similar to XLNet, with [sMASK] and [gMASK] added, or do we have to use AutoModelForSeq2SeqLM? (The code below is from the GLM repo.)

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Checkpoint name is a placeholder for this model; both classes need trust_remote_code.
tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-10b", trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained("THUDM/glm-10b", trust_remote_code=True).half().cuda()

# Fill each [MASK] with the given target and compute the blank-infilling loss.
inputs = tokenizer(
    ["Tsinghua University is located in [MASK].", "One minus one equals zero, is it correct? Answer: [MASK]"],
    return_tensors="pt", padding=True)
inputs = tokenizer.build_inputs_for_generation(inputs, targets=["Beijing", "No"], max_gen_length=8, padding=False)
inputs = inputs.to('cuda')
outputs = model(**inputs)
loss = outputs.loss
logits = outputs.logits
```
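
For context, here is roughly how I imagine turning that snippet into a single training step (a minimal sketch; AdamW and the learning rate are placeholder choices on my side, not something from the GLM repo):

```python
import torch

# Placeholder optimizer and learning rate, just for illustration.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

model.train()
outputs = model(**inputs)   # same batch as above
outputs.loss.backward()     # backprop through the blank-infilling loss
optimizer.step()
optimizer.zero_grad()
```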

Do you have any open-ended generation fine-tuning setup? Something like the sketch below is what I have in mind.
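
This reuses build_inputs_for_generation with longer targets; the [gMASK] prompt and the example prompt/target pair are my assumptions, not something taken from the repo:

```python
# Assumed setup: a [gMASK] prompt followed by a free-form target completion.
prompts = ["Write a one-sentence summary of GLM. [gMASK]"]
targets = ["GLM is a general language model pretrained with autoregressive blank infilling."]

batch = tokenizer(prompts, return_tensors="pt", padding=True)
batch = tokenizer.build_inputs_for_generation(batch, targets=targets, max_gen_length=64, padding=False)
batch = batch.to("cuda")
loss = model(**batch).loss  # then backward/step as in the sketch above
```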
