AINovice2005
committed on
Update README.md
README.md
CHANGED
@@ -23,7 +23,6 @@ tags:
 ElEmperador is an ORPO-based finetune derived from the Mistral-7B-v0.1 base model.
 
-
 The 'ultrafeedback-binarized-preferences-cleaned' dataset was used for training, albeit a small portion was used due to GPU constraints.
 
 ## Evals:
 BLEU: 0.209
@@ -62,5 +61,6 @@ if __name__ == "__main__":
 
 ## Results
 
-ORPO is a viable RLHF algorithm to improve the performance of your models along with SFT finetuning.
+Firstly, ORPO is a viable RLHF algorithm to improve the performance of your models along with SFT finetuning. Secondly, it also helps in aligning the model's outputs more closely with human preferences,
 leading to more user-friendly and acceptable results.
+
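The Results section describes ORPO as a preference-optimization method used alongside SFT. As a rough illustration only (not this repository's training code), the odds-ratio penalty that ORPO adds on top of the SFT loss can be sketched in plain Python; `p_chosen` and `p_rejected` are hypothetical stand-ins for the model's sequence-level probabilities of the preferred and rejected completions:

```python
import math


def _odds(p: float) -> float:
    # Odds of generating a sequence with probability p in (0, 1): p / (1 - p).
    return p / (1.0 - p)


def orpo_penalty(p_chosen: float, p_rejected: float) -> float:
    """Sketch of ORPO's odds-ratio term: -log sigmoid(log OR(chosen, rejected)).

    Assumes p_chosen / p_rejected are sequence probabilities in (0, 1); the
    SFT cross-entropy term and its weighting factor are omitted here.
    """
    log_odds_ratio = math.log(_odds(p_chosen)) - math.log(_odds(p_rejected))
    # -log(sigmoid(x)) == log(1 + exp(-x)); the penalty shrinks as the
    # chosen completion becomes more likely than the rejected one.
    return math.log1p(math.exp(-log_odds_ratio))
```

During fine-tuning, minimizing this term pushes probability mass toward preferred completions while the accompanying SFT term keeps the model close to the supervised data, which is what allows ORPO to run as a single-stage alternative to separate SFT + RLHF phases.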
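The Evals section reports BLEU = 0.209. For readers unfamiliar with the metric, here is a minimal, self-contained sketch of sentence-level BLEU (uniform 4-gram weights, crude add-epsilon smoothing, whitespace tokenization); it is illustrative only and is not the evaluation script behind the reported number:

```python
import math
from collections import Counter


def _ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    # All contiguous n-grams of the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sentence_bleu(reference: str, hypothesis: str, max_n: int = 4) -> float:
    """Minimal BLEU: geometric mean of clipped n-gram precisions times a
    brevity penalty. Whitespace tokenization; epsilon smoothing avoids
    log(0) when a higher-order n-gram has no match."""
    ref, hyp = reference.split(), hypothesis.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(_ngrams(hyp, n))
        ref_counts = Counter(_ngrams(ref, n))
        overlap = sum((hyp_counts & ref_counts).values())  # clipped matches
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)
    log_precision = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: 1 for hypotheses at least as long as the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(log_precision)
```

A score of 0.209 therefore means the model's outputs share roughly a fifth of their (length-penalized, geometric-mean) n-gram overlap with the references; production evaluations typically use a maintained implementation such as sacreBLEU rather than a hand-rolled one.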