AINovice2005 committed (verified)
Commit 024f82b · 1 Parent(s): 5a23691

Update README.md

Files changed (1): README.md (+1 -7)
README.md CHANGED
@@ -23,20 +23,14 @@ ElEmperador is an ORPO-based finetune derived from the Mistral-7B-v0.1 base mod
 
 The argilla/ultrafeedback-binarized-preferences-cleaned dataset was used, although only a small portion was used due to GPU constraints.
 
-## Citation
-
-[Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023). QLoRA: Efficient Finetuning of Quantized LLMs.](https://arxiv.org/abs/2305.14314)
-
-
 # Evals:
 BLEU: 0.209
 
-# Conclusion and Model Recipe
+# Conclusion
 
 ORPO is a viable RLHF algorithm that improves model performance over SFT finetuning alone. It also helps align the model's outputs more closely with human preferences,
 leading to more user-friendly and acceptable results.
 
-The model recipe: [https://github.com/ParagEkbote/El-Emperador_ModelRecipe]
 
 ## Inference Script:
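
For reference, a minimal sketch of what an ORPO fine-tuning run of this kind might look like, assuming TRL's ORPOTrainer, the Mistral-7B-v0.1 base model, and a small slice of the argilla/ultrafeedback-binarized-preferences-cleaned dataset. The output directory, dataset slice, and hyperparameters are illustrative assumptions, not values taken from this commit.

```python
# Sketch of an ORPO fine-tune (assumed setup, not the author's actual training script).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_model = "mistralai/Mistral-7B-v0.1"
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Mistral has no pad token by default

# Only a small slice of the preference data, mirroring the GPU constraints noted above.
train_dataset = load_dataset(
    "argilla/ultrafeedback-binarized-preferences-cleaned", split="train[:1%]"
)

config = ORPOConfig(
    output_dir="elemperador-orpo",   # hypothetical output directory
    beta=0.1,                        # ORPO odds-ratio loss weighting (illustrative)
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    learning_rate=8e-6,
    max_length=1024,
    max_prompt_length=512,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,     # expects prompt / chosen / rejected columns
    processing_class=tokenizer,      # `tokenizer=` in older TRL releases
)
trainer.train()
```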
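The hunk cuts off at the Inference Script heading, so the script itself is not visible in this diff. A hedged sketch of a basic generation script with the transformers library follows; the repo id is a placeholder assumption, not confirmed by this commit.

```python
# Sketch of an inference script (placeholder repo id -- substitute the published
# ElEmperador checkpoint).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AINovice2005/ElEmperador"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain ORPO fine-tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Short greedy generation; decoding settings are illustrative.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```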