AINovice2005 committed: Update README.md
ElEmperador is an ORPO-based finetune derived from the Mistral-7B-v0.1 base model.
The argilla/ultrafeedback-binarized-preferences-cleaned dataset was used for training, though only a small portion of it was used due to GPU constraints.
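For context, ORPO preference tuning of this kind is commonly wired up with TRL's `ORPOTrainer`. The sketch below is illustrative only: the card does not show the actual training script, data fraction, or hyperparameters, so the 5% slice, `beta` value, and output directory are assumptions.

```python
# Illustrative sketch only -- not the actual ElEmperador training script.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "mistralai/Mistral-7B-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# A split slice keeps only a small portion of the preference pairs,
# mirroring the GPU-constrained subset described above (5% is a guess).
dataset = load_dataset(
    "argilla/ultrafeedback-binarized-preferences-cleaned", split="train[:5%]"
)

# beta weighs the odds-ratio term against the NLL term (lambda in the ORPO paper).
args = ORPOConfig(output_dir="elemperador-orpo", beta=0.1, per_device_train_batch_size=1)
trainer = ORPOTrainer(model=model, args=args, train_dataset=dataset, tokenizer=tokenizer)
trainer.train()
```

Note that recent TRL releases pass the tokenizer as `processing_class` rather than `tokenizer`.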
# Evals:
BLEU: 0.209
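The card does not say how BLEU was computed (a standard tool such as sacrebleu is typical). As an illustration of what the score measures, here is a minimal single-reference, sentence-level BLEU sketch with uniform 4-gram weights:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(candidate, reference, max_n=4):
    """BLEU with uniform n-gram weights, one reference, and a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped counts: each candidate n-gram scores at most its reference count.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # any empty n-gram level zeroes the geometric mean
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)
```

Production evaluations use corpus-level BLEU with standardized tokenization, so numbers from this sketch are not directly comparable to the 0.209 reported above.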
## Inference Script:
```python
if __name__ == "__main__":
    ...  # rest of the script omitted in this excerpt
    print(f"Input: {input_text}")
    print(f"Output: {output}")
```
# Results

ORPO is a viable RLHF algorithm for improving model performance alongside SFT finetuning. It also helps align the model's outputs more closely with human preferences, leading to more user-friendly and acceptable results.
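To make the mechanism concrete, here is a minimal sketch of the ORPO objective: the usual NLL loss on the chosen response plus a weighted log-odds-ratio penalty that pushes the chosen response's odds above the rejected one's. The length-normalized log-probabilities below are illustrative scalars, not values taken from ElEmperador.

```python
import math

def log_odds(avg_logprob):
    """Log odds of a response given its length-normalized log-probability."""
    p = math.exp(avg_logprob)  # assumes avg_logprob < 0, so 0 < p < 1
    return math.log(p / (1 - p))

def orpo_loss(chosen_avg_logprob, rejected_avg_logprob, lam=0.1):
    """NLL on the chosen response plus the weighted odds-ratio term."""
    nll = -chosen_avg_logprob
    ratio = log_odds(chosen_avg_logprob) - log_odds(rejected_avg_logprob)
    # -log sigmoid(ratio): shrinks as the chosen response becomes
    # much more likely than the rejected one.
    or_term = -math.log(1 / (1 + math.exp(-ratio)))
    return nll + lam * or_term
```

Because the penalty is added to the SFT loss rather than replacing it, a single ORPO pass both finetunes and aligns the model, with no separate reward model or reference model.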
|