---
license: apache-2.0
datasets:
- argilla/ultrafeedback-binarized-preferences-cleaned
language:
- en
base_model:
- mistralai/Mistral-7B-v0.1
library_name: transformers
tags:
- transformers
- ORPO
- RLHF
- notus
- argilla
---
# Model Overview
## Model Name: ElEmperador
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e8ea3892d9db9a93580fe3/gkDcpIxRCjBlmknN_jzWN.png)
## Model Description:
ElEmperador is a fine-tune of the Mistral-7B-v0.1 base model, trained with ORPO (Odds Ratio Preference Optimization) on the argilla/ultrafeedback-binarized-preferences-cleaned preference dataset.
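You can try the model with a minimal 🤗 Transformers snippet like the one below. This is a sketch: the repo id is a placeholder for the actual Hub id, and the generation settings are illustrative defaults rather than recommended values.

```python
# Minimal inference sketch. The repo id below is a placeholder assumption;
# replace it with the model's actual Hugging Face Hub id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/ElEmperador"  # placeholder, not confirmed by this card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 7B model in bf16 fits on a single ~24 GB GPU
    device_map="auto",
)

prompt = "Explain the difference between SFT and ORPO in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```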
## Evals:
BLEU: 0.209
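For reference, a corpus-level BLEU score like the one above can be computed with the 🤗 `evaluate` library. The strings below are illustrative stand-ins; the card does not specify the evaluation set behind the 0.209 figure.

```python
# Sketch of a BLEU computation with the `evaluate` library; the example
# strings are placeholders, not the model's actual eval data.
import evaluate

bleu = evaluate.load("bleu")
predictions = ["the model generated this answer"]        # model outputs
references = [["the reference answer for this prompt"]]  # one or more references per output
print(bleu.compute(predictions=predictions, references=references)["bleu"])
```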
## Results
First, ORPO is a viable RLHF-style algorithm for improving model performance on top of SFT fine-tuning. Second, it helps align the model's outputs more closely with human preferences, leading to more user-friendly and acceptable results.
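
As a rough illustration of this recipe, the sketch below fine-tunes Mistral-7B-v0.1 with TRL's `ORPOTrainer` on the cited preference dataset. The hyperparameters and the dataset flattening are placeholder assumptions, not the exact settings used for ElEmperador.

```python
# ORPO training sketch with TRL; hyperparameters are illustrative assumptions,
# not the exact recipe behind ElEmperador.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Mistral has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_id)

# ORPOTrainer expects "prompt", "chosen", and "rejected" columns.
dataset = load_dataset("argilla/ultrafeedback-binarized-preferences-cleaned", split="train")

def to_plain(example):
    # Flatten the conversational chosen/rejected columns to plain strings
    # (assumes single-turn prompt/response pairs in the dataset).
    return {
        "prompt": example["prompt"],
        "chosen": example["chosen"][-1]["content"],
        "rejected": example["rejected"][-1]["content"],
    }

dataset = dataset.map(to_plain, remove_columns=dataset.column_names)

config = ORPOConfig(
    output_dir="elemperador-orpo",
    beta=0.1,                    # weight of the odds-ratio penalty (lambda in the ORPO paper)
    max_length=1024,
    max_prompt_length=512,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=8e-6,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` in older TRL releases
)
trainer.train()
```

Note that ORPO folds the preference signal into a single SFT-style loss via an odds-ratio penalty, so unlike DPO it needs no separate frozen reference model, which keeps memory usage close to plain supervised fine-tuning.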