|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- argilla/ultrafeedback-binarized-preferences-cleaned |
|
language: |
|
- en |
|
base_model: |
|
- mistralai/Mistral-7B-v0.1 |
|
library_name: transformers |
|
tags: |
|
- transformers |
|
- ORPO |
|
- RLHF |
|
- notus |
|
- argilla |
|
--- |
|
|
|
# Model Overview |
|
|
|
# Model Name: ElEmperador
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e8ea3892d9db9a93580fe3/gkDcpIxRCjBlmknN_jzWN.png) |
|
|
|
|
|
## Model Description: |
|
|
|
ElEmperador is a finetune of the Mistral-7B-v0.1 base model, trained with ORPO (Odds Ratio Preference Optimization).
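
A minimal usage sketch with the `transformers` library is shown below. The repository id is a placeholder and should be replaced with this model's actual Hub path.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- replace with this model's actual Hugging Face Hub path.
model_id = "your-org/ElEmperador"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain ORPO in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```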
|
|
|
|
|
## Evals: |
|
- BLEU: 0.209
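
The exact evaluation setup is not described in this card. As a rough sketch, a BLEU score like the one above could be computed with the Hugging Face `evaluate` library; the prediction and reference lists here are hypothetical placeholders.

```python
import evaluate

bleu = evaluate.load("bleu")

# Hypothetical placeholders: model generations and reference answers
# from a held-out evaluation split.
predictions = ["The model generated this answer."]
references = [["The reference answer for the same prompt."]]

results = bleu.compute(predictions=predictions, references=references)
print(results["bleu"])
```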
|
|
|
|
|
|
|
## Results |
|
|
|
First, ORPO is a viable RLHF-style preference-alignment algorithm for improving model performance alongside SFT finetuning. Second, it helps align the model's outputs more closely with human preferences, leading to more user-friendly and acceptable results.
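
The training code is not included in this card. Below is a minimal sketch of how an ORPO finetune like this could be reproduced with the TRL library's `ORPOTrainer`, using the preference dataset listed in the card metadata; the hyperparameters are hypothetical and not taken from the actual training run.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_model = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Preference dataset listed in the card metadata; it provides
# "prompt", "chosen", and "rejected" columns.
dataset = load_dataset(
    "argilla/ultrafeedback-binarized-preferences-cleaned", split="train"
)

# Hypothetical hyperparameters -- not taken from the actual training run.
config = ORPOConfig(
    output_dir="elemperador-orpo",
    beta=0.1,                      # weight of the odds-ratio penalty term
    learning_rate=5e-6,
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,    # older TRL versions use tokenizer= instead
)
trainer.train()
```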