---
license: apache-2.0
datasets:
- argilla/ultrafeedback-binarized-preferences-cleaned
language:
- en
base_model:
- mistralai/Mistral-7B-v0.1
library_name: transformers
tags:
- transformers
---
# Model Overview
- Model Name: ElEmperador
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e8ea3892d9db9a93580fe3/gkDcpIxRCjBlmknN_jzWN.png)
## Model Description:
ElEmperador is an ORPO-based finetune derived from the Mistral-7B-v0.1 base model.
## Evals:
BLEU: 0.209
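For readers unfamiliar with the metric, here is a minimal sketch of how a sentence-level BLEU score can be computed. It is an illustration only: this card does not state which BLEU implementation, tokenizer, or reference set produced the score above, so the whitespace tokenization, single reference, and lack of smoothing below are all assumptions, and the `bleu` helper is hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    # Count every contiguous n-gram in the token list
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    # Assumption: simple whitespace tokenization and one reference per candidate
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        # Clipped overlap: each candidate n-gram counts at most as often as in the reference
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            # Unsmoothed BLEU is zero as soon as one n-gram order has no match
            return 0.0
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty discourages candidates shorter than the reference
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the brevity penalty
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)
```

The score ranges from 0 to 1, with 1 meaning the candidate matches the reference exactly. Corpus-level tools typically add smoothing and standardized tokenization, so their numbers will differ from this sketch.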
## Inference Script:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def generate_response(model_name, input_text, max_new_tokens=50):
    # Load the tokenizer and model from the Hugging Face Hub
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Tokenize the input text (returns input_ids and attention_mask)
    inputs = tokenizer(input_text, return_tensors="pt")

    # Generate a response without tracking gradients
    with torch.no_grad():
        generated_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the generated tokens (prompt plus continuation) into text
    return tokenizer.decode(generated_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Set the model name from the Hugging Face Hub
    model_name = "AINovice2005/ElEmperador"
    input_text = "Hello, how are you?"

    # Generate and print the model's response
    output = generate_response(model_name, input_text)
    print(f"Input: {input_text}")
    print(f"Output: {output}")
```
## Results
First, ORPO is a viable preference-alignment algorithm for improving model performance alongside SFT fine-tuning. Second, it helps align the model's outputs more closely with human preferences, leading to more user-friendly and acceptable results.
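To make the ORPO objective concrete, the sketch below computes the loss for a single preference pair as described in the ORPO paper: a standard SFT negative log-likelihood on the chosen response plus a weighted odds-ratio penalty. This is not the training code used for this model. The function names, the scalar inputs (length-averaged log-probabilities of each response), and the default weight `lam=0.1` are assumptions chosen for illustration.

```python
import math


def log_odds(avg_logp):
    # log(p / (1 - p)) where p = exp(avg_logp); requires avg_logp < 0 so that p < 1
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp, rejected_avg_logp, lam=0.1):
    # SFT term: negative log-likelihood of the chosen (preferred) response
    sft_term = -chosen_avg_logp

    # Log odds ratio between the chosen and rejected responses
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)

    # Odds-ratio term: -log(sigmoid(x)) written as log(1 + exp(-x))
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))

    # lam is a hypothetical weighting between the two terms
    return sft_term + lam * odds_ratio_term
```

The loss shrinks when the model assigns a higher likelihood to the chosen response than to the rejected one, which is how a single training stage can cover both SFT and preference alignment without a separate reference model.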