Update README.md
README.md CHANGED
@@ -5,7 +5,11 @@ datasets:
 pipeline_tag: text-classification
 ---
 
-**Paper:** Coming soon
+- **Paper:** Coming soon
+
+- **Model:** [URM-LLaMa-3.1-8B](https://huggingface.co/LxzGordon/URM-LLaMa-3.1-8B)
+
+- Fine-tuned from [Skywork-Reward-Llama-3.1-8B](https://huggingface.co/Skywork/Skywork-Reward-Llama-3.1-8B)
 
 # Architecture
 <div align=center>
@@ -15,7 +19,7 @@ URM is one of the RMs in the figure.
 
 # Brief
 
-[URM-
+[URM-LLaMa-3.1-8B](https://huggingface.co/LxzGordon/URM-LLaMa-3.1-8B) is an uncertainty-aware reward model.
 This RM consists of a base model and an uncertainty-aware and attribute-specific value head. The base model of this RM is from [Skywork-Reward-Llama-3.1-8B](https://huggingface.co/Skywork/Skywork-Reward-Llama-3.1-8B).
 
 URM involves two-stage training: 1. **attributes regression** and 2. **gating layer learning**.
@@ -39,7 +43,7 @@ During this process, the value head and base model are kept frozen.
 import torch
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
 
-model_name = "LxzGordon/URM-
+model_name = "LxzGordon/URM-LLaMa-3.1-8B"
 model = AutoModelForSequenceClassification.from_pretrained(
     model_name,
     device_map='auto',
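The first stage named in the diff, **attributes regression**, trains the value head to regress per-attribute reward labels while also modeling its own uncertainty. As a rough illustration only (not the authors' code — the exact objective is in the URM paper), a common uncertainty-aware regression setup has the head predict a mean and a log-variance per attribute and minimize the Gaussian negative log-likelihood. All names and numbers below are hypothetical:

```python
import math

def gaussian_nll(mean: float, log_var: float, target: float) -> float:
    """Gaussian negative log-likelihood: the squared error is scaled by the
    predicted variance, so the head can admit uncertainty instead of being
    forced to a single point estimate."""
    var = math.exp(log_var)
    return 0.5 * (log_var + (target - mean) ** 2 / var + math.log(2 * math.pi))

# One training example with labels for two made-up attributes
# (e.g. helpfulness, correctness).
pred_mean = [0.8, 0.1]
pred_log_var = [0.0, 0.0]
labels = [1.0, 0.0]

loss = sum(gaussian_nll(m, lv, t)
           for m, lv, t in zip(pred_mean, pred_log_var, labels)) / len(labels)
```

With unit variance (`log_var = 0`) this reduces to a scaled squared error plus a constant; predicting a larger variance trades a smaller error penalty for a larger `log_var` term.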
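The second stage, **gating layer learning**, learns how to combine the frozen head's attribute scores into a single scalar reward (the hunk context above notes the value head and base model are kept frozen during it). A minimal sketch of one plausible gating mechanism — softmax weights over attributes, trained with a Bradley–Terry-style pairwise loss as is common for reward models. This is an illustration under those assumptions, not URM's actual gating network, which may condition on the input:

```python
import math

def softmax(logits):
    """Numerically stable softmax: shifts by the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def gated_reward(attribute_scores, gating_logits):
    """Combine per-attribute scores into one scalar reward using
    convex (softmax) gating weights."""
    weights = softmax(gating_logits)
    return sum(w * s for w, s in zip(weights, attribute_scores))

def preference_loss(r_chosen, r_rejected):
    """Bradley-Terry style pairwise loss commonly used to fit reward
    models on preference pairs: minimized when chosen >> rejected."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# With equal gating logits the reward is the plain average of attributes.
r = gated_reward([1.0, 0.0, 0.5], [0.0, 0.0, 0.0])  # → 0.5
```

Only the gating logits would be updated against the preference loss; the attribute scores come from the frozen value head.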
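Because an uncertainty-aware head produces a spread as well as a point estimate, downstream consumers can discount rewards the model is unsure about. One generic pattern (an illustration of how such estimates are often used, not something this model card prescribes) is a lower confidence bound:

```python
import math

def conservative_reward(mean: float, var: float, k: float = 1.0) -> float:
    """Lower-confidence-bound reward: subtract k standard deviations,
    so uncertain predictions score lower than confident ones."""
    return mean - k * math.sqrt(var)
```

For example, a predicted mean of 1.0 with variance 0.25 and `k = 1` yields a conservative reward of 0.5, while the same mean with zero variance keeps its full value.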