stanfordnlp/SteamSHP-flan-t5-xl

@@ -26,39 +26,6 @@ It is a FLAN-T5-xl model (3B parameters) finetuned on:
 1. The [Stanford Human Preferences Dataset (SHP)](https://huggingface.co/datasets/stanfordnlp/SHP), which contains aggregate human preferences sourced from 18 different communities on Reddit (e.g., `askculinary`, `legaladvice`, etc.)
 2. The helpfulness data in [Anthropic's HH-RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf) dataset.
-## Training and Evaluation
-SteamSHP was only finetuned on 125K of the 392K training examples that were available, since we found that:
-1. When the total input length exceeded the limit (512 tokens), the loss would not converge.
-   When possible, we crammed an example into 500 tokens by truncating the context as much as possible, though some examples would still not fit.
-2. Training on fewer preferences with a stronger signal led to better performance than training on all the preferences.
-   From the SHP dataset, we only used preferences where the more preferred comment was at least twice as preferred as the other (i.e., `score_ratio` >= 2) and used no more than 5 preferences from each context (i.e., `post_id`) to prevent overfitting.
-We evaluated the model on the SHP and HH-RLHF test data using accuracy, but only on the examples that could be truncated to fit within 500 tokens (a total of 18621 examples).
-SteamSHP gets 72.8% accuracy on all domains combined (the `ALL` row); per-domain accuracies are:
-| Domain | Accuracy |
-| ------ | -------- |
-| askculinary | 0.7199 |
-| askhr | 0.7743 |
-| askdocs | 0.7210 |
-| askanthropology | 0.7594 |
-| asksciencefiction | 0.7283 |
-| askacademia | 0.7442 |
-| askengineers | 0.7183 |
-| legaladvice | 0.8068 |
-| explainlikeimfive | 0.7392 |
-| askbaking | 0.6741 |
-| askphysics | 0.8000 |
-| askscience | 0.7114 |
-| askphilosophy | 0.6907 |
-| askvet | 0.7742 |
-| changemyview | 0.7043 |
-| askcarguys | 0.7568 |
-| askhistorians | 0.7476 |
-| asksocialscience | 0.7308 |
-| anthropic (helpfulness) | 0.7310 |
-| ALL | 0.7278 |
 ## Usage
@@ -96,6 +63,42 @@ The output generated by SteamSHP will either be `A` or `B`.
 If the input exceeds the 512 token limit, you can use [pySBD](https://github.com/nipunsadvilkar/pySBD) to break the input up into sentences and only include what fits into 512 tokens.
+## Training and Evaluation
+SteamSHP was only finetuned on 125K of the 392K training examples that were available, since we found that:
+1. When the total input length exceeded the limit (512 tokens), the loss would not converge.
+   When possible, we crammed an example into 500 tokens by truncating the context as much as possible, though some examples would still not fit.
+2. Training on fewer preferences with a stronger signal led to better performance than training on all the preferences.
+   From the SHP dataset, we only used preferences where the more preferred comment was at least twice as preferred as the other (i.e., `score_ratio` >= 2) and used no more than 5 preferences from each context (i.e., `post_id`) to prevent overfitting.
+We evaluated the model on the SHP and HH-RLHF test data using accuracy, but only on the examples that could be truncated to fit within 500 tokens (a total of 18621 examples).
+SteamSHP gets 72.8% accuracy on all domains combined (the `ALL` row); per-domain accuracies are:
+| Domain | Accuracy |
+| ------ | -------- |
+| askculinary | 0.7199 |
+| askhr | 0.7743 |
+| askdocs | 0.7210 |
+| askanthropology | 0.7594 |
+| asksciencefiction | 0.7283 |
+| askacademia | 0.7442 |
+| askengineers | 0.7183 |
+| legaladvice | 0.8068 |
+| explainlikeimfive | 0.7392 |
+| askbaking | 0.6741 |
+| askphysics | 0.8000 |
+| askscience | 0.7114 |
+| askphilosophy | 0.6907 |
+| askvet | 0.7742 |
+| changemyview | 0.7043 |
+| askcarguys | 0.7568 |
+| askhistorians | 0.7476 |
+| asksocialscience | 0.7308 |
+| anthropic (helpfulness) | 0.7310 |
+| ALL | 0.7278 |
 ### Biases and Limitations
 Biases in the datasets used to train SteamSHP may be propagated downstream to the model predictions.
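
The training-data selection described under Training and Evaluation (keep only preferences with `score_ratio` >= 2, and at most 5 preferences per `post_id`) could be sketched roughly as follows. This is a minimal illustration, not the authors' actual pipeline: the function name `select_preferences` and the toy records are hypothetical, and keeping the *first* five qualifying preferences per post is an assumption, since the model card does not say how the five were chosen.

```python
from collections import defaultdict


def select_preferences(examples, min_ratio=2.0, max_per_post=5):
    """Keep strong-signal preferences, capped per post.

    `examples` is an iterable of dicts with `post_id` and `score_ratio` keys
    (field names taken from the model card). Assumption: the first
    `max_per_post` qualifying examples of each post are the ones kept.
    """
    kept = []
    per_post = defaultdict(int)  # number of examples kept so far per post_id
    for ex in examples:
        if ex["score_ratio"] < min_ratio:
            continue  # preference signal too weak
        if per_post[ex["post_id"]] >= max_per_post:
            continue  # this post already contributed its quota
        per_post[ex["post_id"]] += 1
        kept.append(ex)
    return kept


# Hypothetical toy data: post p1 has ratios 1.0 through 8.0, post p2 has 1.5 and 3.0.
toy = [{"post_id": "p1", "score_ratio": 1.0 + i} for i in range(8)]
toy += [{"post_id": "p2", "score_ratio": 1.5}, {"post_id": "p2", "score_ratio": 3.0}]
selected = select_preferences(toy)
```

On the toy data, seven `p1` preferences pass the ratio threshold but only five are kept, and one `p2` preference is kept, so `selected` holds six examples.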