kawine committed on
Commit 7187211 · 1 Parent(s): 9c2f6c6

Update README.md

Files changed (1)
  1. README.md +36 -33
README.md CHANGED
@@ -26,39 +26,6 @@ It is a FLAN-T5-xl model (3B parameters) finetuned on:
  1. The [Stanford Human Preferences Dataset (SHP)](https://huggingface.co/datasets/stanfordnlp/SHP), which contains aggregate human preferences sourced from 18 different communities on Reddit (e.g., `askculinary`, `legaladvice`, etc.)
  2. The helpfulness data in [Anthropic's HH-RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf) dataset.
 
- ## Training and Evaluation
-
- SteamSHP was finetuned on only 125K of the 392K training examples that were available, since we found that:
- 1. When the total input length exceeded the limit (512 tokens), the loss would not converge.
- When possible, we fit an example into 500 tokens by truncating the context as much as possible, though some examples still would not fit.
- 2. Training on fewer preferences with a stronger signal led to better performance than training on all of the preferences.
- From the SHP dataset, we only used preferences where the more preferred comment was at least twice as preferred as the other (i.e., `score_ratio` >= 2), and we used no more than 5 preferences from each context (i.e., `post_id`) to prevent overfitting.
-
- We evaluated the model on the SHP and HH-RLHF test data using accuracy, but only on the examples that could be truncated to fit within 500 tokens (18621 examples in total).
- SteamSHP gets an average 72.8% accuracy across all domains:
-
- | Domain | Accuracy |
- | ------ | -------- |
- | askculinary | 0.7199 |
- | askhr | 0.7743 |
- | askdocs | 0.7210 |
- | askanthropology | 0.7594 |
- | asksciencefiction | 0.7283 |
- | askacademia | 0.7442 |
- | askengineers | 0.7183 |
- | legaladvice | 0.8068 |
- | explainlikeimfive | 0.7392 |
- | askbaking | 0.6741 |
- | askphysics | 0.8000 |
- | askscience | 0.7114 |
- | askphilosophy | 0.6907 |
- | askvet | 0.7742 |
- | changemyview | 0.7043 |
- | askcarguys | 0.7568 |
- | askhistorians | 0.7476 |
- | asksocialscience | 0.7308 |
- | anthropic (helpfulness) | 0.7310 |
- | ALL | 0.7278 |
 
  ## Usage
 
@@ -96,6 +63,42 @@ The output generated by SteamSHP will either be `A` or `B`.
  If the input exceeds the 512 token limit, you can use [pySBD](https://github.com/nipunsadvilkar/pySBD) to break the input up into sentences and include only what fits into 512 tokens.
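
For example, sentence-level truncation could look like the following sketch; the greedy keep-sentences-from-the-start strategy and the 500-token budget (matching the cap used in training, described below) are assumptions, not part of the original README:

```python
# Minimal sketch: keep whole sentences until the token budget is spent.
import pysbd
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stanfordnlp/SteamSHP-flan-t5-xl")
segmenter = pysbd.Segmenter(language="en", clean=False)

def truncate_to_fit(text: str, max_tokens: int = 500) -> str:
    kept, used = [], 0
    for sentence in segmenter.segment(text):
        n = len(tokenizer.encode(sentence, add_special_tokens=False))
        if used + n > max_tokens:
            break
        kept.append(sentence)
        used += n
    return " ".join(kept)
```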
 
 
+ ## Training and Evaluation
+
+ SteamSHP was finetuned on only 125K of the 392K training examples that were available, since we found that:
+ 1. When the total input length exceeded the limit (512 tokens), the loss would not converge.
+ When possible, we fit an example into 500 tokens by truncating the context as much as possible, though some examples still would not fit.
+ 2. Training on fewer preferences with a stronger signal led to better performance than training on all of the preferences.
+ From the SHP dataset, we only used preferences where the more preferred comment was at least twice as preferred as the other (i.e., `score_ratio` >= 2), and we used no more than 5 preferences from each context (i.e., `post_id`) to prevent overfitting, as in the sketch below.
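+
+ A minimal sketch of this filtering, assuming the `datasets` library and the `score_ratio` and `post_id` fields from the SHP dataset (the exact training-data pipeline may differ):
+
+ ```python
+ from collections import defaultdict
+ from datasets import load_dataset
+
+ train = load_dataset("stanfordnlp/SHP", split="train")
+
+ # Keep only preferences where the preferred comment is at least 2x as preferred.
+ train = train.filter(lambda ex: ex["score_ratio"] >= 2)
+
+ # Keep at most 5 preferences per post to limit overfitting to any one context.
+ seen = defaultdict(int)
+ def under_cap(ex):
+     seen[ex["post_id"]] += 1
+     return seen[ex["post_id"]] <= 5
+
+ train = train.filter(under_cap)
+ ```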
+
+ We evaluated the model on the SHP and HH-RLHF test data using accuracy, but only on the examples that could be truncated to fit within 500 tokens (18621 examples in total).
+ SteamSHP gets an average 72.8% accuracy across all domains:
+
+ | Domain | Accuracy |
+ | ------ | -------- |
+ | askculinary | 0.7199 |
+ | askhr | 0.7743 |
+ | askdocs | 0.7210 |
+ | askanthropology | 0.7594 |
+ | asksciencefiction | 0.7283 |
+ | askacademia | 0.7442 |
+ | askengineers | 0.7183 |
+ | legaladvice | 0.8068 |
+ | explainlikeimfive | 0.7392 |
+ | askbaking | 0.6741 |
+ | askphysics | 0.8000 |
+ | askscience | 0.7114 |
+ | askphilosophy | 0.6907 |
+ | askvet | 0.7742 |
+ | changemyview | 0.7043 |
+ | askcarguys | 0.7568 |
+ | askhistorians | 0.7476 |
+ | asksocialscience | 0.7308 |
+ | anthropic (helpfulness) | 0.7310 |
+ | ALL | 0.7278 |
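+
+ As a sketch, per-domain accuracy can be computed from the model's `A`/`B` outputs as follows; `predictions` here is a hypothetical stand-in for outputs collected from the test sets:
+
+ ```python
+ from collections import defaultdict
+
+ # Hypothetical: one (domain, predicted_label, gold_label) tuple per example.
+ predictions = [("askculinary", "A", "A"), ("askphysics", "B", "A")]
+
+ correct, total = defaultdict(int), defaultdict(int)
+ for domain, pred, gold in predictions:
+     for key in (domain, "ALL"):
+         total[key] += 1
+         correct[key] += int(pred == gold)
+
+ for key in sorted(total):
+     print(f"{key}: {correct[key] / total[key]:.4f}")
+ ```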
+
+
+
  ### Biases and Limitations
 
  Biases in the datasets used to train SteamSHP may be propagated downstream to the model predictions.