kawine committed
Commit 8f17ff5 · Parent(s): ee33554

Update README.md

Files changed (1): README.md +2 -2
README.md CHANGED

@@ -101,7 +101,7 @@ Therefore you may want to normalize the probability.
 
 You can also compare the two probabilities assigned independently to each response (given the same context) to infer the preference label.
 For example, if one response has probability 0.95 and the other has 0.80, the former will be preferred.
-Inferring the preference label in this way only leads to a 0.5 drop in accuracy on the SHP + HH-RLHF test data on average across all domains, meaning that there's only a very small penalty for using SteamSHP as a reward model instead of as a preference model.
+Inferring the preference label in this way only leads to a 0.006 drop in accuracy on the SHP + HH-RLHF test data on average across all domains, meaning that there's only a very small penalty for using SteamSHP-XL as a reward model instead of as a preference model.
 
 
 
@@ -142,7 +142,7 @@ SteamSHP-XL gets an average 72.8% accuracy across all domains:
 | ALL (unweighted) | 0.7278 |
 
 As mentioned previously, if you use SteamSHP as a reward model and try to infer the preference label based on the probability assigned to each response independently, that could also work!
-But doing so will lead to a 0.5 drop in accuracy on the test data (on average across all domains), meaning that there is a small penalty.
+But doing so will lead to a 0.006 drop in accuracy on the test data (on average across all domains), meaning that there is a small penalty.
 
 
 ## Biases and Limitations
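
For illustration, the comparison step described in the edited README text could be sketched as below. This is a minimal sketch, not SteamSHP's API: `normalize_scores` and `infer_preference` are hypothetical helper names, and the inputs are assumed to be the probabilities the model assigns to each response independently (given the same context).

```python
def normalize_scores(p_a: float, p_b: float) -> tuple[float, float]:
    """Normalize two independently assigned probabilities so they sum to 1,
    as suggested by the README ('you may want to normalize the probability')."""
    total = p_a + p_b
    return p_a / total, p_b / total


def infer_preference(p_a: float, p_b: float) -> str:
    """Reward-model usage: the response with the higher independent
    probability is inferred to be the preferred one."""
    return "A" if p_a >= p_b else "B"


# The README's example: probabilities 0.95 vs. 0.80 -> the former is preferred.
print(infer_preference(0.95, 0.80))          # "A"
print(normalize_scores(0.95, 0.80))          # probabilities rescaled to sum to 1
```

Per the commit, inferring labels this way costs only about 0.006 accuracy versus using the model as a pairwise preference model.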