lhallee committed
Commit 4876165 · verified · 1 Parent(s): fc7d12b

Update README.md

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -45,10 +45,13 @@ model = AutoModelForMaskedLM.from_pretrained('Synthyra/ESMplusplus_small', trust
 ### Comparison across floating-point precision and implementations
 We measured the difference between the last hidden states of the fp32 weights and the fp16 or bf16 weights. We find that fp16 is closer to the fp32 outputs, so we recommend loading in fp16.
 Please note that Evolutionary Scale loads ESMC in bf16 by default, which has its share of advantages and disadvantages for inference / training - so load whichever half precision you like.
+
 Average MSE FP32 vs. FP16: 0.00000003
+
 Average MSE FP32 vs. BF16: 0.00000140
 
 We also measured the difference between the outputs of ESM++ and ESMC (both in bfloat16) on 1000 random sequences to ensure compliance with the ESM package.
+
 Average MSE of last hidden state: 7.74e-10
 
 You can load the weights from the ESM package instead of transformers by replacing .from_pretrained(...) with .from_pretrained_esm('esmc_300m')
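
For reference, a minimal sketch of the fp16 loading the updated README recommends, together with the kind of last-hidden-state MSE comparison it quotes. The `torch_dtype` argument, the `model.tokenizer` attribute, the `output_hidden_states` output format, and the toy sequence are assumptions about the Synthyra remote code, not something this commit specifies.

```python
# Sketch (assumption-heavy): load ESM++ small in fp32 and fp16 and compare the
# last hidden states, mirroring the MSE numbers quoted in the README diff above.
import torch
from transformers import AutoModelForMaskedLM

model_fp32 = AutoModelForMaskedLM.from_pretrained(
    'Synthyra/ESMplusplus_small', trust_remote_code=True
).eval()
model_fp16 = AutoModelForMaskedLM.from_pretrained(
    'Synthyra/ESMplusplus_small', trust_remote_code=True, torch_dtype=torch.float16
).eval()

# Assumption: the remote code attaches its tokenizer to the model object.
tokenizer = model_fp32.tokenizer
inputs = tokenizer('MPRTEINSEQWENCE', return_tensors='pt')  # toy protein sequence

with torch.no_grad():
    # Assumption: outputs follow the standard transformers format with hidden_states.
    h32 = model_fp32(**inputs, output_hidden_states=True).hidden_states[-1]
    h16 = model_fp16(**inputs, output_hidden_states=True).hidden_states[-1]

print('MSE fp32 vs. fp16:', torch.mean((h32 - h16.float()) ** 2).item())
```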
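
And a sketch of the ESM-package loading path the README mentions: `from_pretrained_esm('esmc_300m')` is the call named in the README, while obtaining the ESM++ class via `trust_remote_code` and `type()` is an assumption about how to reach that classmethod.

```python
# Sketch: swap .from_pretrained(...) for .from_pretrained_esm('esmc_300m') to load
# the weights from the ESM package instead of the transformers checkpoint.
from transformers import AutoModelForMaskedLM

# Assumption: fetch the ESM++ model class from the Hub repo first, then call the
# classmethod the README names on that class.
wrapper = AutoModelForMaskedLM.from_pretrained('Synthyra/ESMplusplus_small', trust_remote_code=True)
model = type(wrapper).from_pretrained_esm('esmc_300m')
```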