Update README.md
README.md CHANGED
### Comparison across floating-point precision and implementations
We measured the difference between the last hidden states produced by the fp32 weights and by the fp16 or bf16 weights. The fp16 outputs are closer to the fp32 outputs, so we recommend loading in fp16.

Please note that Evolutionary Scale loads ESMC in bf16 by default, which has its own advantages and disadvantages for inference and training, so load whichever half-precision format you prefer.
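As a minimal sketch, half precision can be selected at load time with the `torch_dtype` argument of `from_pretrained` (the model id and `trust_remote_code` flag come from the loading example earlier in the README):

```python
import torch
from transformers import AutoModelForMaskedLM

# Load ESM++ small in fp16, the half-precision format recommended above.
model = AutoModelForMaskedLM.from_pretrained(
    'Synthyra/ESMplusplus_small',
    trust_remote_code=True,
    torch_dtype=torch.float16,  # or torch.bfloat16 to match ESMC's default
)
```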
Average MSE FP32 vs. FP16: 0.00000003

Average MSE FP32 vs. BF16: 0.00000140
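A sketch of how such a comparison could be reproduced; the tokenizer entry point and the `output_hidden_states`/`hidden_states` interface are assumptions (standard transformers behavior) not confirmed by this README:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = 'Synthyra/ESMplusplus_small'
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Load the same weights at full and half precision.
model_fp32 = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True, torch_dtype=torch.float32)
model_fp16 = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True, torch_dtype=torch.float16)

inputs = tokenizer('MSEQVENCE', return_tensors='pt')  # hypothetical protein sequence
with torch.no_grad():
    h32 = model_fp32(**inputs, output_hidden_states=True).hidden_states[-1]
    h16 = model_fp16(**inputs, output_hidden_states=True).hidden_states[-1]

# Cast both to fp32 before differencing so the MSE itself is computed in full precision.
mse = torch.mean((h32.float() - h16.float()) ** 2)
print(f'Average MSE FP32 vs. FP16: {mse.item():.8f}')
```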
We also measured the difference between the outputs of ESM++ and ESMC (both in bfloat16) on 1000 random sequences to verify agreement with the ESM package.
Average MSE of last hidden state: 7.74e-10
You can load the weights from the ESM package instead of transformers by replacing .from_pretrained(...) with .from_pretrained_esm('esmc_300m').
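For example, a hypothetical sketch: the README confirms only the from_pretrained_esm('esmc_300m') call itself; reaching that classmethod through the dynamically loaded class via type(model) is an assumption.

```python
from transformers import AutoModelForMaskedLM

# Obtain the ESM++ class through transformers once, then reload the
# original ESMC weights from the ESM package instead of the hub.
model = AutoModelForMaskedLM.from_pretrained('Synthyra/ESMplusplus_small', trust_remote_code=True)
model = type(model).from_pretrained_esm('esmc_300m')  # hypothetical access path
```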