lhallee committed
Commit 4876165 · verified · 1 Parent(s): fc7d12b

Update README.md

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -45,10 +45,13 @@ model = AutoModelForMaskedLM.from_pretrained('Synthyra/ESMplusplus_small', trust
 ### Comparison across floating-point precision and implementations
 We measured the difference between the last hidden states of the fp32 weights and the fp16 or bf16 weights. We find that fp16 is closer to the fp32 outputs, so we recommend loading in fp16.
 Please note that Evolutionary Scale loads ESMC in bf16 by default, which has its share of advantages and disadvantages for inference / training - so load whichever half precision you like.
+
 Average MSE FP32 vs. FP16: 0.00000003
+
 Average MSE FP32 vs. BF16: 0.00000140
 
 We also measured the difference between the outputs of ESM++ and ESMC (both in bfloat16) on 1000 random sequences to ensure compliance with the ESM package.
+
 Average MSE of last hidden state: 7.74e-10
 
 You can load the weights from the ESM package instead of transformers by replacing .from_pretrained(...) with .from_pretrained_esm('esmc_300m')
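
For reference, a minimal sketch of the fp16 loading the updated README recommends, together with the kind of last-hidden-state MSE comparison it quotes. The `torch_dtype` argument, the `model.tokenizer` attribute, the `output_hidden_states` output format, and the toy sequence are assumptions about the Synthyra remote code, not something this commit specifies.

```python
# Sketch (assumption-heavy): load ESM++ small in fp32 and fp16 and compare the
# last hidden states, mirroring the MSE numbers quoted in the README diff above.
import torch
from transformers import AutoModelForMaskedLM

model_fp32 = AutoModelForMaskedLM.from_pretrained(
    'Synthyra/ESMplusplus_small', trust_remote_code=True
).eval()
model_fp16 = AutoModelForMaskedLM.from_pretrained(
    'Synthyra/ESMplusplus_small', trust_remote_code=True, torch_dtype=torch.float16
).eval()

# Assumption: the remote code attaches its tokenizer to the model object.
tokenizer = model_fp32.tokenizer
inputs = tokenizer('MPRTEINSEQWENCE', return_tensors='pt')  # toy protein sequence

with torch.no_grad():
    # Assumption: outputs follow the standard transformers format with hidden_states.
    h32 = model_fp32(**inputs, output_hidden_states=True).hidden_states[-1]
    h16 = model_fp16(**inputs, output_hidden_states=True).hidden_states[-1]

print('MSE fp32 vs. fp16:', torch.mean((h32 - h16.float()) ** 2).item())
```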
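
And a sketch of the ESM-package loading path the README mentions: `from_pretrained_esm('esmc_300m')` is the call named in the README, while obtaining the ESM++ class via `trust_remote_code` and `type()` is an assumption about how to reach that classmethod.

```python
# Sketch: swap .from_pretrained(...) for .from_pretrained_esm('esmc_300m') to load
# the weights from the ESM package instead of the transformers checkpoint.
from transformers import AutoModelForMaskedLM

# Assumption: fetch the ESM++ model class from the Hub repo first, then call the
# classmethod the README names on that class.
wrapper = AutoModelForMaskedLM.from_pretrained('Synthyra/ESMplusplus_small', trust_remote_code=True)
model = type(wrapper).from_pretrained_esm('esmc_300m')
```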