A protein sequence tokenizer trained on PDB sequences, with a vocabulary size of 1024.
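
The description above does not say which algorithm was used to build the vocabulary. As an illustration only, here is a minimal sketch of how a subword vocabulary of a fixed size could be learned from protein sequences, assuming byte-pair encoding (BPE) over single-letter amino-acid codes. The function names (`train_bpe`, `merge_pair`, `encode`), the toy sequences, and the small `vocab_size` are hypothetical and are not taken from this tokenizer.

```python
from collections import Counter


def merge_pair(tokens, a, b):
    """Replace every adjacent occurrence of (a, b) in tokens with the merged token a + b."""
    out = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
            out.append(a + b)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def train_bpe(sequences, vocab_size):
    """Learn BPE merges until the vocabulary reaches vocab_size or no pair repeats.

    Assumption: BPE is used here only as an example; the real tokenizer may differ.
    """
    # Start from single residues (one token per amino-acid letter).
    corpus = [list(seq) for seq in sequences]
    vocab = sorted({tok for seq in corpus for tok in seq})
    merges = []
    while len(vocab) < vocab_size:
        # Count adjacent token pairs across the whole corpus.
        pairs = Counter()
        for seq in corpus:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken deterministically by the pair itself.
        (a, b), count = max(pairs.items(), key=lambda kv: (kv[1], kv[0]))
        if count < 2:
            break  # nothing left that is worth merging
        merges.append((a, b))
        if a + b not in vocab:
            vocab.append(a + b)
        corpus = [merge_pair(seq, a, b) for seq in corpus]
    return vocab, merges


def encode(seq, merges):
    """Tokenize a sequence by replaying the learned merges in order."""
    tokens = list(seq)
    for a, b in merges:
        tokens = merge_pair(tokens, a, b)
    return tokens


# Toy data (hypothetical); a real run would use PDB sequences and vocab_size=1024.
toy_sequences = ["MKTAYIAKQR", "MKTAYIAKQL", "MKVLAAGIVG"]
vocab, merges = train_bpe(toy_sequences, vocab_size=30)
tokens = encode("MKTAYIAKQR", merges)
print(tokens)
```

With real data, the same loop would run on the full set of PDB sequences with `vocab_size=1024`; in practice a dedicated tokenizer library would likely be far faster than this pure-Python sketch.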