--- language: en license: mit datasets: - ronig/pdb_sequences --- # PDB Protein BPE Tokenizer A protein sequence tokenizer trained on [PDB Sequences](https://huggingface.co/datasets/ronig/pdb_sequences) with `vocabulary size = 1024`