add tokenizer
c9fa6c7
348 Bytes
{"’": 0, "u": 1, "x": 2, "m": 3, "í": 4, "q": 5, "f": 6, "k": 7, "g": 8, "d": 9, "t": 10, "i": 11, "o": 13, "s": 14, "y": 15, "z": 16, "c": 17, "ƙ": 18, "l": 19, "w": 20, "e": 21, "p": 23, "h": 24, "r": 25, "ƴ": 26, "b": 27, "ʻ": 28, "a": 29, "'": 30, "ɗ": 31, "ɓ": 32, "j": 33, "n": 34, "v": 35, "|": 12, "/": 22, "[UNK]": 36, "[PAD]": 37}
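The mapping above can be read as a character-level vocabulary: each character has one integer id, plus two special entries, `[UNK]` and `[PAD]`. The following is a minimal sketch of how such a file might be used to encode and decode text. It assumes that `|` stands for the space between words, as is common in CTC character tokenizer setups; the file itself does not state this. The helper names `encode` and `decode` are hypothetical and not part of any library.

```python
import json

# Vocabulary copied verbatim from the file above (character -> integer id).
VOCAB_JSON = '''
{"’": 0, "u": 1, "x": 2, "m": 3, "í": 4, "q": 5, "f": 6, "k": 7, "g": 8,
 "d": 9, "t": 10, "i": 11, "o": 13, "s": 14, "y": 15, "z": 16, "c": 17,
 "ƙ": 18, "l": 19, "w": 20, "e": 21, "p": 23, "h": 24, "r": 25, "ƴ": 26,
 "b": 27, "ʻ": 28, "a": 29, "'": 30, "ɗ": 31, "ɓ": 32, "j": 33, "n": 34,
 "v": 35, "|": 12, "/": 22, "[UNK]": 36, "[PAD]": 37}
'''

vocab = json.loads(VOCAB_JSON)


def encode(text, vocab, word_delimiter="|", unk_token="[UNK]"):
    """Map each character to its id (hypothetical helper).

    Assumption: spaces are represented by the word delimiter "|".
    Characters missing from the vocabulary map to the [UNK] id.
    """
    unk_id = vocab[unk_token]
    chars = text.lower().replace(" ", word_delimiter)
    return [vocab.get(ch, unk_id) for ch in chars]


def decode(ids, vocab, word_delimiter="|"):
    """Invert encode(): ids back to text, delimiter back to spaces."""
    id_to_char = {i: ch for ch, i in vocab.items()}
    return "".join(id_to_char[i] for i in ids).replace(word_delimiter, " ")


sample = "sannu da zuwa"  # arbitrary sample string
ids = encode(sample, vocab)
print(ids)
print(decode(ids, vocab))
```

Note that the ids in the file are contiguous from 0 to 37 even though `|` (12) and `/` (22) are listed near the end, so the inverse lookup in `decode` is well defined for every id.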