sanchit-gandhi committed
Commit · b65535b
1 Parent(s): baca495
Add pad token to tokenizer
Whisper large is missing the pad token, which is already defined for the tiny through medium models (e.g. https://huggingface.co/openai/whisper-medium/blob/main/special_tokens_map.json#L125). This PR adds the pad token for the large checkpoint.
- special_tokens_map.json +1 -0
special_tokens_map.json CHANGED
@@ -122,6 +122,7 @@
     "rstrip": false,
     "single_word": false
   },
+  "pad_token": "<|endoftext|>",
   "unk_token": {
     "content": "",
     "lstrip": false,
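As a rough illustration of what the one-line change does, the sketch below applies the same edit to an in-memory copy of the map. It is not part of the PR: the helper name `ensure_pad_token` is hypothetical, and the `before` dict is a trimmed stand-in that contains only the `unk_token` fields visible in the diff, not the full file.

```python
import json

# Pad token value taken from the added line in the diff.
PAD_TOKEN = "<|endoftext|>"


def ensure_pad_token(special_tokens_map: dict, pad_token: str = PAD_TOKEN) -> dict:
    """Return a shallow copy of the map with "pad_token" set if it was missing.

    Hypothetical helper for illustration only; an existing pad token is kept.
    """
    updated = dict(special_tokens_map)
    updated.setdefault("pad_token", pad_token)
    return updated


# Trimmed stand-in for special_tokens_map.json before this commit
# (assumption: only the fields shown in the diff are reproduced here).
before = json.loads('{"unk_token": {"content": "", "lstrip": false}}')
after = ensure_pad_token(before)

print(json.dumps(after, indent=2))
```

The helper leaves an existing `pad_token` untouched, so running it against a map that already defines one (as the tiny through medium checkpoints reportedly do) would be a no-op.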