# Text preprocessing
This tokenizer has been trained on tweets that were preprocessed as follows:
1) User mentions (@user_name) have been replaced with the word *user*.
2) URLs have been replaced with the word *url*.
3) WIP.
If you are going to use this tokenizer, we recommend preprocessing your own dataset in the same manner.
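
The first two steps above can be approximated with a couple of regular expressions. This is a minimal sketch, not the exact code used to prepare the training data: the function name `preprocess_tweet` and both patterns are assumptions, and the third step is still marked WIP, so it is not covered here.

```python
import re

# Assumed patterns: the rules used for the original training data may differ.
# A mention is "@" followed by word characters, not preceded by a word
# character (so e-mail addresses are left alone).
MENTION_RE = re.compile(r"(?<!\w)@\w+")
# A URL is anything starting with http://, https:// or www. up to the next space.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")


def preprocess_tweet(text: str) -> str:
    """Replace URLs with 'url' and user mentions with 'user'."""
    # URLs go first so that an "@" inside a URL is not treated as a mention.
    text = URL_RE.sub("url", text)
    text = MENTION_RE.sub("user", text)
    return text


print(preprocess_tweet("@jack check https://t.co/abc now"))  # user check url now
```

Apply the same function to every text before passing it to the tokenizer. Note that this URL pattern also swallows punctuation attached to the end of a link, which may or may not match how the training tweets were handled.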