Model adds spaces after special characters
#4 opened by sepehrJafari
You have one of the best NER repositories for the Italian language.
I noticed a problem with this particular model: its output has a space inserted after each special character, for example:
www.ABCD.com => www. ABCD. com
This is how I create the pipeline:
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline
tokenizer = AutoTokenizer.from_pretrained("DeepMount00/Italian_NER_XXL")
model = AutoModelForTokenClassification.from_pretrained("DeepMount00/Italian_NER_XXL", ignore_mismatched_sizes=True)
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
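A possible workaround, sketched under two assumptions: that the extra spaces show up in the `word` field of the aggregated entities (which is rebuilt from decoded tokens), and that each entity also carries `start` and `end` character offsets into the input, as the pipeline normally provides with a fast tokenizer. In that case the exact text can be recovered by slicing the original string instead of trusting `word`. The helper name `restore_surface_forms` and the sample entity below are hypothetical, written by hand for illustration rather than taken from real model output.

```python
from typing import Any, Dict, List


def restore_surface_forms(text: str, entities: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the entities with `word` replaced by the exact source substring.

    Entities without usable `start`/`end` offsets are copied unchanged.
    """
    restored = []
    for entity in entities:
        start, end = entity.get("start"), entity.get("end")
        fixed = dict(entity)  # copy so the pipeline output is not mutated
        if start is not None and end is not None:
            fixed["word"] = text[start:end]
        restored.append(fixed)
    return restored


# Hand-written stand-in for pipeline output (label and score are made up).
text = "Visita www.ABCD.com oggi"
entities = [
    {"entity_group": "URL", "score": 0.99, "word": "www. ABCD. com", "start": 7, "end": 19}
]

print(restore_surface_forms(text, entities)[0]["word"])  # prints: www.ABCD.com
```

With the pipeline from this post, the call would be `restore_surface_forms(text, nlp(text))`. This only repairs the displayed string; it does not change how the model tokenizes or labels the input.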