Future plans for the model?
#2, opened by danielschnell
Just wanted to ask whether there are any plans to update the model with more text resources.
The Icelandic government-funded SIM project has created a large number of language resources that could be used to improve the model even further.
These Icelandic text resources are available at https://clarin.is/en/resources
For example, these text corpora are notable for their size:
Icelandic Gigaword Corpus (IGC, 8.2 GB)
Icelandic Common Crawl Corpus (ICC, 4.9 GB)
Maybe you could consider adding these resources in a future update of the model?