---
license: mit
language:
- pt
model-index:
- name: bert-tv-portuguese
  results: []
---
# BERT-TV

<img src="https://cdn-uploads.huggingface.co/production/uploads/6385e26cc12615765caa6afe/3lSkNEfW57BNudZIFyTH2.png" width=400 height=400>

Image generated with OpenAI's DALL-E via ChatGPT.

## Model description

BERT-TV is a BERT model pre-trained from scratch on a dataset of television reviews in Brazilian Portuguese. The model is tailored to capture the nuances of context and sentiment expressed in television reviews. BERT-TV has 6 layers, 12 attention heads, and an embedding dimension of 768, making it well suited to NLP tasks involving television content in Portuguese.
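
The sketch below shows how to load the model and extract contextual embeddings, using the TensorFlow classes that match the framework versions listed at the end of this card. The checkpoint ID is taken from the metadata above and may need to be replaced with the full Hugging Face Hub repo path:

```python
from transformers import AutoTokenizer, TFAutoModel

# Checkpoint name assumed from the model-index metadata above; replace with
# the full Hub repo path (e.g. "<user>/bert-tv-portuguese") if it differs.
checkpoint = "bert-tv-portuguese"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModel.from_pretrained(checkpoint)

# A short television review in Brazilian Portuguese.
review = "A imagem dessa TV é excelente, mas o som deixa a desejar."
inputs = tokenizer(review, return_tensors="tf")
outputs = model(**inputs)

# Contextual embeddings, shape (batch_size, sequence_length, 768).
print(outputs.last_hidden_state.shape)
```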
## Usage ideas

- Sentiment analysis on television reviews in Portuguese (see the fine-tuning sketch after this list)
- Recommender systems for television models in Portuguese
- Text classification for different television brands and types in Portuguese
- Named entity recognition in television-related contexts in Portuguese
- Aspect extraction for features and specifications of televisions in Portuguese
- Extractive summarization of television reviews in Portuguese
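
As a concrete starting point for the first idea above, here is a minimal fine-tuning sketch for sentiment classification with Keras. The checkpoint ID, the two-example dataset, and the hyperparameters are placeholders for illustration, not part of this release:

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

checkpoint = "bert-tv-portuguese"  # placeholder; use the actual Hub repo path

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# Adds a randomly initialized classification head on top of the encoder.
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Toy labeled reviews (0 = negative, 1 = positive); replace with a real dataset.
texts = [
    "Cores vivas e ótimo custo-benefício, recomendo.",
    "A TV travou na primeira semana, péssima experiência.",
]
labels = [1, 0]

encodings = tokenizer(texts, truncation=True, padding=True, return_tensors="tf")
dataset = tf.data.Dataset.from_tensor_slices((dict(encodings), labels)).batch(2)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.fit(dataset, epochs=3)
```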
## Limitations and bias

Because BERT-TV is pre-trained exclusively on television reviews in Brazilian Portuguese, its performance may be limited on other types of text or on reviews in other languages. The model may also inherit biases present in the training data, which can influence its predictions and embeddings. The tokenizer is adapted from the BERTimbau tokenizer, which is optimized for Brazilian Portuguese, so it may not deliver optimal results for other languages or varieties of Portuguese.
## Framework versions

- Transformers 4.27.3
- TensorFlow 2.11.1
- Datasets 2.11.0
- Tokenizers 0.13.3