---
license: mit
language:
- pt
model-index:
- name: bert-tv-portuguese
results: []
---
# BERT-TV
<img src="https://cdn-uploads.huggingface.co/production/uploads/6385e26cc12615765caa6afe/3lSkNEfW57BNudZIFyTH2.png" width=400 height=400>
Image generated by ChatGPT with DALL-E from OpenAI.
## Model description
BERT-TV is a BERT model pre-trained from scratch on a dataset of television reviews written in Brazilian Portuguese.
The model is tailored to capture the vocabulary, context, and sentiment specific to television reviews. BERT-TV has
6 Transformer layers, 12 attention heads, and a hidden size of 768, making it well suited to Portuguese-language NLP
tasks involving television content.
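The architecture above can be sketched with the Transformers library. This is an illustrative, randomly initialized configuration (it does not load the released weights, and other hyperparameters such as vocabulary size are left at their defaults):

```python
from transformers import BertConfig, BertModel

# Architecture described in the card: 6 layers, 12 attention heads,
# and a hidden (embedding) size of 768.
config = BertConfig(
    num_hidden_layers=6,
    num_attention_heads=12,
    hidden_size=768,
)

# Randomly initialized model, for illustration only -- not the trained BERT-TV.
model = BertModel(config)
n_params = sum(p.numel() for p in model.parameters())
```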
## Usage ideas
- Sentiment analysis on television reviews in Portuguese
- Recommender systems for television models in Portuguese
- Text classification for different television brands and types in Portuguese
- Named entity recognition in television-related contexts in Portuguese
- Aspect extraction for features and specifications of televisions in Portuguese
- Text generation for summarizing television reviews in Portuguese
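For tasks like those above, the model can be loaded as a feature extractor. A minimal sketch, assuming the repository id is `jcfneto/bert-tv-portuguese` (inferred from the card's title; adjust if it differs) and that TensorFlow weights are available, as suggested by the framework versions listed below:

```python
from transformers import AutoTokenizer, TFAutoModel

# Assumed repository id -- verify against the actual Hub page.
model_id = "jcfneto/bert-tv-portuguese"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModel.from_pretrained(model_id)

# Encode a sample review ("The picture on this TV is excellent!").
inputs = tokenizer("A imagem desta TV é excelente!", return_tensors="tf")
outputs = model(**inputs)

# Token-level embeddings with shape (batch, sequence_length, 768),
# usable as features for sentiment analysis or classification heads.
embeddings = outputs.last_hidden_state
```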
## Limitations and bias
Because BERT-TV is pre-trained exclusively on television reviews in Brazilian Portuguese, its performance may be
limited on other types of text or on reviews in other languages. The model may also inherit biases present in the
training data, which can influence its predictions and embeddings. The tokenizer is adapted from the BERTimbau
tokenizer, which is optimized for Brazilian Portuguese, so it may not deliver optimal results for other languages
or Portuguese variants.
## Framework versions
- Transformers 4.27.3
- TensorFlow 2.11.1
- Datasets 2.11.0
- Tokenizers 0.13.3