jcfneto/bert-br-portuguese

jcfneto committed on Apr 20, 2023

Commit 8dd57ac · 1 Parent(s): 6b048b7

Update README.md

Files changed (1)

README.md +30 -21

@@ -1,47 +1,56 @@
 ---
-tags:
-- generated_from_keras_callback
 model-index:
-- name: tmp2gtjrfa_
   results: []
 ---
 <!-- This model card has been generated automatically according to the information Keras had access to. You should
 probably proofread and complete it, then remove this comment. -->
-# tmp2gtjrfa_
-This model was trained from scratch on an unknown dataset.
-It achieves the following results on the evaluation set:
 ## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- optimizer: None
-- training_precision: float32
-### Training results
 ### Framework versions
 - Transformers 4.21.3
 - TensorFlow 2.9.1
 - Datasets 2.7.0
-- Tokenizers 0.12.1

 ---
 model-index:
+- name: bert-br
   results: []
+license: mit
+language:
+- pt
 ---
 <!-- This model card has been generated automatically according to the information Keras had access to. You should
 probably proofread and complete it, then remove this comment. -->
+# BERT-BR
+BERTBookReviews
 ## Model description
+BERT-BR is a BERT model pre-trained from scratch on a dataset of literary book reviews in Brazilian Portuguese.
+The model is specifically designed for understanding the context and sentiment of book reviews in Portuguese.
+BERT-BR features 6 layers, 4 attention heads, and an embedding dimension of 768.
+## Training data
+The BERT-BR model was pre-trained on a dataset of literary book reviews in Brazilian Portuguese.
+The dataset comprises a diverse range of book genres and review sentiments, making the model
+suitable for various book-related NLP tasks in Portuguese.
+## Evaluation
+WIP.
+## Usage ideas
+- Sentiment analysis on book reviews in Portuguese
+- Book recommendation systems in Portuguese
+- Text classification for book genres in Portuguese
+- Named entity recognition in book-related contexts in Portuguese
+- Aspect extraction in book-related contexts in Portuguese
+- Text generation for book summaries in Portuguese
+## Limitations and bias
+As the BERT-BR model was pre-trained on literary book reviews in Brazilian Portuguese,
+it may not perform as well on other types of text or reviews in different languages.
+Additionally, the model may inherit certain biases from the training data, which could
+affect its predictions or embeddings. The tokenizer is based on the BERTimbau tokenizer,
+which was specifically designed for Brazilian Portuguese text, so it might not work
+well with other languages or Portuguese variants.
 ### Framework versions
 - Transformers 4.21.3
 - TensorFlow 2.9.1
 - Datasets 2.7.0
+- Tokenizers 0.12.1
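
Reviewer note: the new model description states 6 layers, 4 attention heads, and an embedding dimension of 768. The sketch below works out what those three numbers imply for the per-head attention width. The `BertBrShape` class and its field names are hypothetical, introduced only for illustration; the feed-forward size and vocabulary size are not stated in the card, so they are left out. `to_bert_config` assumes the `transformers` package is installed and simply maps the three stated values onto `BertConfig` parameters; it is not necessarily how the author configured the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BertBrShape:
    """Hypothetical container for the three sizes stated in the model card."""

    num_layers: int = 6
    num_heads: int = 4
    hidden_size: int = 768

    def head_dim(self) -> int:
        # Multi-head attention splits the hidden vector evenly across heads,
        # so the hidden size must be a multiple of the head count.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads

    def to_bert_config(self):
        # Lazy import: the arithmetic above works without transformers installed.
        from transformers import BertConfig

        # Only the three values from the card are set; everything else keeps
        # the library defaults, which may differ from the author's settings.
        return BertConfig(
            num_hidden_layers=self.num_layers,
            num_attention_heads=self.num_heads,
            hidden_size=self.hidden_size,
        )


shape = BertBrShape()
print(shape.head_dim())  # 192
```

With these values each head attends over 192 dimensions, compared with 64 per head in BERT-base (768 / 12).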