jcfneto committed
Commit 8dd57ac · 1 Parent(s): 6b048b7

Update README.md

Files changed (1):
  1. README.md +30 -21
README.md CHANGED
@@ -1,47 +1,56 @@
  ---
- tags:
- - generated_from_keras_callback
  model-index:
- - name: tmp2gtjrfa_
  results: []
  ---

  <!-- This model card has been generated automatically according to the information Keras had access to. You should
  probably proofread and complete it, then remove this comment. -->

- # tmp2gtjrfa_
-
- This model was trained from scratch on an unknown dataset.
- It achieves the following results on the evaluation set:

  ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - optimizer: None
- - training_precision: float32
-
- ### Training results

  ### Framework versions

  - Transformers 4.21.3
  - TensorFlow 2.9.1
  - Datasets 2.7.0
- - Tokenizers 0.12.1
 
  ---
  model-index:
+ - name: bert-br
  results: []
+ license: mit
+ language:
+ - pt
  ---

  <!-- This model card has been generated automatically according to the information Keras had access to. You should
  probably proofread and complete it, then remove this comment. -->

+ # BERT-BR

+ BERT-BR (BERTBookReviews): a BERT model pre-trained on literary book reviews in Brazilian Portuguese.
 
  ## Model description

+ BERT-BR is a BERT model pre-trained from scratch on a dataset of literary book reviews in Brazilian Portuguese.
+ The model is specifically designed for understanding the context and sentiment of book reviews in Portuguese.
+ BERT-BR features 6 layers, 4 attention heads, and an embedding dimension of 768.
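The architecture described above can be sketched as a `transformers` configuration. This is a minimal illustration, not the published config: the `vocab_size` and `intermediate_size` values below are assumptions, since the card only states the layer count, head count, and embedding dimension.

```python
from transformers import BertConfig

# Sketch of the BERT-BR architecture from the model description.
# Only hidden_size, num_hidden_layers, and num_attention_heads come
# from the card; vocab_size and intermediate_size are assumed values.
config = BertConfig(
    vocab_size=30_000,       # assumption: a BERTimbau-sized vocabulary
    hidden_size=768,         # embedding dimension of 768
    num_hidden_layers=6,     # 6 layers
    num_attention_heads=4,   # 4 attention heads
    intermediate_size=3072,  # assumption: the usual 4x hidden_size
)
print(config.hidden_size // config.num_attention_heads)  # per-head dimension: 192
```

Note that 4 heads over a 768-dimensional hidden state gives an unusually large 192 dimensions per head; standard BERT-base uses 12 heads of 64 dimensions each.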
 
+ ## Training data

+ The BERT-BR model was pre-trained on a dataset of literary book reviews in Brazilian Portuguese.
+ The dataset comprises a diverse range of book genres and review sentiments, making the model
+ suitable for various book-related NLP tasks in Portuguese.
+ ## Evaluation

+ Work in progress.

+ ## Usage ideas

+ - Sentiment analysis on book reviews in Portuguese
+ - Book recommendation systems in Portuguese
+ - Text classification for book genres in Portuguese
+ - Named entity recognition in book-related contexts in Portuguese
+ - Aspect extraction in book-related contexts in Portuguese
+ - Text generation for book summaries in Portuguese
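As a usage sketch, the snippet below runs a forward pass through a randomly initialized model with the BERT-BR architecture; it illustrates the interface only. In practice you would load the published BERT-BR checkpoint and its BERTimbau-based tokenizer from the Hub (the repository id is not stated in this card), and the token ids below are arbitrary placeholders, not real vocabulary entries.

```python
import torch
from transformers import BertConfig, BertModel

# Randomly initialized stand-in with the BERT-BR architecture; in practice,
# load the published BERT-BR weights and tokenizer instead.
config = BertConfig(hidden_size=768, num_hidden_layers=6, num_attention_heads=4)
model = BertModel(config)
model.eval()

# Placeholder token ids standing in for a tokenized Portuguese review.
input_ids = torch.tensor([[101, 2023, 3185, 102]])

with torch.no_grad():
    outputs = model(input_ids)

# One 768-dimensional contextual embedding per token.
print(outputs.last_hidden_state.shape)  # torch.Size([1, 4, 768])
```

These per-token embeddings are what a downstream head (e.g. a sentiment classifier for the tasks listed above) would consume.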
 
+ ## Limitations and bias

+ As the BERT-BR model was pre-trained on literary book reviews in Brazilian Portuguese,
+ it may not perform as well on other types of text or reviews in different languages.
+ Additionally, the model may inherit certain biases from the training data, which could
+ affect its predictions or embeddings. The tokenizer is based on the BERTimbau tokenizer,
+ which was specifically designed for Brazilian Portuguese text, so it might not work
+ well with other languages or Portuguese variants.

  ### Framework versions

  - Transformers 4.21.3
  - TensorFlow 2.9.1
  - Datasets 2.7.0
+ - Tokenizers 0.12.1