jorgeortizfuentes committed · Commit 5569b60 · Parent(s): 647c107

Update README.md

README.md CHANGED
@@ -5,31 +5,16 @@ tags:
 - masked-lm
 license: gpl-3.0
 datasets:
-- jorgeortizfuentes/
+- jorgeortizfuentes/spanish_books
+- jorgeortizfuentes/small-chilean-spanish-corpus
 pipeline_tag: fill-mask
 ---
 
-#
+# Tulio
 
-This model is a fine-tuned version of [dccuchile/bert-base-spanish-wwm-cased](https://huggingface.co/dccuchile/bert-base-spanish-wwm-cased) on Chilean Spanish Corpus.
-It achieves the following results on the evaluation set:
-- Loss: 1.765
+Tulio is a BERT model for Chilean Spanish. It is a fine-tuned version of [dccuchile/bert-base-spanish-wwm-cased](https://huggingface.co/dccuchile/bert-base-spanish-wwm-cased) trained on the Spanish Books and Small Chilean Spanish Corpus datasets.
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
-### Training hyperparameters
+## Training hyperparameters
 
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
@@ -44,9 +29,14 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: linear
 - num_epochs: 2.0
 
-### Framework versions
+## Acknowledgments
+
+We are grateful to the Computer Science Department of the University of Chile and to the ReLeLa (Representations for Learning and Language) study group for providing the servers used to train this model.
+
+## License Disclaimer
+
+The gpl-3.0 license best describes our intentions for this work. However, we are not sure that all the datasets used to train the model have licenses compatible with gpl-3.0. Please use the model at your own discretion and verify that the licenses of the original text resources match your needs.
+
+## Limitations
 
-- Pytorch 1.13.1+cu117
-- Datasets 2.9.0
-- Tokenizers 0.13.2
+The training dataset was not censored in any way. Therefore, the model may contain unwanted ideological representations. Use with caution.
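The updated front matter names two Hub datasets. As a quick sanity check on the new metadata, a minimal sketch of pulling them with the `datasets` library; the `train` split name is an assumption, not something the card states:

```python
from datasets import load_dataset

# Dataset ids come from the card's front matter; the "train" split is assumed.
books = load_dataset("jorgeortizfuentes/spanish_books", split="train")
chilean = load_dataset("jorgeortizfuentes/small-chilean-spanish-corpus", split="train")

print(books)
print(chilean)
```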
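Since the card keeps `pipeline_tag: fill-mask`, the model is queried like any BERT masked LM. A minimal usage sketch; the repo id below is a placeholder for wherever this checkpoint is actually published:

```python
from transformers import pipeline

# Placeholder repo id -- substitute the real id of this model repository.
fill_mask = pipeline("fill-mask", model="jorgeortizfuentes/tulio")

# BERT-style checkpoints use the [MASK] token.
for pred in fill_mask("Santiago es la [MASK] de Chile."):
    print(pred["token_str"], round(pred["score"], 3))
```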
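The hyperparameters the card lists (learning_rate 5e-05, linear scheduler, 2 epochs) map onto `TrainingArguments` roughly as below; everything the diff does not show (batch sizes, optimizer, seed) is left at library defaults, so this is a sketch rather than the authors' exact configuration:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tulio-finetuned",  # hypothetical output path
    learning_rate=5e-5,            # from the card
    lr_scheduler_type="linear",    # from the card
    num_train_epochs=2.0,          # from the card
)
```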