Iñigo López-Riobóo Botana committed on
Commit 6f72295
1 Parent(s): b27c12e

Update README.md

Files changed (1): README.md +7 -2
README.md CHANGED
````diff
@@ -9,6 +9,10 @@ tags:
 - gpt
 - gpt2
 - text-generation
+- spanish
+- dialogpt
+- chitchat
+- ITG
 inference: false
 ---

@@ -23,7 +27,7 @@ We used one of the datasets available in the [Bot Framework Tools repository](ht

 ## Example inference script

-### Check at this example script to run this model in inference mode
+### Check out this example script to run our model in inference mode

 ```python
 import torch
@@ -84,7 +88,7 @@ for i in range(CHAT_TURNS):
 | Warmup training steps (%) | 6% |
 | Weight decay | 0.01 |
 | Optimiser (beta1, beta2, epsilon) | AdamW (0.9, 0.999, 1e-08) |
-| Monitoring metric (delta, patience) | validation loss (0.1, 3) |
+| Monitoring metric (delta, patience) | Validation loss (0.1, 3) |


 ## Fine-tuning in a different dataset or style
@@ -94,6 +98,7 @@ You can check the [original GitHub repository](https://github.com/microsoft/Dial

 ## Limitations

+- This model uses the original English-based tokenizer from the [GPT-2 paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf). Spanish tokenization is not considered, but the two languages share similarities in grammatical structure for encoding text. This overlap may help the model transfer its knowledge from English to Spanish.
 - This model is intended to be used **just for single-turn chitchat conversations in Spanish**.
 - This model's generation capabilities are limited to the extent of the aforementioned fine-tuning dataset.
 - This model generates short answers, providing general context dialogue in a professional style.
````
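The diff truncates the example inference script after `import torch`, and the later hunk header shows the script loops over `for i in range(CHAT_TURNS):`. For context, here is a minimal sketch of the usual DialoGPT-style single-turn chat loop with the Hugging Face `transformers` API; the checkpoint id and the `CHAT_TURNS` value are assumptions, not taken from the README:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint id: the tags added in this commit (spanish, dialogpt,
# chitchat, ITG) suggest an ITG Spanish DialoGPT model, but the diff never shows it.
CHECKPOINT = "ITG/DialoGPT-medium-spanish-chitchat"
CHAT_TURNS = 5  # assumed value; the real script defines its own

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForCausalLM.from_pretrained(CHECKPOINT)
model.eval()

for i in range(CHAT_TURNS):
    # Read one user message and append the end-of-sequence token,
    # the turn separator DialoGPT-style models expect.
    user_input = input(f">> User ({i + 1}/{CHAT_TURNS}): ")
    input_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors="pt")

    # Generate a reply; the card says the model targets short, single-turn answers.
    with torch.no_grad():
        output_ids = model.generate(
            input_ids,
            max_length=100,
            pad_token_id=tokenizer.eos_token_id,
        )

    # Decode only the newly generated tokens, skipping the echoed prompt.
    reply = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
    print(f"Bot: {reply}")
```

Because the card limits the model to single-turn chitchat, this loop deliberately does not concatenate previous turns into the prompt.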
 
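The hyperparameter table touched by this commit records AdamW (0.9, 0.999, 1e-08), 0.01 weight decay, 6% warmup, and early stopping on validation loss with delta 0.1 and patience 3. A sketch of how those values could map onto `transformers.TrainingArguments` and `EarlyStoppingCallback`; the output directory and the evaluation/save strategies are assumptions, not taken from the actual training code:

```python
from transformers import EarlyStoppingCallback, TrainingArguments

# Values copied from the hyperparameter table in the diff; everything
# marked "assumed" or "placeholder" is not shown in the README.
args = TrainingArguments(
    output_dir="dialogpt-spanish-chitchat",  # placeholder path
    weight_decay=0.01,                       # "Weight decay | 0.01"
    adam_beta1=0.9,                          # "AdamW (0.9, 0.999, 1e-08)"
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    warmup_ratio=0.06,                       # "Warmup training steps (%) | 6%"
    evaluation_strategy="epoch",             # assumed; renamed eval_strategy in newer releases
    save_strategy="epoch",                   # assumed; must match evaluation_strategy
    load_best_model_at_end=True,             # required for early stopping
    metric_for_best_model="eval_loss",       # monitor validation loss
    greater_is_better=False,
)

# "Monitoring metric (delta, patience) | Validation loss (0.1, 3)"
early_stopping = EarlyStoppingCallback(
    early_stopping_patience=3,
    early_stopping_threshold=0.1,
)

# Both objects would then be passed to transformers.Trainer(model=..., args=args,
# callbacks=[early_stopping], ...) along with the datasets, which the diff does not show.
```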