Iñigo López-Riobóo Botana committed
Commit · 6f72295
1 Parent(s): b27c12e
Update README.md
README.md CHANGED
@@ -9,6 +9,10 @@ tags:
 - gpt
 - gpt2
 - text-generation
+- spanish
+- dialogpt
+- chitchat
+- ITG
 inference: false
 ---
 
@@ -23,7 +27,7 @@ We used one of the datasets available in the [Bot Framework Tools repository](ht
 
 ## Example inference script
 
-### Check at this example script to run
+### Check this example script to run our model in inference mode
 
 ```python
 import torch
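The README's own inference script is truncated by the diff context above (it ends at `import torch`), so here is a minimal single-turn sketch of how a DialoGPT-style checkpoint like this one is typically run with `transformers`. The checkpoint id and the generation parameters are assumptions for illustration, not the README's actual values.

```python
# Minimal single-turn inference sketch (NOT the README's full script, which the
# diff truncates after `import torch`). The checkpoint id and generation
# parameters below are assumptions; substitute the repository's actual values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CHECKPOINT = "ITG/DialoGPT-medium-spanish-chitchat"  # assumed model id

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForCausalLM.from_pretrained(CHECKPOINT)
model.eval()

# DialoGPT-style models expect each turn to end with the end-of-sequence token.
user_input = "Hola, ¿qué tal ha ido la reunión?"
input_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors="pt")

# Generate a short reply; the sampling settings here are illustrative only.
with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        max_length=128,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True,
        top_k=50,
        top_p=0.95,
    )

# Decode only the newly generated tokens (the reply), not the prompt.
reply = tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)
print(reply)
```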
@@ -84,7 +88,7 @@ for i in range(CHAT_TURNS):
 | Warmup training steps (%) | 6% |
 | Weight decay | 0.01 |
 | Optimiser (beta1, beta2, epsilon) | AdamW (0.9, 0.999, 1e-08) |
-| Monitoring metric (delta, patience) |
+| Monitoring metric (delta, patience) | Validation loss (0.1, 3) |
 
 
 ## Fine-tuning in a different dataset or style
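For readers who want to reproduce a comparable setup, the hyperparameters in the table above map roughly onto the Hugging Face `Trainer` API as sketched below. The original fine-tuning code is not shown in this diff, so the argument names and the per-epoch evaluation schedule are assumptions.

```python
# Sketch of how the table's hyperparameters map onto the Hugging Face Trainer API.
# The original fine-tuning code is not part of this diff; argument names and the
# evaluation schedule are assumptions.
from transformers import TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="dialogpt-spanish-chitchat",
    warmup_ratio=0.06,                   # Warmup training steps (%): 6%
    weight_decay=0.01,                   # Weight decay: 0.01
    adam_beta1=0.9,                      # AdamW beta1
    adam_beta2=0.999,                    # AdamW beta2
    adam_epsilon=1e-08,                  # AdamW epsilon
    eval_strategy="epoch",               # `evaluation_strategy` on older transformers versions
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",   # Monitoring metric: validation loss
    greater_is_better=False,
)

# Early stopping on validation loss with delta 0.1 and patience 3, as in the table.
early_stopping = EarlyStoppingCallback(
    early_stopping_patience=3,
    early_stopping_threshold=0.1,
)
```

Passing `early_stopping` in the `callbacks` list of a `Trainer` built with these arguments would stop training once the validation loss fails to improve by at least 0.1 for three consecutive evaluations.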
@@ -94,6 +98,7 @@ You can check the [original GitHub repository](https://github.com/microsoft/Dial
 
 ## Limitations
 
+- This model uses the original English-based tokenizer from the [GPT-2 paper](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf). Spanish tokenization was not considered, but Spanish and English share enough grammatical structure for the encoding to work, and this overlap may help the model transfer its knowledge from English to Spanish.
 - This model is intended to be used **just for single-turn chitchat conversations in Spanish**.
 - This model's generation capabilities are limited to the extent of the aforementioned fine-tuning dataset.
 - This model generates short answers, providing general context dialogue in a professional style.
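To make the first limitation in the hunk above concrete, the snippet below (an illustrative sketch, not part of the README) shows the stock English GPT-2 tokenizer encoding a Spanish sentence: it works, but accented characters and Spanish-specific word forms are split into more byte-level BPE pieces than a Spanish-specific tokenizer would produce.

```python
# Illustrative sketch: the English GPT-2 byte-level BPE tokenizer can encode Spanish,
# just less compactly than an in-language tokenizer would.
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

spanish = "¿Qué tal ha ido la reunión de esta mañana?"
tokens = tokenizer.tokenize(spanish)

print(len(tokens))  # more subword pieces than an equivalent English sentence would need
print(tokens)       # accented characters appear as byte-level fragments
```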