Iñigo López-Riobóo Botana committed on
Commit
523c7a4
1 Parent(s): 8e5fab5

Update README.md

Files changed (1)
  1. README.md +8 -3
README.md CHANGED
@@ -20,8 +20,13 @@ inference: false
 
  ## Description
 
- This is a **transformer-decoder** [GPT-2 model](https://huggingface.co/gpt2), adapted for **single-turn dialogue tasks in Spanish**. We fine-tuned a [DialoGPT-medium](https://huggingface.co/microsoft/DialoGPT-medium) 345M parameters model from Microsoft, following the CLM (Causal Language Modelling) objective.
- We used one of the datasets available in the [Bot Framework Tools repository](https://github.com/microsoft/botframework-cli). We processed [the professional-styled personality chat dataset in Spanish](https://github.com/microsoft/botframework-cli/blob/main/packages/qnamaker/docs/chit-chat-dataset.md), the file is available [here to download](https://qnamakerstore.blob.core.windows.net/qnamakerdata/editorial/spanish/qna_chitchat_professional.tsv)
+ This is a **transformer-decoder** [GPT-2 model](https://huggingface.co/gpt2), adapted for the **single-turn dialogue task in Spanish**. We fine-tuned a [DialoGPT-medium](https://huggingface.co/microsoft/DialoGPT-medium) 345M parameter model from Microsoft, following the CLM (Causal Language Modelling) objective.
+
+ ---
+
+ ## Dataset
+
+ We used one of the datasets available in the [Bot Framework Tools repository](https://github.com/microsoft/botframework-cli). We processed [the professional-styled personality chat dataset in Spanish](https://github.com/microsoft/botframework-cli/blob/main/packages/qnamaker/docs/chit-chat-dataset.md), the file is available [to download here](https://qnamakerstore.blob.core.windows.net/qnamakerdata/editorial/spanish/qna_chitchat_professional.tsv)
 
  ---
 
@@ -105,4 +110,4 @@ You can check the [original GitHub repository](https://github.com/microsoft/Dial
  > Since our approach can assign a probability to any Unicode string, this allows us to evaluate our LMs on any dataset regardless of pre-processing, tokenization, or vocab size.
  - This model is intended to be used **just for single-turn chitchat conversations in Spanish**.
  - This model's generation capabilities are limited to the extent of the aforementioned fine-tuning dataset.
- - This model generates short answers, providing general context dialogue in a professional style.
+ - This model generates short answers, providing general context dialogue in a professional style for the Spanish language.
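
The new Dataset section describes processing a question/answer TSV for single-turn fine-tuning under the CLM objective. As a rough illustration only, the sketch below shows one way such rows could be turned into training strings. It assumes the TSV has `Question` and `Answer` columns, and it joins the two turns with the GPT-2 end-of-text token, which is the turn separator DialoGPT uses. The function name `build_clm_examples` is hypothetical, and nothing here is confirmed to be the preprocessing actually used for this model.

```python
import csv
import io

# GPT-2 / DialoGPT end-of-text token, used here as the turn separator.
EOS = "<|endoftext|>"


def build_clm_examples(tsv_text, question_col="Question", answer_col="Answer"):
    """Turn each question/answer row into one single-turn training string.

    The column names are assumptions about the TSV header, not taken
    from the README.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    examples = []
    for row in reader:
        question = (row.get(question_col) or "").strip()
        answer = (row.get(answer_col) or "").strip()
        # Skip incomplete rows: a single-turn example needs both sides.
        if question and answer:
            examples.append(f"{question}{EOS}{answer}{EOS}")
    return examples


# Tiny inline sample standing in for the downloaded file.
sample = "Question\tAnswer\n¿Cómo estás?\tMuy bien, gracias.\n"
print(build_clm_examples(sample))
```

Each resulting string could then be tokenized and fed to a causal language model, which learns to continue the question with the answer.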