Update README.md
## Prompt formatting

In sticking with the theme of the bagel, I didn't want to use a single prompt format, so I used 4 - vicuna, llama-2, alpaca, and a modified chat-ml.
I also didn't want to randomly select a single prompt format for each item (hoping each instruction would generalize more when used in a variety of prompt formats), so each instruction is converted into every prompt format (with 0.75 probability).

This means each epoch of our fine-tune is the equivalent of 3 epochs: with 4 formats each kept with probability 0.75, each instruction appears 4 × 0.75 = 3 times on average.
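The expansion step described above can be sketched in Python. The formatter strings below are illustrative stand-ins, not the exact templates the bagel repo uses, and `expand_item` is a hypothetical helper name:

```python
import random

# Illustrative formatters only - the real templates live in the bagel repo.
PROMPT_FORMATS = {
    "vicuna": lambda ins, out: f"USER: {ins}\nASSISTANT: {out}",
    "llama-2": lambda ins, out: f"[INST] {ins} [/INST] {out}",
    "alpaca": lambda ins, out: f"### Instruction:\n{ins}\n\n### Response:\n{out}",
    "chat-ml": lambda ins, out: f"{{bos}}user\n{ins}\n{{eos}}\n{{bos}}assistant\n{out}\n{{eos}}",
}

def expand_item(instruction, response, p=0.75, rng=random):
    """Convert one item into every prompt format, keeping each with probability p.

    With 4 formats and p = 0.75, the expected number of copies per item is
    4 * 0.75 = 3, which is why one epoch ~= 3 epochs over unique data.
    """
    return [
        fmt(instruction, response)
        for fmt in PROMPT_FORMATS.values()
        if rng.random() < p
    ]
```

With `p=1.0` every item yields all 4 renderings; at the default `p=0.75` you get 3 on average.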
</details>

<details>
<summary><b>ChatML (sort of)</b></summary>

ChatML special tokens are really obnoxious, so instead of enlarging the tokenizer and embedding layers (which decreases performance and causes inference problems in tensor parallelism), I just use BOS and EOS tokens instead of `<|im_start|>` and `<|im_end|>` - and no, I won't change this.

```text
{bos}{role}
{text}
{eos}
```
</details>
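A minimal sketch of rendering a conversation with this template. The `<s>`/`</s>` strings here are assumed llama-style BOS/EOS markers; in practice the model's tokenizer supplies the actual special tokens, typically via its own chat template:

```python
# Assumed llama-style special tokens; the model's tokenizer defines the real ones.
BOS, EOS = "<s>", "</s>"

def render_chat(messages):
    """Render [{'role': ..., 'content': ...}] turns as {bos}{role}\n{text}\n{eos}."""
    return "\n".join(f"{BOS}{m['role']}\n{m['content']}\n{EOS}" for m in messages)

chat = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Why boil bagels before baking?"},
]
print(render_chat(chat))
```

Because BOS and EOS already exist in the base tokenizer, no `add_special_tokens`/embedding-resize step is needed, which is the design choice the paragraph above defends.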