SanjiWatsuki
commited on
Commit
·
b6bfd65
1
Parent(s):
d567e49
Update README.md
Browse files
README.md
CHANGED
@@ -22,7 +22,7 @@ language:
|
|
22 |
|
23 |
Lelantos-7B is a merge with a twist. Many of the existing merged models which score highly on the Open LLM Leaderboard often have weird issues in real world use. When I tested models like the heavily merged [Marcoroni-v3](https://huggingface.co/AIDC-ai-business/Marcoroni-7B-v3) derivatives, I would often see surprisingly poor MT-Bench scores. I suspect that removing the special tokens (like their EOS token!) in Frankenmerges negatively impacted some of these models.
|
24 |
|
25 |
-
Lelantos-7B is a merger of deeply merged everything-on-a-bagel models but with the EOS token remapped from
|
26 |
|
27 |
By weight, it's almost entirely DPOpenHermes-v2 but those extra bits from the merger of mergers and Stealth v1.2 really help it shine.
|
28 |
|
|
|
22 |
|
23 |
Lelantos-7B is a merge with a twist. Many of the existing merged models which score highly on the Open LLM Leaderboard often have weird issues in real world use. When I tested models like the heavily merged [Marcoroni-v3](https://huggingface.co/AIDC-ai-business/Marcoroni-7B-v3) derivatives, I would often see surprisingly poor MT-Bench scores. I suspect that removing the special tokens (like their EOS token!) in Frankenmerges negatively impacted some of these models.
|
24 |
|
25 |
+
Lelantos-7B is a merger of deeply merged everything-on-a-bagel models but with the EOS token remapped from `</s>` to `<im_end>` through manually editing the tokenizer JSONs. MergeKit, under the hood, will remap this properly when merged back with a proper ChatML model like [DPOpenHermes-v2](https://huggingface.co/openaccess-ai-collective/DPOpenHermes-7B-v2) that has the special <im_end> token still mapped. Additionally, I merged in [jan-hq/stealth-v1.2](https://huggingface.co/jan-hq/stealth-v1.2) - a model which I found to be unremarkable by itself but shockingly effective when used as an extra seasoning on the merger (also, it's a ChatML model).
|
26 |
|
27 |
By weight, it's almost entirely DPOpenHermes-v2 but those extra bits from the merger of mergers and Stealth v1.2 really help it shine.
|
28 |
|