SanjiWatsuki committed on
Commit
b6bfd65
·
1 Parent(s): d567e49

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -22,7 +22,7 @@ language:
22
 
23
  Lelantos-7B is a merge with a twist. Many of the existing merged models which score highly on the Open LLM Leaderboard often have weird issues in real world use. When I tested models like the heavily merged [Marcoroni-v3](https://huggingface.co/AIDC-ai-business/Marcoroni-7B-v3) derivatives, I would often see surprisingly poor MT-Bench scores. I suspect that removing the special tokens (like their EOS token!) in Frankenmerges negatively impacted some of these models.
24
 
25
- Lelantos-7B is a merger of deeply merged everything-on-a-bagel models but with the EOS token remapped from </s> to <im_end> through manually editing the tokenizer JSONs. MergeKit, under the hood, will remap this properly when merged back with a proper ChatML model like [DPOpenHermes-v2](https://huggingface.co/openaccess-ai-collective/DPOpenHermes-7B-v2) that has the special <im_end> token still mapped. Additionally, I merged in [jan-hq/stealth-v1.2](https://huggingface.co/jan-hq/stealth-v1.2) - a model which I found to be unremarkable by itself but shockingly effective when used as an extra seasoning on the merger (also, it's a ChatML model).
26
 
27
  By weight, it's almost entirely DPOpenHermes-v2 but those extra bits from the merger of mergers and Stealth v1.2 really help it shine.
28
 
 
22
 
23
  Lelantos-7B is a merge with a twist. Many of the existing merged models which score highly on the Open LLM Leaderboard often have weird issues in real world use. When I tested models like the heavily merged [Marcoroni-v3](https://huggingface.co/AIDC-ai-business/Marcoroni-7B-v3) derivatives, I would often see surprisingly poor MT-Bench scores. I suspect that removing the special tokens (like their EOS token!) in Frankenmerges negatively impacted some of these models.
24
 
25
+ Lelantos-7B is a merger of deeply merged everything-on-a-bagel models but with the EOS token remapped from `</s>` to `<im_end>` through manually editing the tokenizer JSONs. MergeKit, under the hood, will remap this properly when merged back with a proper ChatML model like [DPOpenHermes-v2](https://huggingface.co/openaccess-ai-collective/DPOpenHermes-7B-v2) that has the special <im_end> token still mapped. Additionally, I merged in [jan-hq/stealth-v1.2](https://huggingface.co/jan-hq/stealth-v1.2) - a model which I found to be unremarkable by itself but shockingly effective when used as an extra seasoning on the merger (also, it's a ChatML model).
26
 
27
  By weight, it's almost entirely DPOpenHermes-v2 but those extra bits from the merger of mergers and Stealth v1.2 really help it shine.
28
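
The README describes remapping the EOS token by hand-editing the tokenizer JSONs but does not show the edit itself. The following is a minimal sketch of what such an edit could look like, not the author's actual procedure. The helper names `remap_eos` and `remap_eos_file` are hypothetical, and the assumption that the EOS token is declared under an `eos_token` key (as either a plain string or an object with a `content` field) is based on the usual layout of Hugging Face tokenizer config files, not on anything stated in this commit. The token strings are used exactly as written in the README.

```python
import copy
import json
from pathlib import Path


def remap_eos(config: dict, old: str = "</s>", new: str = "<im_end>") -> dict:
    """Return a copy of a tokenizer config dict with its EOS token renamed.

    Hypothetical helper: assumes the EOS token lives under "eos_token",
    either as a plain string or as a dict with a "content" field.
    Configs whose EOS token is not `old` are returned unchanged.
    """
    updated = copy.deepcopy(config)
    eos = updated.get("eos_token")
    if isinstance(eos, dict):
        if eos.get("content") == old:
            eos["content"] = new
    elif eos == old:
        updated["eos_token"] = new
    return updated


def remap_eos_file(path: Path, old: str = "</s>", new: str = "<im_end>") -> None:
    """Apply remap_eos to a JSON file in place (hypothetical helper)."""
    config = json.loads(path.read_text(encoding="utf-8"))
    remapped = remap_eos(config, old, new)
    path.write_text(
        json.dumps(remapped, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )


if __name__ == "__main__":
    # In-memory example covering both shapes the field is assumed to take.
    print(remap_eos({"eos_token": "</s>"}))
    print(remap_eos({"eos_token": {"content": "</s>", "lstrip": False}}))
```

This sketch only changes which string is declared as the EOS token. It does not add the new token to the vocabulary or touch any model weights, so it is only meaningful when the merged-in ChatML model already has that token mapped, as the README describes.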