SanjiWatsuki
/

Lelantos-7B

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

SanjiWatsuki commited on Dec 30, 2023

Commit

b6bfd65

·

1 Parent(s): d567e49

Update README.md

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -22,7 +22,7 @@ language:
 Lelantos-7B is a merge with a twist. Many of the existing merged models which score highly on the Open LLM Leaderboard often have weird issues in real world use. When I tested models like the heavily merged [Marcoroni-v3](https://huggingface.co/AIDC-ai-business/Marcoroni-7B-v3) derivatives, I would often see surprisingly poor MT-Bench scores. I suspect that removing the special tokens (like their EOS token!) in Frankenmerges negatively impacted some of these models.
-Lelantos-7B is a merger of deeply merged everything-on-a-bagel models but with the EOS token remapped from </s> to <im_end> through manually editing the tokenizer JSONs. MergeKit, under the hood, will remap this properly when merged back with a proper ChatML model like [DPOpenHermes-v2](https://huggingface.co/openaccess-ai-collective/DPOpenHermes-7B-v2) that has the special <im_end> token still mapped. Additionally, I merged in [jan-hq/stealth-v1.2](https://huggingface.co/jan-hq/stealth-v1.2) - a model which I found to be unremarkable by itself but shockingly effective when used as an extra seasoning on the merger (also, it's a ChatML model).
 By weight, it's almost entirely DPOpenHermes-v2 but those extra bits from the merger of mergers and Stealth v1.2 really help it shine.

 Lelantos-7B is a merge with a twist. Many of the existing merged models which score highly on the Open LLM Leaderboard often have weird issues in real world use. When I tested models like the heavily merged [Marcoroni-v3](https://huggingface.co/AIDC-ai-business/Marcoroni-7B-v3) derivatives, I would often see surprisingly poor MT-Bench scores. I suspect that removing the special tokens (like their EOS token!) in Frankenmerges negatively impacted some of these models.
+Lelantos-7B is a merger of deeply merged everything-on-a-bagel models but with the EOS token remapped from `</s>` to `<im_end>` through manually editing the tokenizer JSONs. MergeKit, under the hood, will remap this properly when merged back with a proper ChatML model like [DPOpenHermes-v2](https://huggingface.co/openaccess-ai-collective/DPOpenHermes-7B-v2) that has the special <im_end> token still mapped. Additionally, I merged in [jan-hq/stealth-v1.2](https://huggingface.co/jan-hq/stealth-v1.2) - a model which I found to be unremarkable by itself but shockingly effective when used as an extra seasoning on the merger (also, it's a ChatML model).
 By weight, it's almost entirely DPOpenHermes-v2 but those extra bits from the merger of mergers and Stealth v1.2 really help it shine.