title: README
emoji: π
colorFrom: indigo
colorTo: red
sdk: static
pinned: false
The merge crew is the mergiest crew. π
## Merge Crew Planning Document
https://docs.google.com/document/d/1fP2FIrCifWcLGdTBmqeogdCdZJOwxqPfEyO-HA76_qc/edit?usp=sharing
## Merging tutorial
https://huggingface.co/blog/mlabonne/merge-models
## Colab for merging
LazyMergekit notebook for merging models.
https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing
## Model Merging Discord
Discord channel for discussions of model merging.
https://discord.com/channels/905500133343518791/1202582325146161183
## Merge methods
When merging with mergekit, you describe the merge in a YAML file like the following:
```
models:
  - model: timpal0l/BeagleCatMunin
    # No parameters necessary for base model
  - model: bineric/NorskGPT-Mistral-7b
    parameters:
      density: 0.53
      weight: 0.6
merge_method: dare_ties
base_model: timpal0l/BeagleCatMunin
parameters:
  int8_mask: true
dtype: bfloat16
random_seed: 42
```
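
To show how a config like this might be used outside the Colab notebook, here is a minimal sketch that writes it to disk and builds the `mergekit-yaml <config> <output dir>` command. The helper name `prepare_merge` and the file paths are hypothetical, chosen only for illustration, and the sketch assumes mergekit is installed wherever the command is eventually run.

```python
from pathlib import Path

# The merge configuration from above, kept as plain text so no YAML library is needed.
CONFIG = """\
models:
  - model: timpal0l/BeagleCatMunin
    # No parameters necessary for base model
  - model: bineric/NorskGPT-Mistral-7b
    parameters:
      density: 0.53
      weight: 0.6
merge_method: dare_ties
base_model: timpal0l/BeagleCatMunin
parameters:
  int8_mask: true
dtype: bfloat16
random_seed: 42
"""


def prepare_merge(config_text: str, config_path: str, output_dir: str) -> list[str]:
    """Hypothetical helper: write the config and return the command as an argument list."""
    Path(config_path).write_text(config_text, encoding="utf-8")
    # `mergekit-yaml` takes the config file first and the output directory second.
    return ["mergekit-yaml", config_path, output_dir]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where mergekit is installed. Nothing is downloaded or merged until that call is made.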
The dare_ties method seems to perform better than other merging methods.
Learn more about TIES merging here.
https://arxiv.org/pdf/2306.01708.pdf
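
To give a feel for what `density` and `weight` control, here is a toy sketch of the DARE-TIES idea on flat lists of floats. It is not mergekit's implementation: the function names and the `seed` argument are hypothetical, and the final step simply sums the deltas that agree with the elected sign. Each fine-tuned model's delta from the base is randomly thinned so that roughly a `density` fraction of entries survive, the survivors are rescaled by `1 / density`, each delta is scaled by its `weight`, and only entries matching the elected sign are added back to the base.

```python
import random


def dare_sparsify(delta, density, rng):
    """DARE step: keep each entry with probability `density`, rescale survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def elect_signs(weighted_deltas):
    """TIES step: per parameter, take the sign of the summed weighted deltas."""
    return [1.0 if sum(column) >= 0 else -1.0 for column in zip(*weighted_deltas)]


def dare_ties_merge(base, finetuned, densities, weights, seed=42):
    """Toy merge of flat parameter lists; `seed` only makes the random drop reproducible."""
    rng = random.Random(seed)
    weighted = []
    for model, density, weight in zip(finetuned, densities, weights):
        delta = [m - b for m, b in zip(model, base)]  # task vector relative to the base
        sparse = dare_sparsify(delta, density, rng)
        weighted.append([weight * d for d in sparse])
    signs = elect_signs(weighted)
    merged = []
    for i, b in enumerate(base):
        # Only deltas whose sign matches the elected sign contribute.
        agreeing = sum(w[i] for w in weighted if w[i] * signs[i] > 0)
        merged.append(b + agreeing)
    return merged
```

With a `density` of 1.0 nothing is dropped, so the result depends only on the weights and the sign election. Lower densities discard more of each delta and compensate by scaling up what remains.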