---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- gordicaleksa/YugoGPT
- HuggingFaceH4/zephyr-7b-beta
---

# Zamfir-7B-slerp

Zamfir-7B-slerp is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):

* [gordicaleksa/YugoGPT](https://huggingface.co/gordicaleksa/YugoGPT)
* [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)

## 🧩 Configuration

```yaml
slices:
  - sources:
      - model: gordicaleksa/YugoGPT
        layer_range: [0, 32]
      - model: HuggingFaceH4/zephyr-7b-beta
        layer_range: [0, 32]
merge_method: slerp
base_model: HuggingFaceH4/zephyr-7b-beta
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
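
To make the `t` schedule in the configuration easier to read, here is a minimal Python sketch of spherical linear interpolation (slerp) and of how a list such as `[0, 0.5, 0.3, 0.7, 1]` could be spread across the 32 layers. It is an illustration, not mergekit's implementation: the function names `slerp` and `layer_t` are hypothetical, the piecewise-linear spreading of the anchor values over layer depth is an assumption, and the convention that `t = 0` yields the `base_model` weights and `t = 1` the other model follows mergekit's documentation as understood here.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two flat weight vectors.

    t = 0 returns v0 and t = 1 returns v1. Falls back to plain linear
    interpolation when the vectors are (anti-)parallel.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two vectors, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / max(norm0 * norm1, eps)
    dot = max(-1.0, min(1.0, dot))
    if abs(dot) > 1.0 - 1e-6:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

def layer_t(gradient, layer_idx, num_layers):
    """Assumed behaviour: piecewise-linear interpolation of the anchor
    values in `gradient` over the relative depth of `layer_idx`."""
    if num_layers == 1 or len(gradient) == 1:
        return gradient[0]
    pos = layer_idx / (num_layers - 1) * (len(gradient) - 1)
    lo = min(int(pos), len(gradient) - 2)
    frac = pos - lo
    return (1 - frac) * gradient[lo] + frac * gradient[lo + 1]

# Anchor values copied from the configuration above.
SELF_ATTN_T = [0, 0.5, 0.3, 0.7, 1]
MLP_T = [1, 0.5, 0.7, 0.3, 0]

for idx in (0, 8, 16, 24, 31):
    print(idx, round(layer_t(SELF_ATTN_T, idx, 32), 3), round(layer_t(MLP_T, idx, 32), 3))
```

Under these assumptions, the self-attention weights start at the base model (zephyr-7b-beta) in the first layer and end at YugoGPT in the last, the MLP weights run in the opposite direction, and all remaining tensors use a fixed `t` of 0.5.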

## Results

| Model     | ARC-E | ARC-C | Hellaswag | BoolQ | Winogrande | OpenbookQA | PiQA  | NQ Open | TriviaQA | **Avg.** |
|-----------|-------|-------|-----------|-------|------------|------------|-------|---------|----------|----------|
| Zamfir-7B | 51.85 | 32.25 | 46.03     | 75.59 | 62.59      | 26.00      | 66.81 | 16.09   | 36.11    | 45.92    |
| Mustra-7B | 52.95 | 33.70 | 45.89     | 77.55 | 64.17      | 30.60      | 67.25 | 15.40   | 34.84    | 46.93    |
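
The **Avg.** column matches the unweighted mean of the nine benchmark scores in each row. A short Python check, using only the numbers from the table (the variable names are illustrative):

```python
# Scores copied from the results table, in column order:
# ARC-E, ARC-C, Hellaswag, BoolQ, Winogrande, OpenbookQA, PiQA, NQ Open, TriviaQA
scores = {
    "Zamfir-7B": [51.85, 32.25, 46.03, 75.59, 62.59, 26.00, 66.81, 16.09, 36.11],
    "Mustra-7B": [52.95, 33.70, 45.89, 77.55, 64.17, 30.60, 67.25, 15.40, 34.84],
}

# Unweighted mean per model, rounded to two decimals as in the table.
averages = {name: round(sum(vals) / len(vals), 2) for name, vals in scores.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```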