File size: 3,940 Bytes

7886d18
 
 
419ba3c
 
 
 
679182a
7886d18
 
 
 
 
 
5db79d8
7886d18
3a89443
c00b53e
 
c2aaa2d
2470754
b556f64
aa84a25
c00b53e
3a89443
7886d18
 
aad8879
7886d18
 
 
 
aad8879
7886d18
 
 
 
419ba3c
 
679182a
419ba3c
7886d18
16d91e6
7886d18
 
7c2c628
7886d18
 
 
 
419ba3c
b723c6f
1c29bfa
 
 
 
 
 
 
 
 
 
 
 
b723c6f
 
 
5db79d8
1c29bfa
 
 
 
 
 
 
 
 
 
 
 
5db79d8
b723c6f
 
1c29bfa
 
 
 
 
 
 
 
 
 
 
 
419ba3c
b723c6f
 
 
7886d18
 
 
 
2a9d0da
7886d18
 
 
 
 
2ad5639
 
19abbcf
aad8879

---
base_model:
- v000000/L3.1-8B-RP-Test-003-Task_Arithmetic
- v000000/L3.1-Niitorm-8B-t0.0001
- Sao10K/L3.1-8B-Niitama-v1.1
- arcee-ai/Llama-3.1-SuperNova-Lite
- akjindal53244/Llama-3.1-Storm-8B
- arcee-ai/Llama-Spark
- v000000/L3.1-8B-RP-Test-002-Task_Arithmetic
- grimjim/Llama-3-Instruct-abliteration-LoRA-8B
library_name: transformers
tags:
- mergekit
- merge
- llama
---

# Llama-3.1-Storniitova-8B

Storniitova-8B is a RP/Instruct model built on the foundation of Llama-3.1-SuperNova-Lite, which is distilled from the 405B parameter variant of Llama-3.1

By only changing the vector tasks, I attempt to retain the full distillation while learning roleplaying capabilties.

-----------------------------------------------------------------------------------------------------------

# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit) and other proprietary tools.

## Merge Details
### Merge Method

This model was merged using the <b>SLERP, Task_Arithmetic and NEARSWAP</b> merge method.

### Models Merged

The following models were included in the merge:
* [v000000/L3.1-Niitorm-8B-t0.0001](https://huggingface.co/v000000/L3.1-Niitorm-8B-t0.0001)
* [akjindal53244/Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B)
* [arcee-ai/Llama-Spark](https://huggingface.co/arcee-ai/Llama-Spark)
* [arcee-ai/Llama-3.1-SuperNova-Lite](https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite)
* [v000000/L3.1-8B-RP-Test-003-Task_Arithmetic](https://huggingface.co/v000000/L3.1-8B-RP-Test-003-Task_Arithmetic)
* [Sao10K/L3.1-8B-Niitama-v1.1](https://huggingface.co/Sao10K/L3.1-8B-Niitama-v1.1) + [grimjim/Llama-3-Instruct-abliteration-LoRA-8B](https://huggingface.co/grimjim/Llama-3-Instruct-abliteration-LoRA-8B)
* [v000000/L3.1-8B-RP-Test-002-Task_Arithmetic](https://huggingface.co/v000000/L3.1-8B-RP-Test-002-Task_Arithmetic) + [grimjim/Llama-3-Instruct-abliteration-LoRA-8B](https://huggingface.co/grimjim/Llama-3-Instruct-abliteration-LoRA-8B)

### Recipe

The following YAML configuration was used to produce this model:

```yaml
#Step1 - Add smarts to Niitama with alchemonaut's algorithm.

slices:
  - sources:
      - model: Sao10K/L3.1-8B-Niitama-v1.1+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
        layer_range: [0, 32]
      - model: akjindal53244/Llama-3.1-Storm-8B
        layer_range: [0, 32]
merge_method: nearswap
base_model: Sao10K/L3.1-8B-Niitama-v1.1+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
parameters:
  t:
    - value: 0.0001
dtype: bfloat16
out_type: float16

#Step 2 - Learn vectors onto Supernova 0.4(Niitorm)

models:
  - model: arcee-ai/Llama-3.1-SuperNova-Lite
    parameters:
      weight: 1.0
  - model: v000000/L3.1-Niitorm-8B-t0.0001
    parameters:
      weight: 0.4
merge_method: task_arithmetic
base_model: arcee-ai/Llama-3.1-SuperNova-Lite
parameters:
    normalize: false
dtype: float16

#Step 3 - Fully learn vectors onto Supernova 1.25(Niitorm)

models:
  - model: arcee-ai/Llama-3.1-SuperNova-Lite
    parameters:
      weight: 0.0
  - model: v000000/L3.1-Niitorm-8B-t0.0001
    parameters:
      weight: 1.25
merge_method: task_arithmetic
base_model: arcee-ai/Llama-3.1-SuperNova-Lite
parameters:
    normalize: false
dtype: float16

#Step 4 - Merge checkpoints and keep output/input Supernova heavy
#Merge with a triangular slerp from sophosympatheia.

models:
  - model: v000000/L3.1-8B-RP-Test-003-Task_Arithmetic
merge_method: slerp
base_model: v000000/L3.1-8B-RP-Test-002-Task_Arithmetic+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
# This model needed some abliteration^
parameters:
  t:
    - value: [0, 0, 0.3, 0.4, 0.5, 0.6, 0.5, 0.4, 0.3, 0, 0]
dtype: float16

```

*SLERP distribution used to smoothly blend the mostly Supernova base with the 100% roleplay vectors:*
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/GP2LMRvMkhVJwNDSEC4oU.png)