Schisandra
Many thanks to the authors of the models used!
RPMax v1.1 | Pantheon-RP | UnslopSmall-v1 | Magnum V4 | ChatWaifu v2.0 | SorcererLM | Acolyte | NovusKyver | Meadowlark | Sunfall
Overview
Main uses: RP, Storywriting
Prompt format: Mistral-V3
An intelligent model that is attentive to details and has a low-slop writing style. This time with a stable tokenizer.
Oh, and it now contains 10 finetunes! Not sure if some of them actually contribute to the output, but it's nice to see the numbers growing.
Quants
Settings
My SillyTavern preset: https://huggingface.co/Nohobby/MS-Schisandra-22B-v0.2/resolve/main/ST-formatting-Schisandra.json
Merge Details
Merging steps
Step1
(Config partially taken from here)
base_model: spow12/ChatWaifu_v2.0_22B
parameters:
int8_mask: true
rescale: true
normalize: false
dtype: bfloat16
tokenizer_source: base
merge_method: della
models:
- model: Envoid/Mistral-Small-NovusKyver
parameters:
density: [0.35, 0.65, 0.5, 0.65, 0.35]
epsilon: [0.1, 0.1, 0.25, 0.1, 0.1]
lambda: 0.85
weight: [-0.01891, 0.01554, -0.01325, 0.01791, -0.01458]
- model: rAIfle/Acolyte-22B
parameters:
density: [0.6, 0.4, 0.5, 0.4, 0.6]
epsilon: [0.1, 0.1, 0.25, 0.1, 0.1]
lambda: 0.85
weight: [0.01847, -0.01468, 0.01503, -0.01822, 0.01459]
Step2
(Config partially taken from here)
base_model: InferenceIllusionist/SorcererLM-22B
parameters:
int8_mask: true
rescale: true
normalize: false
dtype: bfloat16
tokenizer_source: base
merge_method: della
models:
- model: crestf411/MS-sunfall-v0.7.0
parameters:
density: [0.35, 0.65, 0.5, 0.65, 0.35]
epsilon: [0.1, 0.1, 0.25, 0.1, 0.1]
lambda: 0.85
weight: [-0.01891, 0.01554, -0.01325, 0.01791, -0.01458]
- model: anthracite-org/magnum-v4-22b
parameters:
density: [0.6, 0.4, 0.5, 0.4, 0.6]
epsilon: [0.1, 0.1, 0.25, 0.1, 0.1]
lambda: 0.85
weight: [0.01847, -0.01468, 0.01503, -0.01822, 0.01459]
SchisandraVA2
(Config taken from here)
merge_method: della_linear
dtype: bfloat16
parameters:
normalize: true
int8_mask: true
tokenizer_source: base
base_model: TheDrummer/UnslopSmall-22B-v1
models:
- model: ArliAI/Mistral-Small-22B-ArliAI-RPMax-v1.1
parameters:
density: 0.55
weight: 1
- model: Gryphe/Pantheon-RP-Pure-1.6.2-22b-Small
parameters:
density: 0.55
weight: 1
- model: Step1
parameters:
density: 0.55
weight: 1
- model: allura-org/MS-Meadowlark-22B
parameters:
density: 0.55
weight: 1
- model: Step2
parameters:
density: 0.55
weight: 1
Schisandra-v0.2
dtype: bfloat16
tokenizer_source: base
merge_method: della_linear
parameters:
density: 0.5
base_model: SchisandraVA2
models:
- model: unsloth/Mistral-Small-Instruct-2409
parameters:
weight:
- filter: v_proj
value: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
- filter: o_proj
value: [1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1]
- filter: up_proj
value: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
- filter: gate_proj
value: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
- filter: down_proj
value: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- value: 0
- model: SchisandraVA2
parameters:
weight:
- filter: v_proj
value: [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
- filter: o_proj
value: [0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0]
- filter: up_proj
value: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- filter: gate_proj
value: [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
- filter: down_proj
value: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
- value: 1
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 30.22 |
IFEval (0-Shot) | 63.83 |
BBH (3-Shot) | 40.61 |
MATH Lvl 5 (4-Shot) | 19.94 |
GPQA (0-shot) | 11.41 |
MuSR (0-shot) | 10.67 |
MMLU-PRO (5-shot) | 34.85 |
- Downloads last month
- 15
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.
Model tree for Nohobby/MS-Schisandra-22B-v0.2
Merge model
this model
Collection including Nohobby/MS-Schisandra-22B-v0.2
Collection
4 items
•
Updated
Evaluation results
- strict accuracy on IFEval (0-Shot)Open LLM Leaderboard63.830
- normalized accuracy on BBH (3-Shot)Open LLM Leaderboard40.610
- exact match on MATH Lvl 5 (4-Shot)Open LLM Leaderboard19.940
- acc_norm on GPQA (0-shot)Open LLM Leaderboard11.410
- acc_norm on MuSR (0-shot)Open LLM Leaderboard10.670
- accuracy on MMLU-PRO (5-shot)test set Open LLM Leaderboard34.850