schonsense
committed on
Update README.md
README.md
@@ -9,6 +9,10 @@ tags:
 ---
 # flam-kit
 
+Building on the success of my last merge, I identified some weaknesses in the resulting model; this model is an attempt to address them.
+
+I created a wild model with a della merge (schonsense/Flamlama_70B_della), using models that each had strengths I hoped to incorporate into my existing SLERP model. To create this new model (schonsense/flam-kit), I SLERP-merged with my previous model (schonsense/Llama-3.3-70B-Inst-Ablit-Flammades-SLERP) as the base and gently brought in the wild flavors of the della-merged model. The intent was to keep the instruction following and proper model function, without having to stomp on the model with excessive sampling parameters, while changing its voice and capabilities. After a number of failures, I believe this model is finally a success. I find schonsense/flam-kit superior in most respects to schonsense/Llama-3.3-70B-Inst-Ablit-Flammades-SLERP, and it requires only the most modest sampling parameters to function well.
+
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
 ## Merge Details
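
For readers unfamiliar with the SLERP merge mentioned in the added README text, here is a minimal sketch of spherical linear interpolation between two weight vectors. It is not mergekit's implementation and does not reproduce the configuration used for this model. The function name `slerp`, the toy vectors, and the interpolation factor `t = 0.2` are all assumptions chosen for illustration. A small `t` stays close to the base, which is one way to read "gently brought in".

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors.

    t = 0 returns v0 (the base), t = 1 returns v1.
    Hypothetical helper for illustration; not mergekit's API.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    # A zero-length vector has no direction, so fall back to linear interpolation.
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Angle between the two vectors, with the cosine clamped for numerical safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    # Nearly parallel (or antiparallel) vectors: linear interpolation is stable here.
    if sin_theta < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Toy stand-ins for one tensor from the base model and one from the della merge.
base = [1.0, 0.0]
wild = [0.0, 1.0]

# Assumed interpolation factor: mostly base, with a small share of the other model.
blended = slerp(0.2, base, wild)
print(blended)
```

In a real merge the same interpolation would be applied tensor by tensor across both checkpoints, often with a different factor per layer or module type; those settings are not given in this commit.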