---
base_model:
- Nitral-Archive/nightwing3-r64-2-latest_test-train-10B
- Nitral-Archive/nightwing3-r64-1-latest_test-train-10B
library_name: transformers
tags:
- mergekit
- merge
license: other
language:
- en
---
# Noticed some weird behavior in the 4bpw exl2 quant; not sure whether it is confined to the quant or a model-related issue. However, after seeing some recent bugfixes regarding the targeting of LM training heads, among a few other things, I will be attempting to retrain this for comparison's sake.

# Base model: (Falcon3-10B)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/642265bc01c62c1e4102dc36/C6gY9vxCl3_SFzQLpLG0S.png)

# Prompt format: ChatML
```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

### The following YAML configuration was used to produce this model: (SLERP merge method)
```yaml
slices:
  - sources:
      - model: Nitral-Archive/nightwing3-r64-1-latest_test-train-10B
        layer_range: [0, 40]
      - model: Nitral-Archive/nightwing3-r64-2-latest_test-train-10B
        layer_range: [0, 40]
merge_method: slerp
base_model: Nitral-Archive/nightwing3-r64-1-latest_test-train-10B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.420
dtype: bfloat16
```
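
### Reproducing the merge (mergekit)

The config above can be fed straight to mergekit, either through the `mergekit-yaml` CLI or its Python entry points. Below is a minimal sketch, assuming the YAML is saved as `nightwing3-slerp.yaml` and a recent mergekit release; the `MergeConfiguration`/`run_merge`/`MergeOptions` names follow mergekit's documented Python usage and may differ slightly between versions.

```python
# Hedged sketch: re-run the SLERP merge above via mergekit's Python API.
# Assumes `pip install mergekit` and the config saved as nightwing3-slerp.yaml;
# option names may vary across mergekit versions.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("nightwing3-slerp.yaml", "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    out_path="./NightWing3-merged",      # hypothetical output directory
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # run the merge on GPU if available
        copy_tokenizer=True,             # carry the base model's tokenizer over
        lazy_unpickle=True,              # reduce peak RAM while loading shards
    ),
)
```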
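
### Example inference (transformers)

Prompts follow the ChatML format shown above. Below is a minimal transformers sketch; the repo id and sampling settings are placeholders, and if the tokenizer ships without a ChatML chat template the prompt can be assembled manually from the block in the prompt-format section instead.

```python
# Hedged sketch: ChatML-style inference with transformers.
# "Nitral-AI/NightWing3-merged" is a placeholder; substitute this repo's id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Nitral-AI/NightWing3-merged"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a short haiku about falcons."},
]

# apply_chat_template emits the ChatML layout shown above when the tokenizer
# carries a ChatML chat template; otherwise format the prompt by hand.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```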