Stellar Odyssey 12b v0.0

Join my dream, it's just the right time, whoa... Leave it all behind... Get ready now... Riise up into my world~

Listen to the song on youtube: https://www.youtube.com/watch?v=npyiiInMA0w

This is my second attempt at a model merge, This time, these models were used

  • mistralai/Mistral-Nemo-Base-2407
  • Sao10K/MN-12B-Lyra-v4
  • nothingiisreal/MN-12B-Starcannon-v2
  • Gryphe/Pantheon-RP-1.5-12b-Nemo

License for this model is: Apache 2.0 (due to the base model, Mistral Nemo Base 2407)

Intended Use case: Roleplay

Instruction Format: ChatML

Thank you to AuriAetherwiing for helping me merge the models.

Data?

This is a hard question to answer, I didn't add any data to the model itself, rather it's a merge of other models, so the data used for them applies to this model too, though it won't be the same.

Merge Details

This model was merged using the della_linear merge method using mistralai/Mistral-Nemo-Base-2407 as a base.

Models Merged

The following models were included in the merge:

  • Sao10K/MN-12B-Lyra-v4
  • Gryphe/Pantheon-RP-1.5-12b-Nemo
  • nothingiisreal/MN-12B-Starcannon-v2

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: C:\Users\\Downloads\Mergekit-Fixed\mergekit\Sao10K_MN-12B-Lyra-v4
    parameters:
      weight: 0.3
      density: 0.25
  - model: C:\Users\\Downloads\Mergekit-Fixed\mergekit\nothingiisreal_MN-12B-Starcannon-v2
    parameters:
      weight: 0.1
      density: 0.4
  - model: C:\Users\\Downloads\Mergekit-Fixed\mergekit\Gryphe_Pantheon-RP-1.5-12b-Nemo
    parameters:
      weight: 0.4
      density: 0.5
merge_method: della_linear
base_model: C:\Users\\Downloads\Mergekit-Fixed\mergekit\mistralai_Mistral-Nemo-Base-2407
parameters:
  epsilon: 0.05
  lambda: 1
merge_method: della_linear
dtype: bfloat16

Notes

Della_Linear: Refer to https://arxiv.org/abs/2406.11617 and https://arxiv.org/abs/2212.04089, as it is quite long to explain what Della_Linear is BFloat16: Brain Floating Point 16, a way to run models faster on Nvidia GPUs Density: Fraction of weights in differences from the base model to retain Epsilon: Maximum change in drop probability based on magnitude. Drop probabilities assigned will range

Downloads last month
28
Safetensors
Model size
12.2B params
Tensor type
BF16
ยท
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for ProdeusUnity/Stellar-Odyssey-12b-v0.0

Merges
1 model
Quantizations
11 models

Spaces using ProdeusUnity/Stellar-Odyssey-12b-v0.0 3

Collection including ProdeusUnity/Stellar-Odyssey-12b-v0.0