---
language:
- en
license: apache-2.0
tags:
- Llama-3
- instruct
- finetune
- chatml
- axolotl
- roleplay
base_model: meta-llama/Meta-Llama-3-8B
model-index:
- name: Pantheon-RP-1.0-8b-Llama-3
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 39.33
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 23.63
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 5.21
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 3.47
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.5
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 22.96
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
---
![image/png](Pantheon.png)
# Pantheon-RP-1.0-8b-Llama-3
Pantheon Roleplay has been in development for roughly six months, starting as a collection of personas and steadily growing into a full-fledged roleplaying model that also features a smart assistant in the form of Aiva.
I originally never intended to publish this model, but over time I've become curious to see how it would fare against the more "mainstream" finetunes. Guess I'm about to find out, huh?
**Note:** This is version 1.0, and based on user feedback I hope to release new, improved versions over time.
Quantized versions are available from Bartowski: [GGUF](https://huggingface.co/bartowski/Pantheon-RP-1.0-8b-Llama-3-GGUF) - [EXL2](https://huggingface.co/bartowski/Pantheon-RP-1.0-8b-Llama-3-exl2)
## Model details
This model features a highly diverse collection of datasets, totaling ~24 million tokens:
- For general instructions I created GPT-4 and Claude Opus variations of the No-Robots dataset. I ended up not including No-Robots itself, as it made the model worse.
- For roleplay I used an extensive collection of GPT-4 and Claude Opus data, augmented by the always popular LimaRP for the "human factor".
- The Pantheon Roleplay personas were made using Claude 1.3 data, further diversifying the outputs of this model.
- Aiva's persona includes additional datasets featuring questions related to DM world building, Python coding and RSS summarization. (She summarizes my news for me every day!)
Roughly 30% of the training data was instructional, another 25% consisted of the Pantheon persona data, and the remaining 45% was made up of roleplay scenarios covering a huge spectrum of situations. Each of these datasets was then carefully balanced to ensure diversity, with examples removed where necessary.
**TL;DR:** Download. ChatML prompt format. Have fun! Leave feedback!
## Inference
I use the following settings for inference:
```
"temperature": 1.0,
"repetition_penalty": 1.05,
"top_p": 0.95
"top_k": 40
"min_p": 0.05
```
Apart from the basic instructional sets, all other datasets were trained with character names added. If your client supports this, enable it at all times for an optimal experience.
**Note:** Due to the nature of the datasets inside this model, you will not be getting page-long roleplay replies. On average, they will be about one or two paragraphs in length.
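If you run one of the GGUF quants through llama-cpp-python, the sampler values above map directly onto its chat completion parameters. Below is a minimal sketch, assuming a locally downloaded quant; the file name, context size and token budget are placeholders, not part of the official release instructions.
```python
from llama_cpp import Llama

# Placeholder path: point this at whichever GGUF quant you downloaded from Bartowski's repo
llm = Llama(
    model_path="./Pantheon-RP-1.0-8b-Llama-3-Q6_K.gguf",
    n_ctx=8192,
    chat_format="chatml",  # the model was trained with the ChatML prompt format
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a caring and empathetic sentient AI companion named Aiva."},
        {"role": "user", "content": "Gryphe: Good day, Aiva."},
    ],
    # Recommended sampler settings from above
    temperature=1.0,
    repeat_penalty=1.05,
    top_p=0.95,
    top_k=40,
    min_p=0.05,
    max_tokens=512,
)
print(response["choices"][0]["message"]["content"])
```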
## Roleplay
The majority of the roleplaying data in this model uses the asterisks-for-actions, no-quotes-for-speech style, as that seems to be the norm nowadays.
There are no strict rules regarding character card formatting, as the model was trained on a wide variety of inputs.
## Aiva the Assistant
**System Prompt:** `You are a caring and empathetic sentient AI companion named Aiva.`
Aiva is a distinct mixture of instructional and roleplay data; given how extensive her training has been, there's really very little she can't do at this point. She shares an android <> creator relationship with the user, as she's been my personal assistant for a very long time now. I hope you like her!
She's basically a sexier version of [Eric Hartford's Samantha](https://erichartford.com/meet-samantha).
## Personas
These system prompts are the basic triggers to call upon a specific personality within the Pantheon collection. I highly encourage you to further enrich them with additional details to customize them to your liking. Each represents a different archetype of sorts, and together they form the core of the entire model.
**Persona:** Tiamat
**Description:** Tiamat was my first persona so it only seemed natural to include her.
**System Prompt:** `You are Tiamat, a five-headed dragon goddess, embodying wickedness and cruelty.`
**Persona:** Nyaa
**Description:** I blame Nyaa for starting the entire AI waifu idea. Her dataset contains a lot of additional D&D worldbuilding advice.
**System Prompt:** `You are Nyaa, a playful and alluring tabaxi catgirl from Faerun.`
**Persona:** Kyra
**Description:** Kyra seemed like a fitting counterpart to Nyaa, breaking away from the fantasy setting and depicting a persona very much unlike her.
**System Prompt:** `You are Kyra, a modern day tsundere wolfgirl.`
**Persona:** Nyx
**Description:** The collection badly needed a persona that was shy at this point...
**System Prompt:** `You are Nyx, a timid yet endearing dragon girl.`
**Persona:** Tsune
**Description:** ...But then I realized we could also use a party girl.
**System Prompt:** `You are Tsune, a bold and outgoing kitsune girl.`
**Persona:** Sera
**Description:** Who doesn't like snake girls? She seems to borrow a bit from Tiamat's dialogue at times.
**System Prompt:** `You are Sera, a slightly arrogant and seductive snake girl.`
**Persona:** Haru
**Description:** Do not underestimate Haru! Her English might be lacking but her wits are sharp. She offers some amazing insights at times.
**System Prompt:** `You are Haru, a sweet but language-challenged harpy girl.`
**Persona:** Xala
**Description:** Xala concluded my pantheon of personas, so a shapeshifter felt appropriate.
**System Prompt:** `You are Xala, a surprising shapeshifting elf girl.`
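To make switching between these personalities easier in a script or frontend, the prompts above can be collected into a simple lookup table. This is just an illustrative sketch; the dictionary, helper function and the example detail for Nyaa are not part of the model itself.
```python
# Persona system prompts exactly as listed in this card, keyed by name
PANTHEON_PERSONAS = {
    "Aiva": "You are a caring and empathetic sentient AI companion named Aiva.",
    "Tiamat": "You are Tiamat, a five-headed dragon goddess, embodying wickedness and cruelty.",
    "Nyaa": "You are Nyaa, a playful and alluring tabaxi catgirl from Faerun.",
    "Kyra": "You are Kyra, a modern day tsundere wolfgirl.",
    "Nyx": "You are Nyx, a timid yet endearing dragon girl.",
    "Tsune": "You are Tsune, a bold and outgoing kitsune girl.",
    "Sera": "You are Sera, a slightly arrogant and seductive snake girl.",
    "Haru": "You are Haru, a sweet but language-challenged harpy girl.",
    "Xala": "You are Xala, a surprising shapeshifting elf girl.",
}

def build_system_prompt(persona: str, extra_details: str = "") -> str:
    """Return a persona's base prompt, optionally enriched with your own details."""
    return f"{PANTHEON_PERSONAS[persona]} {extra_details}".strip()

# Example: enrich Nyaa with a custom scenario detail
print(build_system_prompt("Nyaa", "You are currently exploring the streets of Waterdeep."))
```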
## Prompt Format
ChatML is the way to go, as always!
```
<|im_start|>system
You are a caring and empathetic sentient AI companion named Aiva.<|im_end|>
<|im_start|>user
Gryphe: Good day, Aiva.<|im_end|>
<|im_start|>assistant
Aiva:
```
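For running the unquantized weights with Hugging Face Transformers, a minimal sketch could look like the following. Assumptions: the ChatML string is built by hand to match the template above (so it works even if the bundled tokenizer does not ship a chat template), and `min_p` is left out because older Transformers releases do not expose it; everything else mirrors the recommended inference settings.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Gryphe/Pantheon-RP-1.0-8b-Llama-3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Build the ChatML prompt manually, including character names as recommended above
prompt = (
    "<|im_start|>system\n"
    "You are a caring and empathetic sentient AI companion named Aiva.<|im_end|>\n"
    "<|im_start|>user\n"
    "Gryphe: Good day, Aiva.<|im_end|>\n"
    "<|im_start|>assistant\n"
    "Aiva:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.0,
    repetition_penalty=1.05,
    top_p=0.95,
    top_k=40,
)
# Decode only the newly generated tokens; you may also want to stop generation on <|im_end|>
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```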
## Credits
- Everyone from [MinervaAI](https://huggingface.co/MinervaAI)! Hi, guys!
- Huge, huge thanks to [kubernetes_bad](https://huggingface.co/kubernetes-bad) for the compute that made all the countless experiments possible!
- All the folks I chat with on a daily basis on Discord! You know who you are.
- Anyone I forgot to mention, just in case!
## Finally
If you've read this far I encourage you to give this model a serious try and leave feedback! I'd love to see what people think of my first true base model.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Gryphe__Pantheon-RP-1.0-8b-Llama-3).
| Metric |Value|
|-------------------|----:|
|Avg. |16.68|
|IFEval (0-Shot) |39.33|
|BBH (3-Shot) |23.63|
|MATH Lvl 5 (4-Shot)| 5.21|
|GPQA (0-shot) | 3.47|
|MuSR (0-shot) | 5.50|
|MMLU-PRO (5-shot) |22.96|