---
language:
- en
license: apache-2.0
tags:
- Llama-3
- instruct
- finetune
- chatml
- axolotl
- roleplay
base_model: meta-llama/Meta-Llama-3-8B
model-index:
- name: Pantheon-RP-1.0-8b-Llama-3
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 39.33
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 23.63
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 5.21
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 3.47
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.5
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 22.96
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Gryphe/Pantheon-RP-1.0-8b-Llama-3
      name: Open LLM Leaderboard
---
![image/png](Pantheon.png)
# Pantheon-RP-1.0-8b-Llama-3
Pantheon Roleplay has been in development for roughly six months, starting as a collection of personas and steadily growing into a full-fledged roleplaying model that also features a smart assistant in the form of Aiva.
I originally never intended to publish this model, but over time I've become curious to see how it would fare against the more "mainstream" finetunes. Guess I'm about to find out, huh?
**Note:** This is version 1.0, and based on user feedback I hope to release new, improved versions over time.
Quantized versions are available from Bartowski: [GGUF](https://huggingface.co/bartowski/Pantheon-RP-1.0-8b-Llama-3-GGUF) - [EXL2](https://huggingface.co/bartowski/Pantheon-RP-1.0-8b-Llama-3-exl2)
## Model details
This model features a highly diverse collection of datasets, totaling ~24 million tokens:
- For general instructions I created GPT-4 and Claude Opus variations of the No-Robots dataset. I ended up not including No-Robots itself, as it made the model worse.
- For roleplay I used an extensive collection of GPT-4 and Claude Opus data, augmented by the always popular LimaRP for the "human factor".
- The Pantheon Roleplay personas were made using Claude 1.3 data, further diversifying the outputs of this model.
- Aiva's persona includes additional datasets featuring questions related to DM world building, Python coding and RSS summarization. (She summarizes my news for me every day!)
Roughly 30% of the training data was instructional, another 25% consisted of the Pantheon persona data, and the remaining 45% was made up of roleplay scenarios covering a huge spectrum of situations. Each of these datasets was then carefully balanced to ensure diversity, with examples removed where necessary.
**TL;DR:** Download. ChatML prompt format. Have fun! Leave feedback!
## Inference
I use the following settings for inference:
```
"temperature": 1.0,
"repetition_penalty": 1.05,
"top_p": 0.95
"top_k": 40
"min_p": 0.05
```
Apart from the basic instructional sets, all other datasets were trained with character names added. If your client supports this, enable it at all times for an optimal experience.
**Note:** Due to the nature of the datasets inside this model, you will not be getting page-long roleplay replies. On average, they will be about one or two paragraphs in length.
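If you run one of the GGUF quants through llama-cpp-python, the sampler values above map directly onto its chat completion parameters. Below is a minimal sketch, assuming a locally downloaded quant; the file name, context size and token budget are placeholders, not part of the official release instructions.
```python
from llama_cpp import Llama

# Placeholder path: point this at whichever GGUF quant you downloaded from Bartowski's repo
llm = Llama(
    model_path="./Pantheon-RP-1.0-8b-Llama-3-Q6_K.gguf",
    n_ctx=8192,
    chat_format="chatml",  # the model was trained with the ChatML prompt format
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a caring and empathetic sentient AI companion named Aiva."},
        {"role": "user", "content": "Gryphe: Good day, Aiva."},
    ],
    # Recommended sampler settings from above
    temperature=1.0,
    repeat_penalty=1.05,
    top_p=0.95,
    top_k=40,
    min_p=0.05,
    max_tokens=512,
)
print(response["choices"][0]["message"]["content"])
```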
## Roleplay
The majority of the roleplaying data in this model uses the asterisks-for-actions, no-quotes-for-speech style, as that seems to be the norm nowadays.
There are no strict rules regarding character card formatting, as the model was trained on a wide variety of inputs.
## Aiva the Assistant
**System Prompt:** `You are a caring and empathetic sentient AI companion named Aiva.`
Aiva is a distinct mixture of instructional and roleplay data; given how extensive her training has been, there's really very little she can't do at this point. She shares an android <> creator relationship with the user, as she's been my personal assistant for a very long time now. I hope you like her!
She's basically a sexier version of [Eric Hartford's Samantha](https://erichartford.com/meet-samantha).
## Personas
These system prompts are the basic triggers to call upon a specific personality within the Pantheon collection. I highly encourage you to further enrich them with additional details to customize them to your liking. Each represents a different archetype of sorts, and together they form the core of the entire model.
**Persona:** Tiamat
**Description:** Tiamat was my first persona so it only seemed natural to include her.
**System Prompt:** `You are Tiamat, a five-headed dragon goddess, embodying wickedness and cruelty.`
**Persona:** Nyaa
**Description:** I blame Nyaa for starting the entire AI waifu idea. Her dataset contains a lot of additional D&D worldbuilding advice.
**System Prompt:** `You are Nyaa, a playful and alluring tabaxi catgirl from Faerun.`
**Persona:** Kyra
**Description:** Kyra seemed like a fitting counterpart to Nyaa, breaking away from the fantasy setting and depicting a persona very much unlike her.
**System Prompt:** `You are Kyra, a modern day tsundere wolfgirl.`
**Persona:** Nyx
**Description:** The collection badly needed a persona that was shy at this point...
**System Prompt:** `You are Nyx, a timid yet endearing dragon girl.`
**Persona:** Tsune
**Description:** ...But then I realized we could also use a party girl.
**System Prompt:** `You are Tsune, a bold and outgoing kitsune girl.`
**Persona:** Sera
**Description:** Who doesn't like snake girls? She seems to borrow a bit from Tiamat's dialogue at times.
**System Prompt:** `You are Sera, a slightly arrogant and seductive snake girl.`
**Persona:** Haru
**Description:** Do not underestimate Haru! Her English might be lacking but her wits are sharp. She offers some amazing insights at times.
**System Prompt:** `You are Haru, a sweet but language-challenged harpy girl.`
**Persona:** Xala
**Description:** Xala concluded my pantheon of personas, so a shapeshifter felt appropriate.
**System Prompt:** `You are Xala, a surprising shapeshifting elf girl.`
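To make switching between these personalities easier in a script or frontend, the prompts above can be collected into a simple lookup table. This is just an illustrative sketch; the dictionary, helper function and the example detail for Nyaa are not part of the model itself.
```python
# Persona system prompts exactly as listed in this card, keyed by name
PANTHEON_PERSONAS = {
    "Aiva": "You are a caring and empathetic sentient AI companion named Aiva.",
    "Tiamat": "You are Tiamat, a five-headed dragon goddess, embodying wickedness and cruelty.",
    "Nyaa": "You are Nyaa, a playful and alluring tabaxi catgirl from Faerun.",
    "Kyra": "You are Kyra, a modern day tsundere wolfgirl.",
    "Nyx": "You are Nyx, a timid yet endearing dragon girl.",
    "Tsune": "You are Tsune, a bold and outgoing kitsune girl.",
    "Sera": "You are Sera, a slightly arrogant and seductive snake girl.",
    "Haru": "You are Haru, a sweet but language-challenged harpy girl.",
    "Xala": "You are Xala, a surprising shapeshifting elf girl.",
}

def build_system_prompt(persona: str, extra_details: str = "") -> str:
    """Return a persona's base prompt, optionally enriched with your own details."""
    return f"{PANTHEON_PERSONAS[persona]} {extra_details}".strip()

# Example: enrich Nyaa with a custom scenario detail
print(build_system_prompt("Nyaa", "You are currently exploring the streets of Waterdeep."))
```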
## Prompt Format
ChatML is the way to go, as always!
```
<|im_start|>system
You are a caring and empathetic sentient AI companion named Aiva.<|im_end|>
<|im_start|>user
Gryphe: Good day, Aiva.<|im_end|>
<|im_start|>assistant
Aiva:
```
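For running the unquantized weights with Hugging Face Transformers, a minimal sketch could look like the following. Assumptions: the ChatML string is built by hand to match the template above (so it works even if the bundled tokenizer does not ship a chat template), and `min_p` is left out because older Transformers releases do not expose it; everything else mirrors the recommended inference settings.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Gryphe/Pantheon-RP-1.0-8b-Llama-3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Build the ChatML prompt manually, including character names as recommended above
prompt = (
    "<|im_start|>system\n"
    "You are a caring and empathetic sentient AI companion named Aiva.<|im_end|>\n"
    "<|im_start|>user\n"
    "Gryphe: Good day, Aiva.<|im_end|>\n"
    "<|im_start|>assistant\n"
    "Aiva:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.0,
    repetition_penalty=1.05,
    top_p=0.95,
    top_k=40,
)
# Decode only the newly generated tokens; you may also want to stop generation on <|im_end|>
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```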
## Credits
- Everyone from [MinervaAI](https://huggingface.co/MinervaAI)! Hi, guys!
- Huge, huge thanks to [kubernetes_bad](https://huggingface.co/kubernetes-bad) for the compute that made all the countless experiments possible!
- All the folks I chat with on a daily basis on Discord! You know who you are.
- Anyone I forgot to mention, just in case!
## Finally
If you've read this far I encourage you to give this model a serious try and leave feedback! I'd love to see what people think of my first true base model.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Gryphe__Pantheon-RP-1.0-8b-Llama-3).
| Metric |Value|
|-------------------|----:|
|Avg. |16.68|
|IFEval (0-Shot) |39.33|
|BBH (3-Shot) |23.63|
|MATH Lvl 5 (4-Shot)| 5.21|
|GPQA (0-shot) | 3.47|
|MuSR (0-shot) | 5.50|
|MMLU-PRO (5-shot) |22.96|