alac committed on
Commit
5df80f3
·
1 Parent(s): 998415c

add lora and readme.md

README.md ADDED
@@ -0,0 +1,79 @@
+ ---
+ language:
+ - en
+ tags:
+ - llama-2
+ - instruct
+ - instruction
+ - writing
+ - story
+ pipeline_tag: text-generation
+ license: other
+ ---
+
+ huggingface-cli repo create Waxwing-Storytelling-70B-LoRA --type model, dataset, space
+
+ # Waxwing-Storytelling-70B-LoRA model card
+
+ Waxwing is a storytelling LoRA for Llama 2 70B.
+ - Guide the story with Waxwing's turn-based instruction system.
+ - Tailor the feel of your story using style tags.
+ - Experience storytelling free of ChatGPT's idiosyncrasies, thanks to a "human-generated" dataset of public domain writing. Waxwing avoids GPT-isms like positivity bias, "bond" emphasis, rushed endings and exaggerated stylistic tics.
+
+ Waxwing is available in the following forms:
+ - LoRA: On this branch; it can be applied at runtime to any variant of the Llama 2 70B base model.
+ - 16fp model: Merged into the base Llama 2 model at 16-bit precision (unquantized), in the [16fp](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/16fp) branch.
+ - Quantized for use with ExLlamaV2:
+   - [2.5bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/2.5bpw)
+   - [3.0bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/3.0bpw)
+   - [4.65bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/4.65bpw)
+   - [6.0bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/6.0bpw)
+   - [8.0bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/8.0bpw)
+
+ By using this model, you take full responsibility for anything done with its outputs.
+
+
+ ## Model Details
+
+ ### Model Description
+
+ - **Developed by:** alac
+ - **Model Type:** QLoRA
+ - **Finetuned from model:** Llama-2 70B
+ - **Language(s):** English
+
+
+ ### Dataset
+
+ Waxwing was trained with a small dataset gathered from public domain writing. The exact dataset will remain private, but the code used to generate prompts and metadata is available on [GitHub](https://github.com/alac/txt_to_dataset).
+ Upstage's [SOLAR](https://huggingface.co/upstage/SOLAR-0-70b-16bit) model was used to tag the dataset.
+
+
+ ### Prompt Template
+
+ ```
+ ### System:
+ A chat between a user and a writing assistant.
+ {context}
+
+ ### User:
+ {style tags}
+ Write a scene where: {events that should happen in the next scene}
+
+ ### Assistant:
+ {output}
+ ```
+ `context` is an optional story synopsis.
+ `style tags` should be a string along the lines of:
+ ```
+ Tone: {list of tones}. Writing style: {list of writing styles}.
+ Written with {slow|medium|fast} pacing, in moment to moment detail, in {abstract|selective|vivid sensory} detail, from a {First|Third Person (Character)} perspective.
+ ```
+ The exact values it was trained on are in the `dataset_tags.json` file. Anecdotally, it works better with a subset of the style tags used (`Tone: tense`) or with tags that are complementary (`Tone: tense, mysterious. Writing style: dramatic. Written in abstract detail.`). It's unclear how well Waxwing responds to tags that it was not trained on (e.g. 'genre').
+
+ For SillyTavern users, the `style tags` work well in the "Author's Note" field at depth 1. User messages should begin with `Write a scene where: `; to continue a scene, just type `continue`. Most testing was done using the [Genesis](https://github.com/SillyTavern/SillyTavern/blob/8e73882c9ba7301c9163befbe445686a79d4a9a8/public/TextGen%20Settings/NovelAI%20(Genesis).settings) preset.
+
+
+ ### Training
+
+ Waxwing was trained on a single machine with 72GB of VRAM. The training parameters are available in the `training_parameters.json` file of the main branch. The software used to train was FartyPants' [Training_PRO](https://github.com/FartyPants/Training_PRO) extension for the Oobabooga Text Generation WebUI.
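
The prompt template in the README above can be assembled programmatically. The following is a minimal sketch, not the author's tooling: the function names `build_style_tags` and `build_prompt` are hypothetical, and only a subset of the style-tag fields (tone, writing style, sensory detail) is covered, following the short form shown in the README's own example.

```python
from typing import Sequence


def build_style_tags(
    tones: Sequence[str] = (),
    styles: Sequence[str] = (),
    detail: str = "",
) -> str:
    """Join the optional style fields into a single tag string.

    Hypothetical helper: mirrors the README example
    `Tone: tense, mysterious. Writing style: dramatic. Written in abstract detail.`
    """
    parts = []
    if tones:
        parts.append(f"Tone: {', '.join(tones)}.")
    if styles:
        parts.append(f"Writing style: {', '.join(styles)}.")
    if detail:
        parts.append(f"Written in {detail} detail.")
    return " ".join(parts)


def build_prompt(scene: str, style_tags: str = "", context: str = "") -> str:
    """Fill the README's System/User/Assistant template.

    `context` (story synopsis) and `style_tags` are optional and are
    simply omitted when empty; the prompt ends where the model's
    output is expected to begin.
    """
    system_lines = ["### System:", "A chat between a user and a writing assistant."]
    if context:
        system_lines.append(context)
    user_lines = ["### User:"]
    if style_tags:
        user_lines.append(style_tags)
    user_lines.append(f"Write a scene where: {scene}")
    return "\n".join(system_lines) + "\n\n" + "\n".join(user_lines) + "\n\n### Assistant:\n"


if __name__ == "__main__":
    tags = build_style_tags(["tense", "mysterious"], ["dramatic"], detail="abstract")
    print(build_prompt("a stranger arrives at the inn", tags))
```

Leaving out empty fields rather than emitting placeholder text follows the README's observation that a subset of tags tends to work better.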
adapter_config.json ADDED
@@ -0,0 +1,28 @@
+ {
+   "alpha_pattern": {},
+   "auto_mapping": null,
+   "base_model_name_or_path": "models\\Llama-2-70B-fp16",
+   "bias": "none",
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "lora_alpha": 8,
+   "lora_dropout": 0.05,
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 16,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "q_proj",
+     "down_proj",
+     "o_proj",
+     "v_proj",
+     "gate_proj",
+     "k_proj",
+     "up_proj"
+   ],
+   "task_type": "CAUSAL_LM"
+ }
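
To make the `lora_alpha` and `r` values above concrete: with standard LoRA scaling (the default in PEFT when rank-stabilized scaling is not enabled), the low-rank update is multiplied by `lora_alpha / r`, which is 8 / 16 = 0.5 for this adapter. The sketch below illustrates that arithmetic in plain Python on toy matrices; the helper names are hypothetical and the matrices are not taken from the adapter.

```python
import json

# Only the fields needed for the scaling arithmetic, copied from adapter_config.json.
ADAPTER_CONFIG = json.loads('{"lora_alpha": 8, "r": 16, "lora_dropout": 0.05}')


def lora_scaling(config: dict) -> float:
    """Standard LoRA scaling factor: alpha divided by rank."""
    return config["lora_alpha"] / config["r"]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a_row[k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for a_row in a
    ]


def lora_delta(lora_b, lora_a, scaling: float):
    """Weight update added to a frozen layer: scaling * (B @ A).

    lora_b has shape (out_features, r); lora_a has shape (r, in_features).
    """
    return [[scaling * value for value in row] for row in matmul(lora_b, lora_a)]


if __name__ == "__main__":
    scaling = lora_scaling(ADAPTER_CONFIG)
    # Toy rank-2 example, purely illustrative.
    toy_b = [[1.0, 0.0], [0.0, 2.0]]
    toy_a = [[2.0, 4.0], [6.0, 8.0]]
    print(scaling, lora_delta(toy_b, toy_a, scaling))
```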
adapter_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b9eadd7f9239cb51f29e3896bca94d02c7639b8ac39b35ba613f33d2a13653d4
+ size 828780162
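
As a rough sanity check on the file size above, the number of LoRA parameters can be estimated from the rank and target modules in `adapter_config.json`. The layer dimensions below are assumptions drawn from the published Llama 2 70B architecture (hidden size 8192, MLP size 28672, 80 layers, grouped-query attention with a 1024-wide key/value projection); they do not appear in this commit. Under those assumptions, the estimate lands within about 0.05% of the reported size if the weights are stored as 32-bit floats, which suggests, but does not prove, fp32 storage.

```python
# Assumed Llama 2 70B dimensions (not stated in this commit).
HIDDEN = 8192
INTERMEDIATE = 28672
KV_DIM = 1024  # 8 key/value heads * 128 head dim (grouped-query attention)
NUM_LAYERS = 80

RANK = 16  # "r" in adapter_config.json
REPORTED_BYTES = 828_780_162  # "size" in the git-lfs pointer

# (in_features, out_features) for each target module in adapter_config.json.
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(shapes: dict, rank: int, num_layers: int) -> int:
    """Each adapted module adds A (rank x in) and B (out x rank)."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in shapes.values())
    return per_layer * num_layers


if __name__ == "__main__":
    params = lora_param_count(MODULE_SHAPES, RANK, NUM_LAYERS)
    fp32_bytes = params * 4
    print(f"parameters: {params:,}")
    print(f"estimated fp32 bytes: {fp32_bytes:,}")
    print(f"relative gap to reported size: {(REPORTED_BYTES - fp32_bytes) / REPORTED_BYTES:.4%}")
```

The remaining gap would be consistent with serialization overhead in the checkpoint file, though that is a guess rather than something the commit confirms.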
dataset_tags.json ADDED
@@ -0,0 +1,76 @@
+ {
+   "tone": [
+     "melancholic",
+     "tense",
+     "dramatic",
+     "suspenseful",
+     "mysterious",
+     "humorous",
+     "somber",
+     "philosophical",
+     "bittersweet",
+     "ominous",
+     "whimsical",
+     "determined",
+     "sarcastic",
+     "intense",
+     "nostalgic",
+     "dark",
+     "adventurous",
+     "serious",
+     "emotional",
+     "surreal",
+     "thoughtful",
+     "ironic",
+     "cynical",
+     "desperate",
+     "absurd",
+     "uncertain",
+     "wry",
+     "resigned",
+     "intriguing",
+     "curious",
+     "anxious",
+     "hopeful",
+     "eerie",
+     "romantic",
+     "comedic",
+     "thrilling",
+     "action-packed"
+   ],
+   "writing style": [
+     "descriptive",
+     "detailed",
+     "poetic",
+     "vivid",
+     "imaginative",
+     "introspective",
+     "emotional",
+     "flowery",
+     "philosophical",
+     "conversational",
+     "formal",
+     "immersive",
+     "character-driven",
+     "fast-paced",
+     "evocative",
+     "dramatic",
+     "present tense",
+     "witty",
+     "dialogue-heavy",
+     "lyrical",
+     "narrative",
+     "atmospheric",
+     "analytical"
+   ],
+   "pacing": [
+     "medium",
+     "slow",
+     "fast"
+   ],
+   "sensory detail": [
+     "abstract",
+     "selective",
+     "vivid sensory"
+   ]
+ }
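
Since the README notes that it is unclear how Waxwing responds to tags it was not trained on, a caller may want to flag such tags before building a prompt. The sketch below is one possible check; `find_untrained_tags` is a hypothetical helper, and `TRAINED_TAGS_EXCERPT` is a shortened excerpt of `dataset_tags.json` used only to keep the example small. In practice the full file would be loaded with `json.load`.

```python
from typing import Dict, List

# Shortened excerpt of dataset_tags.json, for illustration only.
TRAINED_TAGS_EXCERPT: Dict[str, List[str]] = {
    "tone": ["melancholic", "tense", "mysterious", "humorous"],
    "writing style": ["descriptive", "poetic", "dramatic"],
    "pacing": ["medium", "slow", "fast"],
    "sensory detail": ["abstract", "selective", "vivid sensory"],
}


def find_untrained_tags(
    requested: Dict[str, List[str]],
    trained: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Return the requested tags that are absent from the trained vocabulary.

    Unknown categories (for example "genre") are reported in full.
    """
    untrained: Dict[str, List[str]] = {}
    for category, values in requested.items():
        known = set(trained.get(category, []))
        missing = [value for value in values if value not in known]
        if missing:
            untrained[category] = missing
    return untrained


if __name__ == "__main__":
    request = {"tone": ["tense", "gritty"], "genre": ["noir"]}
    print(find_untrained_tags(request, TRAINED_TAGS_EXCERPT))
```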
training_parameters.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "lora_name": "Waxwing",
+   "always_override": false,
+   "save_steps": 0.0,
+   "micro_batch_size": 3,
+   "batch_size": 0,
+   "epochs": 1.0,
+   "learning_rate": "3e-4",
+   "lr_scheduler_type": "linear",
+   "lora_rank": 16,
+   "lora_alpha": 8,
+   "lora_dropout": 0.05,
+   "cutoff_len": 1280,
+   "dataset": "dataset_11.27.23",
+   "eval_dataset": "None",
+   "format": "t2d_oobabooga_training_format",
+   "eval_steps": 100.0,
+   "raw_text_file": "None",
+   "higher_rank_limit": false,
+   "warmup_steps": 100.0,
+   "optimizer": "adamw_torch_fused",
+   "hard_cut_string": "\\n\\n\\n",
+   "train_only_after": "",
+   "stop_at_loss": 0,
+   "add_eos_token": false,
+   "min_chars": 0.0,
+   "report_to": "None",
+   "precize_slicing_overlap": true,
+   "add_eos_token_type": "Every Block",
+   "save_steps_under_loss": 1.8,
+   "add_bos_token": false,
+   "training_projection": "all",
+   "sliding_window": false,
+   "warmup_ratio": 0,
+   "grad_accumulation": 4,
+   "neft_noise_alpha": 3
+ }
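
Two of the settings above interact in ways that are easy to misread. Assuming a single device and standard gradient accumulation, `micro_batch_size` 3 with `grad_accumulation` 4 gives 12 sequences per optimizer step. A `linear` scheduler with `warmup_steps` 100 conventionally ramps the learning rate up from zero and then decays it linearly to zero. The sketch below shows that arithmetic; the total step count of 1000 is a made-up value, since the commit does not state the dataset size, and whether the training extension applies the schedule in exactly this form is an assumption.

```python
def effective_batch_size(micro_batch_size: int, grad_accumulation: int) -> int:
    """Sequences contributing to each optimizer step on one device."""
    return micro_batch_size * grad_accumulation


def linear_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(3, 4))
    hypothetical_total_steps = 1000  # assumption: not given in the commit
    for step in (0, 50, 100, 550, 1000):
        print(step, linear_lr(step, 3e-4, 100, hypothetical_total_steps))
```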