zarakiquemparte committed on
Commit 1266055 · 1 Parent(s): bed6b0f

Create README.md

Files changed (1): README.md ADDED (+72 -0)
---
license: other
tags:
- llama-2
---
# Model Card: Pygmalion LRP Grad L2 7B

This model uses [Pygmalion 2 7B](https://huggingface.co/PygmalionAI/pygmalion-2-7b) as a base, merged with the LimaRP LoRA (at 52% weight), originally by [Suikamelon](https://huggingface.co/lemonilia) and adapted to the Metharme prompt format.

This merge of the LoRA into the model was done with this [script](https://github.com/zarakiquemparte/zaraki-tools/blob/main/apply-lora-weight-ltl.py).

- Credits to [Suikamelon](https://huggingface.co/lemonilia) for the LimaRP dataset
- Credits to [Pygmalion AI](https://huggingface.co/PygmalionAI) for the base model

## Weights of the LoRA merge

Per-layer merge weights, one value for each of the base model's 32 transformer layers (first to last):

```
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.5,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
```
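As an illustration of what such a gradual per-layer merge does (this is a toy sketch with assumed shapes and names, not the actual `apply-lora-weight-ltl.py` script), each layer's weight matrix receives the standard LoRA update `(alpha / r) * (B @ A)`, scaled by that layer's entry in the weight list, so early layers stay untouched while later layers get the full update:

```python
import numpy as np

# Per-layer merge weights from the list above (32 entries, one per layer).
layer_weights = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5,
                 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

def merge_layer(W, lora_A, lora_B, alpha, r, layer_weight):
    # Standard LoRA merge, W' = W + (alpha / r) * (B @ A),
    # additionally scaled by this layer's merge weight.
    return W + layer_weight * (alpha / r) * (lora_B @ lora_A)

# Toy tensors standing in for one layer's weight matrix and LoRA factors.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
lora_A = rng.normal(size=(2, 8))  # rank r = 2 in this toy example
lora_B = rng.normal(size=(8, 2))

# Layer 0 has weight 0, so it is left unchanged; layer 31 gets the full update.
merged_first = merge_layer(W, lora_A, lora_B, alpha=16, r=2,
                           layer_weight=layer_weights[0])
merged_last = merge_layer(W, lora_A, lora_B, alpha=16, r=2,
                          layer_weight=layer_weights[-1])
```

With weight 0 the layer is returned bit-for-bit identical to the base model, which is why the first fifteen layers of this merge are pure Pygmalion 2.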

## Prompting

The model has been trained on prompts using three different roles, which are denoted by the following tokens: `<|system|>`, `<|user|>` and `<|model|>`.

The `<|system|>` prompt can be used to inject out-of-channel information behind the scenes, while the `<|user|>` prompt should be used to indicate user input.
The `<|model|>` token should then be used to indicate that the model should generate a response. These tokens can appear multiple times and be chained to
form a conversation history.

### Prompting example

The system prompt has been designed to allow the model to "enter" various modes and dictate the reply length. Here's an example:

```
<|system|>Enter RP mode. Pretend to be {{char}} whose persona follows:
{{persona}}

You shall reply to the user while staying in character, and generate long responses.
```
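Chaining the role tokens into a conversation history can be sketched as follows (a hypothetical helper, not part of the released model or its tooling; the token strings are the ones documented above):

```python
def build_metharme_prompt(system, turns):
    """Build a Metharme-style prompt string.

    system: the system instruction (persona, mode, reply-length hints).
    turns: list of (role, text) pairs with role in {"user", "model"}.
    """
    parts = [f"<|system|>{system}"]
    for role, text in turns:
        parts.append(f"<|{role}|>{text}")
    # End with a bare model token so generation continues as the character.
    parts.append("<|model|>")
    return "".join(parts)

prompt = build_metharme_prompt(
    "Enter RP mode. Pretend to be {{char}} whose persona follows:\n{{persona}}",
    [("user", "Hello!"), ("model", "Hi there."), ("user", "How are you?")],
)
```

The trailing `<|model|>` is what cues the model to produce the next in-character reply.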

## Bias, Risks, and Limitations

The intended use-case for this model is fictional writing for entertainment purposes. Any other sort of usage is out of scope.

As such, it was **not** fine-tuned to be safe and harmless: the base model _and_ this fine-tune have been trained on data known to contain profanity and texts that
are lewd or otherwise offensive. It may produce socially unacceptable or undesirable text, even if the prompt itself does not include anything explicitly offensive.
Outputs may often be factually wrong or misleading.

## Training Details

This model uses LimaRP by [Suikamelon](https://huggingface.co/lemonilia), converted to the Metharme prompt format.
This model is merged and can be reproduced using the tools mentioned above. Please refer to all provided links for extra model-specific details.

## Training Hyperparameters

```
load_in_8bit: true
adapter: lora
lora_r: 8
lora_alpha: 16
lora_dropout: 0.01
gradient_accumulation_steps: 1
micro_batch_size: 1
num_epochs: 3
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 0.000065
bf16: true
tf32: true
```
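Two derived quantities are worth reading off this config (values copied from the block above; the interpretation of `lora_alpha`/`lora_r` as the standard LoRA scaling factor is an assumption about the training framework):

```python
# Values copied from the hyperparameter block above.
cfg = {
    "lora_r": 8,
    "lora_alpha": 16,
    "gradient_accumulation_steps": 1,
    "micro_batch_size": 1,
}

# Scaling factor applied to the low-rank update B @ A during training.
lora_scale = cfg["lora_alpha"] / cfg["lora_r"]

# Effective per-GPU batch size seen by each optimizer step.
effective_batch = cfg["micro_batch_size"] * cfg["gradient_accumulation_steps"]
```

So the LoRA update is scaled by 2x, and with no gradient accumulation each optimizer step sees a single sample.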

## Environmental Impact

Finetuning the LimaRP LoRA on 1x NVIDIA L40 takes about 1h45m.