Commit 13aa547 by alnrg2arg · verified · 1 Parent(s): deb2400

Update README.md

Files changed (1)
  1. README.md +43 -7
README.md CHANGED
@@ -1,7 +1,7 @@
  ---
  language:
  - en
- license: apache-2.0
  tags:
  - text-generation-inference
  - transformers
@@ -9,14 +9,50 @@ tags:
  - mistral
  - trl
  base_model: alnrg2arg/blockchainlabs_7B_merged_test2_4
  ---
 
- # Uploaded model
 
- - **Developed by:** alnrg2arg
- - **License:** apache-2.0
- - **Finetuned from model :** alnrg2arg/blockchainlabs_7B_merged_test2_4
 
- This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
  ---
  language:
  - en
+ license: cc-by-nc-4.0
  tags:
  - text-generation-inference
  - transformers
 
  - mistral
  - trl
  base_model: alnrg2arg/blockchainlabs_7B_merged_test2_4
+ datasets:
+ - Intel/orca_dpo_pairs
  ---
 
+ This is a model from blockchainlab test 2.4, built on alnrg2arg/blockchainlabs_7B_merged_test2_4.
 
+ The project aims to build a small LLM for on-device use.
 
+ The overall pipeline for this iteration is:
 
+ 1. Merge models to make a base model (7B). 2. Prune the model to reduce the parameter count (50% sparsity). 3. Use DPO for the recovery phase after pruning.
+
+ This model is not pruned; it is intended as a baseline for comparison with the pruned model.
+
+ This is the code and the parameters I chose for this model (DPO).
+ ```
+ import torch
+ from transformers import TrainingArguments, AutoModelForCausalLM
+ from trl import DPOTrainer
+
+ dpo_trainer = DPOTrainer(
+     model = model,  # policy model, loaded beforehand (not shown)
+     ref_model = None,  # None: TRL derives the reference model from `model`
+     args = TrainingArguments(
+         per_device_train_batch_size = 8,
+         gradient_accumulation_steps = 8,  # 8 x 8 = 64 samples per device per optimizer step
+         warmup_ratio = 0.1,
+         num_train_epochs = 3,
+         learning_rate = 5e-6,
+         fp16 = not torch.cuda.is_bf16_supported(),  # fall back to fp16 without bf16 support
+         bf16 = torch.cuda.is_bf16_supported(),
+         logging_steps = 1,
+         optim = "adamw_8bit",
+         weight_decay = 0.0,
+         lr_scheduler_type = "linear",
+         seed = 42,
+         output_dir = "output_DPO",
+     ),
+     beta = 0.1,  # strength of the DPO penalty for drifting from the reference model
+     train_dataset = dataset,  # preference pairs, prepared beforehand (not shown)
+     # eval_dataset = raw_datasets["test"],
+     tokenizer = tokenizer,
+     max_length = 1024,
+     max_prompt_length = 512,
+ )
+ ```
+ The code and parameters are borrowed from https://colab.research.google.com/drive/1SKrKGV-BZoU4kv5q3g0jtE_OhRgPtrrQ?usp=sharing
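
The snippet in the README passes `train_dataset = dataset` without showing how `dataset` was built. As a minimal sketch only, and not the author's actual preprocessing, the function below maps one row shaped like Intel/orca_dpo_pairs (columns `system`, `question`, `chosen`, `rejected`) to the `prompt` / `chosen` / `rejected` fields that TRL's `DPOTrainer` reads. The function name `to_dpo_example` and the plain-text prompt template are assumptions made for illustration; the README does not say which template was used.

```python
def to_dpo_example(row):
    """Map one orca_dpo_pairs-style row to DPO fields.

    Hypothetical helper: the prompt layout below is an assumption,
    not the template used for the model in this README.
    """
    system = (row.get("system") or "").strip()
    question = row["question"].strip()
    # Put the system message first when there is one, then the question.
    prompt = f"{system}\n\n{question}\n" if system else f"{question}\n"
    return {
        "prompt": prompt,
        "chosen": row["chosen"].strip(),      # preferred answer
        "rejected": row["rejected"].strip(),  # dispreferred answer
    }


if __name__ == "__main__":
    sample = {
        "system": "You are helpful.",
        "question": " What is 2+2? ",
        "chosen": "4",
        "rejected": "5 ",
    }
    print(to_dpo_example(sample))
```

With the `datasets` library installed, something like `load_dataset("Intel/orca_dpo_pairs", split="train").map(to_dpo_example)` would probably yield a dataset in the expected shape; a chat-tuned base model would likely want its own chat template in place of the plain-text prompt above.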