Pclanglais committed on
Commit c4ba05e · verified · 1 Parent(s): c215d89

Update README.md

Files changed (1):
  1. README.md +87 -0
README.md CHANGED

---
license: apache-2.0
---

***Jambert*** is an experimental Jamba model fine-tuned for RAG tasks and document synthesis.

Given a question and a list of references, Jambert writes a synthesis of the references in answer to the question.

As an initial test, Jambert is for now trained on a 4,096-token context window, with the expectation of later iterations on significantly longer contexts, thanks to the Mamba architecture.
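
The prompt layout is not spelled out in this commit, but since the training data is rendered as ChatML (see the Axolotl config below), a request might be assembled along the following lines; the way the references are delimited is an assumption, not the documented format.

```python
# Sketch of a ChatML-style request for Jambert: the question first, then the
# references to synthesize. The delimiting of references is an assumption.
question = "Quelles sont les conditions d'attribution de cette aide ?"
references = [
    "Référence 1 : extrait du décret fixant les conditions d'attribution...",
    "Référence 2 : circulaire d'application correspondante...",
]

user_turn = question + "\n\n" + "\n\n".join(references)

prompt = (
    "<|im_start|>user\n"
    + user_turn
    + "<|im_end|>\n"
    + "<|im_start|>assistant\n"
)
print(prompt)
```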

## Training.
Jambert was trained with Axolotl on a set of administrative documents and their associated syntheses in French and English. It could work as well in other languages, as this kind of task has been shown to transfer easily across languages.
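
Axolotl's `type: sharegpt` loader used in the config below expects each record of `rag_dataset.json` to be shaped roughly as follows; the values here are invented placeholders, only the layout is meaningful.

```python
# Layout of one ShareGPT-style record as read by Axolotl's sharegpt loader.
# The actual contents of rag_dataset.json are replaced by placeholders.
record = {
    "conversations": [
        {"from": "human", "value": "Question suivie des références à synthétiser..."},
        {"from": "gpt", "value": "Synthèse rédigée à partir des références..."},
    ]
}
```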

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.4.0`
```yaml

base_model: jamba
trust_remote_code: true

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: rag_dataset.json
    ds_type: json
    type: sharegpt
    conversation: chatml
dataset_prepared_path:
val_set_size: 0.01
output_dir: ./out

sequence_len: 6000
sample_packing: true
pad_to_sequence_len: false
eval_sample_packing: true

use_wandb: false

adapter: qlora
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true

low_cpu_mem_usage: true
gradient_accumulation_steps: 4
micro_batch_size: 1
num_epochs: 2
optimizer: paged_adamw_8bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 2
saves_per_epoch: 2
debug:
weight_decay: 0.0
special_tokens:

```

</details><br>
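
For readers more used to `peft` and `bitsandbytes` than to Axolotl, the QLoRA block above maps roughly onto the following objects; this is a reading aid under that assumption, not code extracted from the actual run.

```python
# Rough peft/bitsandbytes equivalent of the QLoRA settings in the config above
# (r=8, alpha=16, dropout=0.05, all linear layers targeted, 4-bit base model).
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # load_in_4bit: true
    bnb_4bit_compute_dtype=torch.bfloat16,  # bf16: auto
)

lora_config = LoraConfig(
    r=8,                          # lora_r
    lora_alpha=16,                # lora_alpha
    lora_dropout=0.05,            # lora_dropout
    target_modules="all-linear",  # lora_target_linear: true (peft >= 0.8)
    task_type="CAUSAL_LM",
)
```

With Axolotl 0.4.0, a config like the one above is typically launched with `accelerate launch -m axolotl.cli.train config.yaml`.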

## Inference.
The repository provides both a 4-bit version, which should run easily on any 80 GB or even 40 GB GPU, and the original adapter, to be used in combination with the base model.

Inference was tested with the following script:
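
The script itself is not included in this version of the README. As a stand-in, a minimal sketch of 4-bit inference with `transformers` could look like the following; the repository ID is a placeholder and the prompt layout repeats the assumption made earlier.

```python
# Minimal inference sketch, not the original test script.
# "Pclanglais/Jambert-4bit" stands in for the actual repository ID.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Pclanglais/Jambert-4bit"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # Jamba ships custom modeling code
    device_map="auto",       # lets the 4-bit checkpoint fit on a single large GPU
)

# ChatML-style prompt: the question first, then the references to synthesize
# (same formatting assumption as in the earlier sketch).
prompt = (
    "<|im_start|>user\n"
    "Quelles sont les conditions d'attribution de cette aide ?\n\n"
    "Référence 1 : ...\n\n"
    "Référence 2 : ...\n"
    "<|im_end|>\n<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

For the adapter variant, the same prompt applies once the base Jamba model is loaded and the adapter attached with `peft`'s `PeftModel.from_pretrained`.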