winddude committed
Commit 45a0276 · 1 Parent(s): 99994cf

Update README.md

Files changed (1): README.md +3 -1
README.md CHANGED
@@ -21,7 +21,9 @@ This lora was trained on 250k post and response pairs from 43 different financial,
 
 ## Training Details
 
- One noteworthy change I will mention now: this was trained with causal LM rather than seq2seq, as a number of the other instruct models have been. I can't explain why they used seq2seq data collators, other than that's what alpaca lora originally used. LLaMA as a generative model was trained for causal LM, so to me it makes sense to use that when fine-tuning.
+ * Training took ~30 hrs on 5x RTX 3090s and used almost 23 GB of VRAM on each. DDP was used for PyTorch parallelism (a sketch follows this diff).
+
+ * One noteworthy change I will mention now: this was trained with causal LM rather than seq2seq, as a number of the other instruct models have been. I can't explain why they used seq2seq data collators, other than that's what alpaca lora originally used. LLaMA as a generative model was trained for causal LM, so to me it makes sense to use that when fine-tuning (see the collator sketch below).
 
  * More coming soon.
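
For readers unfamiliar with DDP, here is a minimal sketch of a PyTorch DistributedDataParallel training loop of the kind the first bullet refers to. The model, data, and hyperparameters are placeholders, not this repo's actual training script.

```python
# Minimal DDP sketch: one process per GPU, launched via torchrun.
# Model, data, and hyperparameters below are placeholders.
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(1024, 1024).to(f"cuda:{local_rank}")  # placeholder model
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(100):  # placeholder loop; real data loading omitted
        x = torch.randn(8, 1024, device=f"cuda:{local_rank}")
        loss = model(x).pow(2).mean()  # stand-in loss
        loss.backward()  # DDP all-reduces gradients across ranks here
        opt.step()
        opt.zero_grad()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Under these assumptions it would be launched with one process per GPU, e.g. `torchrun --nproc_per_node=5 train.py` for the five 3090s mentioned above.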
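And a sketch of the collator difference the second bullet describes, assuming the Hugging Face transformers stack that alpaca-lora-style trainers build on; the checkpoint name and example text are illustrative, not taken from this repo.

```python
# Causal-LM vs. seq2seq data collation, as discussed above.
from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    DataCollatorForSeq2Seq,
)

tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")  # assumed checkpoint
tokenizer.pad_token = tokenizer.eos_token  # LLaMA ships without a pad token

# Causal-LM collation: the collator copies input_ids into labels (mlm=False);
# the model shifts them by one position internally, so post + response are
# packed into one sequence and every token is predicted left to right.
causal_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

# Seq2seq collation (what alpaca lora originally used): labels are padded
# separately with -100 so padding positions are ignored by the loss.
seq2seq_collator = DataCollatorForSeq2Seq(tokenizer, label_pad_token_id=-100)

features = [tokenizer("Post: ...\n\nResponse: ...")]  # illustrative pair
batch = causal_collator(features)
print(batch["labels"][0][:10])  # labels mirror input_ids under causal LM
```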