---
library_name: peft
datasets:
- qwedsacf/grade-school-math-instructions
language:
- en
metrics:
- perplexity
---
## Training procedure


The following `bitsandbytes` quantization config was used during training:
- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32

### Model Description

This is a PEFT adapter for `facebook/opt-2.7b`, fine-tuned on grade-school math instructions. For more information on how it was created, see the training notebook: https://github.com/DunnBC22/NLP_Projects/blob/main/OPT%20Models/Grade%20School%20Math%20Instructions%20Fine-Tune%20OPT.ipynb
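
A minimal sketch of loading the adapter with PEFT for inference; the adapter path below is a placeholder for wherever this adapter is hosted, and the prompt is purely illustrative.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("facebook/opt-2.7b")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-2.7b")

# "path/to/this-adapter" is a placeholder; substitute the actual adapter location.
model = PeftModel.from_pretrained(base, "path/to/this-adapter")

prompt = "If Sarah has 12 apples and gives 3 apples to each of her 2 friends, how many are left?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```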

### Intended uses & limitations

This adapter is intended as a demonstration of what parameter-efficient fine-tuning can achieve; its usefulness is mainly limited by the scope of its training data (grade-school math instructions).

### Training & Evaluation Dataset

Dataset Source: https://huggingface.co/datasets/qwedsacf/grade-school-math-instructions
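
A minimal sketch of loading the dataset with the `datasets` library; the held-out fraction below is an illustrative choice, not a value taken from the notebook.

```python
from datasets import load_dataset

ds = load_dataset("qwedsacf/grade-school-math-instructions")

# Carve out a small evaluation set; the 10% fraction is an assumption.
splits = ds["train"].train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
print(train_ds[0])
```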

### Hyperparameters Used

| Hyperparameter | Value |
|:-----:|:-----:|
| Model Checkpoint | facebook/opt-2.7b |
| per_device_train_batch_size | 4 |
| gradient_accumulation_steps | 4 |
| fp16 | True |
| warmup_steps | 225 |
| learning_rate | 2e-4 |
| Training Steps | 450 |
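
A sketch of how the values above map onto `transformers.TrainingArguments`; `output_dir` and `logging_steps` are assumptions, not values from the notebook.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="opt-2.7b-gsm-instructions",  # placeholder name
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    fp16=True,
    warmup_steps=225,
    learning_rate=2e-4,
    max_steps=450,     # "Training Steps" in the table above
    logging_steps=50,  # assumption
)
```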


### Framework versions
    
| Library | Version |
|:-----:|:-----:|
| Python | 3.10.1 |
| Torch | 2.0.1+cu118 |
| Datasets | 2.14.4 |
| Transformers | 4.31.0 |
| PEFT | 0.4.0 |


### Metric

Perplexity = 6.35
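
Perplexity is the exponential of the mean cross-entropy loss on the evaluation set. A minimal sketch of the arithmetic; the loss value below is the one implied by the reported perplexity, not a logged number.

```python
import math

# With the Trainer API, the loss would come from trainer.evaluate()["eval_loss"].
eval_loss = 1.848            # implied by perplexity = 6.35 (ln 6.35 ≈ 1.848)
perplexity = math.exp(eval_loss)
print(f"{perplexity:.2f}")   # -> 6.35
```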