---
library_name: peft
datasets:
- qwedsacf/grade-school-math-instructions
language:
- en
metrics:
- perplexity
---

## Training procedure

The following `bitsandbytes` quantization config was used during training:

- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32

### Model Description

For more information on how this model was created, see the following notebook: https://github.com/DunnBC22/NLP_Projects/blob/main/OPT%20Models/Grade%20School%20Math%20Instructions%20Fine-Tune%20OPT.ipynb

### Intended uses & limitations

This model is intended as a demonstration of what is possible with this approach. Its performance is mainly limited by the input data.

### Training & Evaluation Dataset

Dataset Source: https://huggingface.co/datasets/qwedsacf/grade-school-math-instructions

### Hyperparameters Used

| Hyperparameter | Value |
|:-----:|:-----:|
| Model Checkpoint | facebook/opt-2.7b |
| per_device_train_batch_size | 4 |
| gradient_accumulation_steps | 4 |
| fp16 | True |
| warmup_steps | 225 |
| learning_rate | 2e-4 |
| Training Steps | 450 |

### Framework versions

| Library | Version |
|:-----:|:-----:|
| Python | 3.10.1 |
| Torch | 2.0.1+cu118 |
| Datasets | 2.14.4 |
| Transformers | 4.31.0 |
| PEFT | 0.4.0 |

### Metric

Perplexity = 6.35
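
The reported perplexity and the batch settings in the hyperparameter table can be related to more familiar quantities with a little arithmetic. The sketch below is illustrative only and rests on assumptions this card does not state: that perplexity was computed as the exponential of the mean per-token cross-entropy loss in nats, that training ran on a single device, and that "Training Steps" counts optimizer steps. The helper names are hypothetical and are not taken from the linked notebook.

```python
import math


def perplexity_from_loss(mean_nll: float) -> float:
    """Perplexity as exp of the mean per-token negative log-likelihood (nats)."""
    return math.exp(mean_nll)


def loss_from_perplexity(ppl: float) -> float:
    """Inverse mapping: mean per-token loss (nats) implied by a perplexity."""
    return math.log(ppl)


def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of sequences contributing to one optimizer step."""
    return per_device * accumulation_steps * num_devices


# Reported metric from this card.
reported_ppl = 6.35
implied_loss = loss_from_perplexity(reported_ppl)
print(f"implied mean loss: {implied_loss:.3f} nats/token")

# Values from the hyperparameter table; a single device is an assumption.
batch = effective_batch_size(per_device=4, accumulation_steps=4)
print(f"effective batch size: {batch}")
print(f"sequences seen in 450 steps: {batch * 450}")
```

Under these assumptions, a perplexity of 6.35 corresponds to a mean loss of roughly 1.85 nats per token, and each optimizer step aggregates 16 sequences.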