---
library_name: transformers
tags:
- PEFT
- mistral
- sft
- TensorBoard
- Safetensors
- trl
- generated_from_trainer
- 4-bit precision
license: mit
datasets:
- yahma/alpaca-cleaned
language:
- en
pipeline_tag: question-answering
---
# Model Card for Model ID
This model is fine-tuned for document question answering. It was trained on the [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) dataset.
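A minimal inference sketch, assuming this repository hosts PEFT (LoRA) adapters for a Mistral-7B base; the base and adapter repo IDs below are placeholders to swap for the actual ones:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"   # placeholder: the actual base model
adapter_id = "user/this-adapter-repo"   # placeholder: this adapter repository

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16,
    device_map="auto",
)
# Attach the fine-tuned PEFT adapters on top of the base model.
model = PeftModel.from_pretrained(base, adapter_id)

# Alpaca-style prompt, matching the yahma/alpaca-cleaned training format.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWhat is the capital of France?\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```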
## Model Details
### Training hyperparameters
The following hyperparameters were used during training (a full training sketch follows the list):
- gradient_accumulation_steps: 1
- warmup_steps: 5
- max_steps: 20
- learning_rate: 2e-4
- fp16: not torch.cuda.is_bf16_supported()
- bf16: torch.cuda.is_bf16_supported()
- logging_steps: 1
- optim: adamw_8bit
- weight_decay: 0.01
- lr_scheduler_type: linear
- seed: 3407
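For reference, a minimal sketch of how these values map onto a `trl` `SFTTrainer` run. The `model` and `tokenizer` objects (a 4-bit Mistral base prepared with PEFT/LoRA) are assumed to exist already; the output directory, batch size, and sequence length are assumptions not stated in this card:

```python
import torch
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

dataset = load_dataset("yahma/alpaca-cleaned", split="train")

def to_text(example):
    # Fold the alpaca-cleaned fields (instruction/input/output) into one prompt string.
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
    )
    if example["input"]:
        prompt += f"### Input:\n{example['input']}\n\n"
    prompt += f"### Response:\n{example['output']}"
    return {"text": prompt}

dataset = dataset.map(to_text)

args = TrainingArguments(
    output_dir="outputs",            # assumption: not stated in the card
    per_device_train_batch_size=2,   # assumption: not stated in the card
    gradient_accumulation_steps=1,
    warmup_steps=5,
    max_steps=20,
    learning_rate=2e-4,
    fp16=not torch.cuda.is_bf16_supported(),
    bf16=torch.cuda.is_bf16_supported(),
    logging_steps=1,
    optim="adamw_8bit",
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=3407,
)

trainer = SFTTrainer(
    model=model,               # assumed: 4-bit Mistral base with PEFT/LoRA attached
    tokenizer=tokenizer,       # assumed: matching tokenizer
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,       # assumption: not stated in the card
    args=args,
)
trainer.train()
```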
### Framework versions

- PEFT 0.7.1
- Transformers 4.36.0
- Pytorch 2.0.0
- Datasets 2.16.1
- Tokenizers 0.15.0