---
library_name: transformers
tags:
- PEFT
- mistral
- sft
- TensorBoard
- Safetensors
- trl
- generated_from_trainer
- 4-bit precision
license: mit
datasets:
- yahma/alpaca-cleaned
language:
- en
pipeline_tag: question-answering
---
# Model Card for Zephyr-7b-QnA
This model is fine-tuned for document question answering. It was trained on the [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) dataset, starting from the [TheBloke/zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ) base model.
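
Below is a minimal usage sketch, not a verified recipe from this card's training script. The repository id `Feluda/Zephyr-7b-QnA` is inferred from the card title and may differ; the Alpaca-style prompt mirrors the format of the training dataset, and the GPTQ base model additionally requires the `auto-gptq`/`optimum` stack to be installed.

```python
# Hedged usage sketch: load the PEFT adapter and run a question-answering prompt.
# The repo id is an assumption inferred from the card title; the prompt follows
# the Alpaca format used by yahma/alpaca-cleaned.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model_id = "Feluda/Zephyr-7b-QnA"  # assumption: actual repo id may differ

model = AutoPeftModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = (
    "### Instruction:\n"
    "Answer the question using the document provided.\n\n"
    "### Input:\n"
    "<document text>\n\nQuestion: <your question>\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```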
## Model Details
### Training hyperparameters
The following hyperparameters were used during training:
- gradient_accumulation_steps: 1
- warmup_steps: 5
- max_steps: 20
- learning_rate: 2e-4
- fp16: not torch.cuda.is_bf16_supported()
- bf16: torch.cuda.is_bf16_supported()
- logging_steps: 1
- optim: adamw_8bit
- weight_decay: 0.01
- lr_scheduler_type: linear
- seed: 3407
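
For reference, here is a minimal sketch of how these hyperparameters map onto a `transformers.TrainingArguments` object. The output directory and per-device batch size are assumptions, since neither is stated in this card.

```python
# Minimal sketch: the card's hyperparameters expressed as TrainingArguments.
import torch
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",                      # assumption: not stated in this card
    per_device_train_batch_size=2,             # assumption: not stated in this card
    gradient_accumulation_steps=1,
    warmup_steps=5,
    max_steps=20,
    learning_rate=2e-4,
    fp16=not torch.cuda.is_bf16_supported(),   # use fp16 only when bf16 is unavailable
    bf16=torch.cuda.is_bf16_supported(),       # prefer bf16 on supported GPUs
    logging_steps=1,
    optim="adamw_8bit",                        # 8-bit AdamW; requires bitsandbytes
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=3407,
)
```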
### Framework versions
- PEFT 0.7.1
- Transformers 4.36.0
- Pytorch 2.0.0
- Datasets 2.16.1
- Tokenizers 0.15.0