---
library_name: transformers
base_model:
- AI-Sweden-Models/gpt-sw3-6.7b-v2
datasets:
- skvarre/sv-instruct-v1
---

## Model Details

A fine-tune of [gpt-sw3-6.7b-v2](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b-v2), trained with **LoRA** on the 4-bit-quantized base model. After training, the adapters were merged back into the base model, so the published weights are full `bfloat16` tensors.
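
For reference, a minimal sketch of this setup with `transformers`, `peft`, and `bitsandbytes`; the LoRA hyperparameters and file paths are illustrative assumptions, not the values actually used:

```python
import torch
from peft import LoraConfig, PeftModel, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the base model in 4-bit for training (QLoRA-style setup).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "AI-Sweden-Models/gpt-sw3-6.7b-v2",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# Attach LoRA adapters; r and lora_alpha are placeholder hyperparameters.
model = get_peft_model(base, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))
# ... instruction-tune on skvarre/sv-instruct-v1, then save the adapters ...

# To merge: reload the base model in bfloat16 and fold the adapters in.
base_bf16 = AutoModelForCausalLM.from_pretrained(
    "AI-Sweden-Models/gpt-sw3-6.7b-v2", torch_dtype=torch.bfloat16
)
merged = PeftModel.from_pretrained(base_bf16, "path/to/lora-adapters").merge_and_unload()
merged.save_pretrained("merged-model")
```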

## Usage

This is a fine-tuning experiment; proper usage instructions will be provided later.
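
In the meantime, a minimal sketch of loading the merged model with `transformers` (the repo id is a placeholder for this model's Hub path, and the prompt format used during instruction tuning is not documented here):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/this-model"  # placeholder: replace with this repo's Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merged weights are stored in bfloat16
    device_map="auto",
)

prompt = "Skriv en kort dikt om hösten."  # "Write a short poem about autumn."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```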

## Evaluation

Results on the [ScandEval](https://scandeval.com/swedish-nlg/) Swedish NLG benchmarks (each dataset is scored with ScandEval's two task-specific metrics):

| Dataset      | Performance (metric 1 / metric 2)    |
|:------------:|:------------------------------------:|
| swerec       | 74.95 ± 1.17 / 61.38 ± 1.37          |
| suc3         | 30.75 ± 4.11 / 25.69 ± 4.83          |
| scala-sv     | 8.96 ± 2.09 / 51.50 ± 2.94           |
| scandiqa-sv  | 50.71 ± 0.99 / 56.76 ± 0.89          |
| swedn        | 64.37 ± 0.72 / 18.25 ± 0.29          |
| mmlu-sv      | 5.45 ± 0.91 / 28.14 ± 0.82           |
| hellaswag-sv | 27.95 ± 0.73 / 4.19 ± 0.94           |
| speed        | 5322.20 ± 1132.75 / 1280.06 ± 408.08 |
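
The scores should be reproducible with the `scandeval` package. The sketch below assumes its Python API; argument names differ between ScandEval versions, so treat it as approximate and check the documentation for the installed version:

```python
from scandeval import Benchmarker

# Assumed API: a Benchmarker restricted to Swedish datasets, callable
# with a Hub model id (placeholder below).
benchmarker = Benchmarker(language="sv")
benchmarker("path/to/this-model")
```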