---
library_name: transformers
license: llama3
language:
- ko
- en
pipeline_tag: text-generation
---

# davidkim205/ko-gemma-2-9b-it

davidkim205/ko-gemma-2-9b-it is one of several models being researched to improve the performance of Korean language models (to be released soon).

## Model Details

* **Model Developers**: davidkim (Changyeon Kim)
* **Repository**: -
* **Base model**: google/gemma-2-9b-it
* **SFT dataset**: qa_ability_1851.jsonl

## Benchmark

### kollm_evaluation

https://github.com/davidkim205/kollm_evaluation

| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|-------------------|-------|------|-----:|--------|-----:|---|------|
|kobest |N/A |none | 0|acc |0.5150|± |0.0073|
| | |none | 0|f1 |0.4494|± |N/A |
| - kobest_boolq | 1|none | 0|acc |0.6154|± |0.0130|
| | |none | 0|f1 |0.5595|± |N/A |
| - kobest_copa | 1|none | 0|acc |0.4710|± |0.0158|
| | |none | 0|f1 |0.4700|± |N/A |
| - kobest_hellaswag| 1|none | 0|acc |0.3880|± |0.0218|
| | |none | 0|f1 |0.3832|± |N/A |
| | |none | 0|acc_norm|0.4780|± |0.0224|
| - kobest_sentineg | 1|none | 0|acc |0.5189|± |0.0251|
| | |none | 0|f1 |0.4773|± |N/A |
| - kobest_wic | 1|none | 0|acc |0.4873|± |0.0141|
| | |none | 0|f1 |0.3276|± |N/A |
|ko_truthfulqa | 2|none | 0|acc |0.3390|± |0.0166|
|ko_mmlu | 1|none | 0|acc |0.1469|± |0.0019|
| | |none | 0|acc_norm|0.1469|± |0.0019|
|ko_hellaswag | 1|none | 0|acc |0.2955|± |0.0046|
| | |none | 0|acc_norm|0.3535|± |0.0048|
|ko_common_gen | 1|none | 0|acc |0.5825|± |0.0126|
| | |none | 0|acc_norm|0.5825|± |0.0126|
|ko_arc_easy | 1|none | 0|acc |0.2329|± |0.0124|
| | |none | 0|acc_norm|0.2867|± |0.0132|

### Evaluation with keval

keval is an evaluation model built to compensate for the shortcomings of the existing lm-evaluation-harness: among the various approaches to evaluating models with ChatGPT, it was trained on the prompts and datasets used in benchmarks for evaluating Korean language models.
https://huggingface.co/davidkim205/keval-7b

| model | ned | exe_time | evalscore | count |
|:-----------------------------|------:|-----------:|------------:|--------:|
| claude-3-opus-20240229 | nan | nan | 8.79 | 42 |
| gpt-4-turbo-2024-04-09 | nan | nan | 8.71 | 42 |
| Qwen2-72B-Instruct | nan | 29850.5 | 7.85 | 42 |
| WizardLM-2-8x22B | nan | 133831 | 7.57 | 42 |
| ***ko-gemma-2-9b-it*** | nan | 30789.5 | 7.52 | 42 |
| HyperClovaX | nan | nan | 7.44 | 42 |
| gemma-2-9b-it | nan | 23531.7 | 7.4 | 42 |
| glm-4-9b-chat | nan | 24825.6 | 7.31 | 42 |
| Ko-Llama-3-8B-Instruct | nan | 10697.5 | 6.81 | 42 |
| Qwen2-7B-Instruct | nan | 11856.3 | 6.02 | 42 |
| Not-WizardLM-2-7B | nan | 12955.7 | 5.26 | 42 |
| gemma-1.1-7b-it | nan | 6950.5 | 4.99 | 42 |
| Mistral-7B-Instruct-v0.3 | nan | 19631.4 | 4.89 | 42 |
| Phi-3-small-128k-instruct | nan | 26747.5 | 3.52 | 42 |
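The card does not include a usage snippet, so here is a purely illustrative sketch of the chat format a Gemma-2-based model expects (the `<start_of_turn>`/`<end_of_turn>` markers are the standard Gemma-2 convention; the `build_gemma_prompt` helper below is hypothetical, since in practice `tokenizer.apply_chat_template` from `transformers` builds this string for you):

```python
# Illustrative only: Gemma-2 chat turns are delimited with
# <start_of_turn>/<end_of_turn> markers. With transformers,
# tokenizer.apply_chat_template produces the same format.
def build_gemma_prompt(messages):
    """Format [{"role": ..., "content": ...}] dicts into a Gemma-2 prompt string."""
    prompt = ""
    for m in messages:
        # Gemma-2 labels the assistant side "model".
        role = "model" if m["role"] == "assistant" else "user"
        prompt += f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n"
    # Leave the final turn open for the model's reply.
    return prompt + "<start_of_turn>model\n"

prompt = build_gemma_prompt([{"role": "user", "content": "안녕하세요?"}])
print(prompt)
```

With `transformers`, the equivalent workflow is to load the model with `AutoModelForCausalLM.from_pretrained("davidkim205/ko-gemma-2-9b-it")`, call `tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")`, and pass the result to `model.generate`.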