---
library_name: transformers
license: llama3
language:
- ko
- en
pipeline_tag: text-generation
---

# davidkim205/ko-gemma-2-9b-it

davidkim205/ko-gemma-2-9b-it is one of several models being researched to improve the performance of Korean language models (to be released soon).

## Model Details

* **Model Developers**: davidkim (Changyeon Kim)
* **Repository**: -
* **Base model**: google/gemma-2-9b-it
* **SFT dataset**: qa_ability_1851.jsonl

## Benchmark

### kollm_evaluation

https://github.com/davidkim205/kollm_evaluation

| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|-------------------|-------|------|-----:|--------|-----:|---|------|
|kobest |N/A |none | 0|acc |0.5150|± |0.0073|
| | |none | 0|f1 |0.4494|± |N/A |
| - kobest_boolq | 1|none | 0|acc |0.6154|± |0.0130|
| | |none | 0|f1 |0.5595|± |N/A |
| - kobest_copa | 1|none | 0|acc |0.4710|± |0.0158|
| | |none | 0|f1 |0.4700|± |N/A |
| - kobest_hellaswag| 1|none | 0|acc |0.3880|± |0.0218|
| | |none | 0|f1 |0.3832|± |N/A |
| | |none | 0|acc_norm|0.4780|± |0.0224|
| - kobest_sentineg | 1|none | 0|acc |0.5189|± |0.0251|
| | |none | 0|f1 |0.4773|± |N/A |
| - kobest_wic | 1|none | 0|acc |0.4873|± |0.0141|
| | |none | 0|f1 |0.3276|± |N/A |
|ko_truthfulqa | 2|none | 0|acc |0.3390|± |0.0166|
|ko_mmlu | 1|none | 0|acc |0.1469|± |0.0019|
| | |none | 0|acc_norm|0.1469|± |0.0019|
|ko_hellaswag | 1|none | 0|acc |0.2955|± |0.0046|
| | |none | 0|acc_norm|0.3535|± |0.0048|
|ko_common_gen | 1|none | 0|acc |0.5825|± |0.0126|
| | |none | 0|acc_norm|0.5825|± |0.0126|
|ko_arc_easy | 1|none | 0|acc |0.2329|± |0.0124|
| | |none | 0|acc_norm|0.2867|± |0.0132|

### Evaluation with keval

keval is an evaluation model built to compensate for the shortcomings of the existing lm-evaluation-harness: among the various approaches to evaluating models with ChatGPT, it was trained on the prompts and datasets used in benchmarks for evaluating Korean language models.
https://huggingface.co/davidkim205/keval-7b

| model | ned | exe_time | evalscore | count |
|:-----------------------------|------:|-----------:|------------:|--------:|
| claude-3-opus-20240229 | nan | nan | 8.79 | 42 |
| gpt-4-turbo-2024-04-09 | nan | nan | 8.71 | 42 |
| Qwen2-72B-Instruct | nan | 29850.5 | 7.85 | 42 |
| WizardLM-2-8x22B | nan | 133831 | 7.57 | 42 |
| ***ko-gemma-2-9b-it*** | nan | 30789.5 | 7.52 | 42 |
| HyperClovaX | nan | nan | 7.44 | 42 |
| gemma-2-9b-it | nan | 23531.7 | 7.4 | 42 |
| glm-4-9b-chat | nan | 24825.6 | 7.31 | 42 |
| Ko-Llama-3-8B-Instruct | nan | 10697.5 | 6.81 | 42 |
| Qwen2-7B-Instruct | nan | 11856.3 | 6.02 | 42 |
| Not-WizardLM-2-7B | nan | 12955.7 | 5.26 | 42 |
| gemma-1.1-7b-it | nan | 6950.5 | 4.99 | 42 |
| Mistral-7B-Instruct-v0.3 | nan | 19631.4 | 4.89 | 42 |
| Phi-3-small-128k-instruct | nan | 26747.5 | 3.52 | 42 |
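The card does not include a usage snippet, so here is a purely illustrative sketch of the chat format a Gemma-2-based model expects (the `<start_of_turn>`/`<end_of_turn>` markers are the standard Gemma-2 convention; the `build_gemma_prompt` helper below is hypothetical, since in practice `tokenizer.apply_chat_template` from `transformers` builds this string for you):

```python
# Illustrative only: Gemma-2 chat turns are delimited with
# <start_of_turn>/<end_of_turn> markers. With transformers,
# tokenizer.apply_chat_template produces the same format.
def build_gemma_prompt(messages):
    """Format [{"role": ..., "content": ...}] dicts into a Gemma-2 prompt string."""
    prompt = ""
    for m in messages:
        # Gemma-2 labels the assistant side "model".
        role = "model" if m["role"] == "assistant" else "user"
        prompt += f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n"
    # Leave the final turn open for the model's reply.
    return prompt + "<start_of_turn>model\n"

prompt = build_gemma_prompt([{"role": "user", "content": "안녕하세요?"}])
print(prompt)
```

With `transformers`, the equivalent workflow is to load the model with `AutoModelForCausalLM.from_pretrained("davidkim205/ko-gemma-2-9b-it")`, call `tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")`, and pass the result to `model.generate`.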