---
base_model:
  - Qwen/Qwen2.5-1.5B-Instruct
language:
  - ja
  - en
library_name: transformers
---

# Model Card for jaeyong2/Qwen2.5-1.5B-Instruct-JaMagpie-Preview
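
A minimal inference sketch with 🤗 Transformers, assuming the standard Qwen2.5 chat template; the example prompt and generation settings are illustrative and not part of the original evaluation setup:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jaeyong2/Qwen2.5-1.5B-Instruct-JaMagpie-Preview"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Build a chat prompt with the tokenizer's chat template (Qwen2.5 instruct format).
# The question below is only an example.
messages = [
    {"role": "user", "content": "日本の首都はどこですか？"},  # "What is the capital of Japan?"
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generate and decode only the newly produced tokens.
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```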

## Evaluation

### llm-jp-eval script (Colab)

```
# Clone and install the llm-jp-eval harness
!git clone https://github.com/llm-jp/llm-jp-eval.git
!cd llm-jp-eval && pip install -e .

# Preprocess all benchmark datasets
!cd llm-jp-eval && python scripts/preprocess_dataset.py --dataset-name all --output-dir ./dataset_dir

# Evaluate the fine-tuned model on the 1.4.1 test split
!cd llm-jp-eval && python scripts/evaluate_llm.py -cn config.yaml model.pretrained_model_name_or_path=jaeyong2/Qwen2.5-1.5B-Instruct-JaMagpie-Preview tokenizer.pretrained_model_name_or_path=jaeyong2/Qwen2.5-1.5B-Instruct-JaMagpie-Preview dataset_dir=./dataset_dir/1.4.1/evaluation/test
```
Scores by task category on the llm-jp-eval 1.4.1 test set:

| llm-jp-eval | Qwen2.5-1.5B-Instruct | google/gemma-2-2b-jpn-it | Fine-tuned model (this repo) |
|-------------|----------------------:|-------------------------:|-----------------------------:|
| AVG | 0.4343 | 0.4315 | 0.4501 |
| CG  | 0.0600 | 0.0000 | 0.1100 |
| EL  | 0.3952 | 0.3222 | 0.4197 |
| FA  | 0.0690 | 0.0846 | 0.0542 |
| HE  | 0.4400 | 0.4350 | 0.4600 |
| MC  | 0.6800 | 0.6000 | 0.6433 |
| MR  | 0.4700 | 0.4900 | 0.5900 |
| MT  | 0.6137 | 0.7666 | 0.6394 |
| NLI | 0.5500 | 0.5260 | 0.5360 |
| QA  | 0.2443 | 0.2813 | 0.2606 |
| RC  | 0.8208 | 0.8097 | 0.7881 |