language:
- en
- ko
pipeline_tag: text-generation
komt: Korean multi-task instruction-tuning model
Since the success of ChatGPT, numerous large language models (LLMs) have emerged in an attempt to match its capabilities. In Korean, however, many of these models still struggle to give accurate answers or to generate text effectively. This study addresses these challenges by introducing a multi-task instruction technique that uses supervised datasets from various tasks to create training data for LLMs.
Model Details
- Model Developers: davidkim (Changyeon Kim)
- Repository: https://github.com/davidkim205/komt
- Quant methods: q4_0, q4_1, q5_0, q5_1, q2_k, q3_k, q3_k_m, q3_k_l, q4_k, q4_k_s, q4_k_m, q5_k, q5_k_s, q5_k_m, q8_0
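As a minimal sketch, the quantization methods listed above can be mapped to GGUF file paths. The `ggml-model-<method>.gguf` pattern and the default directory are assumptions extrapolated from the single `ggml-model-q4_1.gguf` path in the usage command; the actual file names for the other methods should be checked against the repository.

```python
# Quantization methods listed in the model card.
QUANT_METHODS = [
    "q4_0", "q4_1", "q5_0", "q5_1", "q2_k", "q3_k", "q3_k_m", "q3_k_l",
    "q4_k", "q4_k_s", "q4_k_m", "q5_k", "q5_k_s", "q5_k_m", "q8_0",
]


def gguf_path(method: str, model_dir: str = "./models/komt-mistral-7b-v1") -> str:
    """Return the assumed GGUF path for a quantization method.

    The naming pattern is an assumption based on the one path shown in the
    usage command; it is not confirmed for the other methods.
    """
    if method not in QUANT_METHODS:
        raise ValueError(f"unknown quant method: {method}")
    return f"{model_dir}/ggml-model-{method}.gguf"


print(gguf_path("q4_1"))
```

Rejecting unknown method names early avoids passing a path to llama.cpp that cannot exist.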
Training
Refer to https://github.com/davidkim205/komt for training details.
Usage
cd llama.cpp
make -j && ./main -m ./models/komt-mistral-7b-v1/ggml-model-q4_1.gguf -p "[INST]인삼은 어떤 효과가 있는가요? [/INST]"
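The command above wraps the question in `[INST]` and `[/INST]` tags before passing it to llama.cpp. The following is a minimal sketch of that wrapping in Python. The helper names `build_prompt` and `build_command` are hypothetical, and the tag layout is copied from the command rather than from a documented template. The sketch only builds and prints the command; it does not run the model.

```python
import shlex


def build_prompt(question: str) -> str:
    """Wrap a question in the [INST] ... [/INST] tags seen in the usage command."""
    return f"[INST]{question} [/INST]"


def build_command(question: str, model_path: str) -> list:
    """Build the argument list for llama.cpp's ./main binary (not executed here)."""
    return ["./main", "-m", model_path, "-p", build_prompt(question)]


cmd = build_command(
    "인삼은 어떤 효과가 있는가요?",  # "What are the effects of ginseng?"
    "./models/komt-mistral-7b-v1/ggml-model-q4_1.gguf",
)
# Print a shell-quoted version of the command for inspection.
print(shlex.join(cmd))
```

To run the model, `cmd` could be passed to `subprocess.run` from inside the llama.cpp directory after the binary has been built.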
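Output
[INST]인삼은 어떤 효과가 있는가요? [/INST]인삼에는 약 600개의 물질이 포함되어 있습니다. 그 중에서도 많은 연구들을 통해 효능을 확인한 것으로 알려진 3가지는 아래와 같습니다.
1. 혈압 조절: 각종 실험에서 인삼이 혈압을 조절하는데 효과를 나타냈으며, 특히 중국의 한 연구자들은 인삼을 사용해 40%의 혈압 감소를 보였습니다.
2. 위장 개선: 인삼은 흉터, 통증 등으로 고통받는 위장 질환을 일부나마 개선할 수 있는데, 이는 각종 실험들에서 확인된 것입니다.
3. 면역 강화: 인삼은 면역체계를 강화시키는데 효과가 있으며, 국내에서도 2014년부터는 식약처의 의약용품 수출증명제에 대한 최종적인 평가로 사용되고 있습니다.
위와 같은 효능을 갖춘 인삼은 많이 사용하는 건강식품의 원료로도 활용됩니다. [end of text]
English translation of the output:
[INST]What are the effects of ginseng? [/INST]Ginseng contains about 600 substances. Among them, the three whose efficacy is known to have been confirmed through many studies are as follows.
1. Blood pressure regulation: In various experiments, ginseng was shown to be effective in regulating blood pressure; in particular, researchers in China used ginseng and showed a 40% reduction in blood pressure.
2. Gastrointestinal improvement: Ginseng can partially improve gastrointestinal conditions that cause suffering from scarring, pain, and similar symptoms, and this has been confirmed in various experiments.
3. Immune enhancement: Ginseng is effective in strengthening the immune system, and in Korea it has been used since 2014 as the final evaluation for the Ministry of Food and Drug Safety's export certification system for medical products.
Ginseng, with the benefits above, is also used as an ingredient in widely consumed health foods. [end of text]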
Evaluation
For objective model evaluation, we initially used EleutherAI's lm-evaluation-harness but obtained unsatisfactory results. We therefore evaluated the models with ChatGPT, a widely used model, following the approach described in "Self-Alignment with Instruction Backtranslation" and "Three Ways of Using Large Language Models to Evaluate Chat".
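The general flow of scoring an answer with a chat model can be sketched as follows. This is an illustration only, not the evaluation script used for this model card. The prompt wording, the `Score: <number>` reply format, the integer 0 to 5 scale per answer, and the names `build_judge_prompt`, `parse_score`, and `judge` are all assumptions. The model call is passed in as a plain function so that no specific API is implied.

```python
import re
from typing import Callable, Optional


def build_judge_prompt(question: str, answer: str) -> str:
    """Build a grading request; this wording is hypothetical."""
    return (
        "Rate the following answer on a scale from 0 to 5.\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply in the form 'Score: <number>'."
    )


def parse_score(reply: str) -> Optional[int]:
    """Extract an integer score in 0..5 from the judge's reply, if present."""
    match = re.search(r"Score:\s*([0-5])\b", reply)
    return int(match.group(1)) if match else None


def judge(question: str, answer: str, ask_model: Callable[[str], str]) -> Optional[int]:
    """Send the grading prompt through `ask_model` and parse the reply."""
    return parse_score(ask_model(build_judge_prompt(question, answer)))


# Stand-in for a real chat-model call, used only to show the flow.
def fake_model(prompt: str) -> str:
    return "Score: 4"


print(judge("What is 2 + 2?", "4", fake_model))
```

Returning `None` for replies without a parsable score keeps malformed judge outputs from being silently counted as zero.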
model | score | average (0~5) | percentage |
---|---|---|---|
gpt-3.5-turbo(closed) | 147 | 3.97 | 79.45% |
naver Cue(closed) | 140 | 3.78 | 75.67% |
clova X(closed) | 136 | 3.67 | 73.51% |
WizardLM-13B-V1.2(open) | 96 | 2.59 | 51.89% |
Llama-2-7b-chat-hf(open) | 67 | 1.81 | 36.21% |
Llama-2-13b-chat-hf(open) | 73 | 1.91 | 38.37% |
nlpai-lab/kullm-polyglot-12.8b-v2(open) | 70 | 1.89 | 37.83% |
kfkas/Llama-2-ko-7b-Chat(open) | 96 | 2.59 | 51.89% |
beomi/KoAlpaca-Polyglot-12.8B(open) | 100 | 2.70 | 54.05% |
komt-llama2-7b-v1 (open)(ours) | 117 | 3.16 | 63.24% |
komt-llama2-13b-v1 (open)(ours) | 129 | 3.48 | 69.72% |
komt-llama-30b-v1 (open)(ours) | 129 | 3.16 | 63.24% |
komt-mistral-7b-v1 (open)(ours) | 131 | 3.54 | 70.81% |
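The three numeric columns appear to be related: the average is the total score divided by the number of evaluation questions, and the percentage is the average expressed as a share of the maximum of 5. The sketch below reproduces that arithmetic. The question count of 37 is an assumption inferred from the table (for example, 147 / 37 ≈ 3.97); the source does not state it. Most rows fit this relationship, and the table's percentages look truncated rather than rounded, so the last digit can differ.

```python
NUM_QUESTIONS = 37  # assumption: inferred from the table, not stated in the source
MAX_SCORE = 5       # upper end of the 0~5 scale in the table header


def average(total: int, n: int = NUM_QUESTIONS) -> float:
    """Mean score per question."""
    return total / n


def percentage(total: int, n: int = NUM_QUESTIONS) -> float:
    """Total score as a percentage of the maximum attainable score."""
    return total / (n * MAX_SCORE) * 100


for name, total in [("gpt-3.5-turbo", 147), ("komt-mistral-7b-v1", 131)]:
    print(f"{name}: average={average(total):.2f}, percentage={percentage(total):.2f}%")
```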