@@ -22,7 +22,48 @@ This llama model was trained 2x faster with [Unsloth](https://github.com/unsloth
 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
-# Inference method
 ◆ Inference on an L4 GPU in Google Colaboratory
@@ -53,7 +94,7 @@ This llama model was trained 2x faster with [Unsloth](https://github.com/unsloth
 14. Save the inference results as JSONL.
-# ◆ Post-training
 ◆ Obtain a Hugging Face token with write permission

 [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
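+# Overview of kazuHF/llm-jp-3-13b-it2_lora
+1. Model overview
+- Base model: llm-jp/llm-jp-3-13b (https://huggingface.co/llm-jp/llm-jp-3-13b)
+- Intended use: Japanese text generation in a Q&A format
+- Architecture: The frameworks, libraries, and techniques used include PyTorch, Transformers, Unsloth, trl, LoRA, xformers, and Flash Attention. Unsloth speeds up fine-tuning and inference and reduces memory use. llm-jp/llm-jp-3-13b was loaded with 4-bit quantization for LoRA and then post-trained with supervised fine-tuning (SFT).
+2. Post-training details
+- Post-training data: ichikara-instruction-003-001-1.json, used after submitting the required application.
+- Epochs: 1, batch size: 2, learning rate: 2e-4
+- Trained on an L4/A100 GPU in Google Colaboratory Pro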
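As a minimal sketch, the hyperparameters listed above could be expressed as Transformers `TrainingArguments`. This assumes that "batch size 2" means the per-device batch size; `output_dir` and the helper name `make_training_args` are hypothetical. The card does not state the optimizer, scheduler, or LoRA rank, so those are left at library defaults here.

```python
# Hyperparameters taken from the model card: 1 epoch, batch size 2, learning rate 2e-4.
# Assumption: "batch size" is the per-device batch size.
HYPERPARAMS = {
    "num_train_epochs": 1,
    "per_device_train_batch_size": 2,
    "learning_rate": 2e-4,
}


def make_training_args(output_dir: str = "outputs"):
    """Build TrainingArguments from HYPERPARAMS (output_dir is a hypothetical path)."""
    # Imported lazily so the dictionary above can be inspected without Transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

The resulting object can be passed to a trainer such as trl's `SFTTrainer` through its `args` parameter.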
+3. Model inputs and outputs
+- Training: the input key is "text" and the output key is "output".
+- Inference: the output keys are "task_id", "input", and "output".
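The inference output described above, one record per task with the keys "task_id", "input", and "output", might be serialized to JSONL roughly as follows. The helper names `to_jsonl` and `save_jsonl` are hypothetical and only illustrate the record layout; they are not taken from the author's notebook.

```python
import json

# Keys of one inference record, as listed in the model card.
KEYS = ("task_id", "input", "output")


def to_jsonl(records) -> str:
    """Serialize records to JSONL text: one JSON object per line."""
    lines = [
        # ensure_ascii=False keeps Japanese text readable instead of \u escapes.
        json.dumps({key: record[key] for key in KEYS}, ensure_ascii=False)
        for record in records
    ]
    return "\n".join(lines) + "\n"


def save_jsonl(records, path: str) -> None:
    """Write the JSONL text to a UTF-8 file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_jsonl(records))
```

Each line of the resulting file can be read back independently with `json.loads`.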
+4. Inference method
+- Specify the Hugging Face IDs as
+  model_id = "llm-jp/llm-jp-3-13b", adapter_id = "kazuHF/llm-jp-3-13b-it2_lora"
+and load the base model with
+  FastLanguageModel.from_pretrained( … model_id … )
+Then combine the base model with the LoRA adapter using
+  model = PeftModel.from_pretrained( … adapter_id … )
+and switch the model to inference mode with
+  FastLanguageModel.for_inference(model)
+Format the input as """###\n 指示 入力 \n### 回答\n""" (指示 = instruction, 入力 = input, 回答 = answer), tokenize it,
+run prediction with
+  model.generate( input_ids=…, attention_mask=…, … )
+and decode the result to obtain the output.
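The inference steps above can be sketched end to end as follows. This is an illustration under several assumptions, not the author's exact notebook: the prompt template is read as `### 指示\n{instruction}\n### 回答\n`, the base model is loaded in 4-bit, and the generation settings (`max_new_tokens`, greedy decoding) as well as the helper names `build_prompt`, `extract_answer`, `load_model`, and `generate_answer` are hypothetical.

```python
MODEL_ID = "llm-jp/llm-jp-3-13b"
ADAPTER_ID = "kazuHF/llm-jp-3-13b-it2_lora"
ANSWER_MARKER = "\n### 回答"


def build_prompt(instruction: str) -> str:
    """Wrap the instruction in the prompt template (assumed reading of the card)."""
    return f"### 指示\n{instruction}\n### 回答\n"


def extract_answer(decoded: str) -> str:
    """Keep only the text generated after the answer marker."""
    return decoded.split(ANSWER_MARKER)[-1].strip()


def load_model(hf_token=None):
    """Load the base model, attach the LoRA adapter, and switch to inference mode."""
    # Heavy imports are kept local; they need a GPU runtime such as Colab.
    from unsloth import FastLanguageModel
    from peft import PeftModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=MODEL_ID,
        dtype=None,          # let Unsloth pick the dtype
        load_in_4bit=True,   # assumption: 4-bit loading, as in training
        trust_remote_code=True,
    )
    model = PeftModel.from_pretrained(model, ADAPTER_ID, token=hf_token)
    FastLanguageModel.for_inference(model)
    return model, tokenizer


def generate_answer(model, tokenizer, instruction: str, max_new_tokens: int = 512) -> str:
    """Tokenize the prompt, generate, decode, and return the answer text."""
    inputs = tokenizer([build_prompt(instruction)], return_tensors="pt").to(model.device)
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,  # hypothetical setting
        do_sample=False,                # hypothetical setting: greedy decoding
        use_cache=True,
    )
    decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return extract_answer(decoded)
```

In a notebook, `model, tokenizer = load_model(hf_token)` would run once, after which `generate_answer(model, tokenizer, ...)` can be called for each task.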
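+5. License
+- Base model: llm-jp/llm-jp-3-13b, released by the Research and Development Center for Large Language Models at the National Institute of Informatics. This model inherits its Apache 2.0 license.
+- Post-training data: ichikara-instruction-003-001-1.json, released by the Language Information Access Technology Team at the RIKEN Center for Advanced Intelligence Project. This model inherits its CC-BY-NC-SA license.
+6. Known issues and possible improvements
+- The model sometimes answers prompts appropriately, but its answers are often short and it sometimes fails to answer at all, so more training data and further post-training are needed.
+7. Acknowledgments
+- This model was built by taking the Large Language Model Deep Learning Applied Course 2024|Fall, hosted by the Matsuo-Iwasawa Laboratory at the University of Tokyo. Sincere thanks go to everyone involved in the course and to the fellow participants.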
+# Inference method in detail
 ◆ Inference on an L4 GPU in Google Colaboratory
 14. Save the inference results as JSONL.
+# ◆ Post-training in detail
 ◆ Obtain a Hugging Face token with write permission