|
--- |
|
base_model: llm-jp/llm-jp-3-13b |
|
library_name: transformers |
|
language: |
|
- ja |
|
--- |
|
|
|
# Model Card for llm-jp-3-13b-it |
|
|
|
This model was developed to fulfill the completion requirement of the Matsuo Lab LLM2024 course. |
|
|
|
|
|
<!-- ## Model Details --> |
|
|
|
### Model Description |
|
|
|
<!-- Provide a longer summary of what this model is. --> |
|
|
|
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card was automatically generated. |
|
|
|
<!-- - **Developed by:** [More Information Needed] --> |
|
<!-- - **Funded by [optional]:** [More Information Needed] --> |
|
<!-- - **Shared by [optional]:** [More Information Needed] --> |
|
<!-- - **Model type:** [More Information Needed] --> |
|
- **License:** [More Information Needed] |
|
- **Finetuned from model:** llm-jp/llm-jp-3-13b |
|
|
|
### Model Sources [optional] |
|
|
|
<!-- Provide the basic links for the model. --> |
|
|
|
|
<!-- - **Repository:** [More Information Needed] --> |
|
<!-- - **Paper [optional]:** [More Information Needed] --> |
|
<!-- - **Demo [optional]:** [More Information Needed] --> |
|
|
|
<!-- ## Uses --> |
|
|
|
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. --> |
|
|
|
<!-- ### Direct Use --> |
|
|
|
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. --> |
|
|
|
<!-- [More Information Needed] --> |
|
|
|
<!-- ### Downstream Use [optional] --> |
|
|
|
<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app --> |
|
|
|
<!-- [More Information Needed] --> |
|
|
|
<!-- ### Out-of-Scope Use --> |
|
|
|
<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. --> |
|
|
|
<!-- [More Information Needed] --> |
|
|
|
<!-- ## Bias, Risks, and Limitations --> |
|
|
|
<!-- This section is meant to convey both technical and sociotechnical limitations. --> |
|
|
|
<!-- [More Information Needed] --> |
|
|
|
<!-- ### Recommendations --> |
|
|
|
<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. --> |
|
|
|
<!-- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. --> |
|
|
|
<!-- ## How to Get Started with the Model --> |
|
|
|
<!-- Use the code below to get started with the model. --> |
|
|
|
<!-- [More Information Needed] --> |
|
|
|
## Training Details |
|
|
|
### Training Data |
|
- **Base Model:** llm-jp/llm-jp-3-13b |
|
- **Data for Instruction Tuning:** ichikara-instruction (ichikara-instruction-003-001-1.json) |
|
- **Data for DPO:** [elyza/ELYZA-tasks-100](https://huggingface.co/datasets/elyza/ELYZA-tasks-100) |
|
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. --> |
|
|
|
### Training Procedure |
|
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. --> |
|
1. Fine-tune the base model with instruction tuning |

2. Perform DPO on the fine-tuned model with generated data (a sketch of this step follows the list below) |

- Three similar prompts are generated for each sample prompt in the DPO data |

- The fine-tuned model is used to generate two answers for each prompt |

- Due to the time limitation, the first generated answer is labelled as the chosen answer |
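
A minimal sketch of this DPO step, assuming trl's `DPOTrainer` and its prompt/chosen/rejected column convention. `elyza_prompts`, `augment_prompt`, and `sample_answer` are hypothetical placeholders, and the hyperparameters are illustrative rather than the ones actually used; exact `DPOTrainer` arguments also vary across trl versions.

```python
# Hedged sketch of the DPO step described above, not the exact training script.
# `elyza_prompts`, `augment_prompt`, and `sample_answer` are hypothetical helpers;
# all hyperparameters are illustrative assumptions.
from datasets import Dataset
from transformers import TrainingArguments
from trl import DPOTrainer

pairs = []
for seed_prompt in elyza_prompts:                       # prompts from elyza/ELYZA-tasks-100
    for prompt in augment_prompt(seed_prompt, n=3):     # 3 similar prompts per sample prompt
        first = sample_answer(model, tokenizer, prompt)   # two answers from the SFT model
        second = sample_answer(model, tokenizer, prompt)
        # Due to the time limitation, the first answer is labelled as "chosen"
        pairs.append({"prompt": prompt, "chosen": first, "rejected": second})

dpo_dataset = Dataset.from_list(pairs)

dpo_trainer = DPOTrainer(
    model=model,            # the instruction-tuned LoRA model from step 1
    ref_model=None,         # with a PEFT model, trl can use the frozen base as reference
    beta=0.1,               # strength of the implicit KL penalty (common default)
    train_dataset=dpo_dataset,
    tokenizer=tokenizer,
    args=TrainingArguments(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=5e-6,
        output_dir="outputs_dpo",
        report_to="none",
    ),
)
dpo_trainer.train()
```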
|
|
|
|
|
#### Training Hyperparameters |
|
SFT for instruction tuning |
|
```python
from unsloth import FastLanguageModel

max_seq_length = 512
dtype = None          # auto-detect (bfloat16 on Ampere+, otherwise float16)
load_in_4bit = True   # QLoRA: load the base model in 4-bit

model_id = "llm-jp/llm-jp-3-13b"
new_model_id = "llm-jp-3-13b-it"

# Load the base model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_id,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    trust_remote_code=True,
    device_map="auto",
)

# Attach LoRA adapters to the attention and MLP projections
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # 32 in the commented trials below
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=32,
    lora_dropout=0,  # 0.05 in the commented trials below
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
    use_rslora=False,
    loftq_config=None,
    max_seq_length=max_seq_length,
)
```
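
The trainer below reads a `formatted_text` column from `dataset`. A minimal sketch of how that column might be built from the ichikara-instruction JSON follows; the file path, field names, and the Japanese prompt template are assumptions based on the dataset's common format, not the exact preprocessing script.

```python
# Hedged sketch: build the `formatted_text` column consumed by SFTTrainer below.
# The file path, field names ("text", "output"), and the prompt template are
# assumptions, not the exact preprocessing script.
from datasets import load_dataset

dataset = load_dataset("json", data_files="ichikara-instruction-003-001-1.json")

PROMPT_TEMPLATE = """### 指示
{instruction}
### 回答
{output}"""

def format_example(example):
    # Combine the instruction and its reference answer into one training text,
    # terminated with EOS so the model learns where to stop generating.
    example["formatted_text"] = PROMPT_TEMPLATE.format(
        instruction=example["text"],
        output=example["output"],
    ) + tokenizer.eos_token
    return example

dataset = dataset.map(format_example)
```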
|
Trainer settings |
|
```python
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset["train"],
    max_seq_length=max_seq_length,
    dataset_text_field="formatted_text",  # built in the data preparation sketch above
    packing=False,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,  # effective batch size of 8
        num_train_epochs=1,
        logging_steps=10,
        warmup_steps=5,  # 10 in the commented trials below
        save_steps=100,
        save_total_limit=2,
        max_steps=-1,    # train for the full epoch
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        group_by_length=True,
        seed=3407,
        output_dir="outputs",
        report_to="none",
        # additional settings
        optim="adamw_8bit",
        weight_decay=0.01,
    ),
)
```
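
For reference, a minimal inference sketch using Unsloth's inference mode; the prompt template mirrors the one assumed for training, and the instruction and generation parameters are illustrative, not necessarily those used to produce the evaluation outputs.

```python
# Hedged inference sketch; the example instruction and generation parameters
# are illustrative assumptions.
from unsloth import FastLanguageModel

FastLanguageModel.for_inference(model)  # switch Unsloth to its faster inference mode

instruction = "日本で一番高い山は何ですか?"  # example instruction
prompt = f"### 指示\n{instruction}\n### 回答\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.2,
)
# Decode only the tokens generated after the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```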
|
|
|
#### Experimental Trials |
|
<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. --> |
|
**Instruction Tuning Only** (model × data) |

<br />(using the commented-out hyperparameter values in the code above) |
|
<br />01 - llm-jp-3-13b x ichikara-instruction-003-001-1.json (provided sample code, unmodified) |
|
<br />02 - llm-jp-3-13b x ichikara-instruction-003-002-1.json |
|
<br />03 - Llama-3.1-8B-Instruct-bnb-4bit x ichikara-instruction-003-001-1.json |
|
<br />04 - Llama-3.2-8B-Instruct-bnb-4bit x ichikara-instruction-003-001-1.json |
|
<br />05 - gemma-2-9b-bnb-4bit x ichikara-instruction-003-001-1.json |
|
<br />09 - llm-jp-3-13b x kunishou/databricks-dolly-15k-ja |
|
|
|
(using the uncommented hyperparameter values in the code above) |
|
<br />00 - llm-jp-3-13b x ichikara-instruction-003-001-1.json |
|
<br />06 - gemma-2-9b-bnb-4bit x ichikara-instruction-003-001-1.json |
|
<br />07 - llm-jp-3-13b x ichikara-instruction-003-001-1.json |
|
<br />08 - llm-jp-3-13b x ichikara-instruction-003-001-1.json (with max_steps = 150) |
|
<br />10 - gemma-2-9b-bnb-4bit x kunishou/databricks-dolly-15k-ja |
|
|
|
**Instruction Tuning + DPO** |
|
<br />11 - 00 + DPO |
|
<br />12 - 06 + DPO |
|
|
|
|
|
|
## Evaluation |
|
|
|
<!-- This section describes the evaluation protocols and provides the results. --> |
|
|
|
<!-- ### Testing Data, Factors & Metrics --> |
|
|
|
#### Testing Data |
|
|
|
<!-- This should link to a Dataset Card if possible. --> |
|
|
|
The final performance of the model was evaluated on the elyza-tasks-100-TV dataset. |
|
|
|
|
|
#### Metrics |
|
<!-- These are the evaluation metrics being used, ideally with a description of why. --> |
|
|
|
The scores below were obtained by uploading the model outputs to the course management system. |
|
|
|
### Results |
|
| Trial | Score |
| ----- | ----- |
| 00 | 3.04 |
| 01 | 3.00 |
| 02 | 2.71 |
| 03 | 2.52 |
| 04 | 2.40 |
| 05 | 2.71 |
| 06 | 2.72 |
| 07 | 2.93 |
| 08 | 2.87 |
| 09 | 2.20 |
| 10 | 2.40 |
| 11 | 2.34 |
| 12 | 2.28 |
|
|
|
|
|
#### Summary |
|
This model is the result of trial 11 of the competition, which scored 2.34 on the course evaluation system. |
|
|
|
|
|
### Model Architecture and Objective |
|
|
|
The model inherits the architecture of llm-jp/llm-jp-3-13b, a 13B-parameter decoder-only transformer. LoRA adapters were trained with a causal language-modeling objective during instruction tuning, followed by direct preference optimization (DPO). |
|
|
|
### Compute Infrastructure |
|
|
|
The model was trained using T4/L4/A100 GPUs on Google Colaboratory. |
|
|
|
|
|
<!-- ## Glossary [optional] --> |
|
|
|
<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. --> |
|
|
|
<!-- [More Information Needed] --> |
|
|
|
<!-- ## More Information [optional] --> |
|
|
|
<!-- [More Information Needed] --> |
|
|
|
<!-- ## Model Card Authors [optional] --> |
|
|
|
<!-- [More Information Needed] --> |
|
|
|
<!-- ## Model Card Contact --> |
|
|
|
<!-- [More Information Needed] --> |