ikala/bloom-zh-3b-chat


ikala-ray committed on May 25, 2023

Commit cc293e4 · 1 Parent(s): 587bc22

Update README.md

Files changed (1)

README.md +2 -2


@@ -66,7 +66,7 @@ start generating the assistant reply.
 ## Dev Details
-- base model: [togethercomputer/RedPajama-INCITE-Base-3B-v1](https://huggingface.co/togethercomputer/RedPajama-INCITE-Base-3B-v1)
+- base model: [ckip-joint/bloom-3b-zh](https://huggingface.co/ckip-joint/bloom-3b-zh)
 - checkpoint: 1 epoch (6000 steps)
 command: `deepspeed trainer_sft.py --configs defaults stablelm-7b oasst-mix --cache_dir /home/ubuntu/data_cache --output_dir .saved/stable-lm-7b-1 --num_train_epochs 4 --deepspeed`
@@ -111,7 +111,7 @@ bloom-zh-3b:
   max_length: 5120
   warmup_steps: 2000
   gradient_checkpointing: true
-  gradient_accumulation_steps: 30
+  gradient_accumulation_steps: 32
   per_device_train_batch_size: 1
   per_device_eval_batch_size: 1
   eval_steps: 500
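
Note on the `gradient_accumulation_steps` change: with `per_device_train_batch_size: 1`, the number of sequences contributing to each optimizer step is the product of the per-device batch size, the accumulation steps, and the number of devices. The sketch below only illustrates that arithmetic. The helper name `effective_batch_size` and the `num_devices` parameter are hypothetical, and the device count is an assumption because the commit does not state it.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of sequences that contribute to one optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the config in this diff: per_device_train_batch_size = 1,
# gradient_accumulation_steps changed from 30 to 32.
old_per_device = effective_batch_size(1, 30)
new_per_device = effective_batch_size(1, 32)
print(old_per_device, new_per_device)  # 30 32

# Assumption for illustration only: 8 devices (not stated in the commit).
print(effective_batch_size(1, 30, num_devices=8),
      effective_batch_size(1, 32, num_devices=8))  # 240 256
```

The real effective batch size depends on how many devices the `deepspeed` launcher was given, which the README does not record.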