HachiML/myBit-Llama2-jp-127M-4

Text Generation

Generated from Trainer


HachiML committed on Mar 24, 2024

Commit 9e18fc7 (verified) · 1 Parent(s): fc8edc9

Update README.md


README.md +44 -4


@@ -11,13 +11,53 @@ should probably proofread and complete it, then remove this comment. -->
 # myBit-Llama2-jp-127M-4
-This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
-It achieves the following results on the evaluation set:
 - Loss: 2.9790
 ## Model description
-More information needed
 ## Intended uses & limitations
@@ -25,7 +65,7 @@ More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure

 # myBit-Llama2-jp-127M-4
+This is a Bit-Llama2 model with 127M parameters.
+It was pre-trained for only 1 epoch on a Japanese dataset.
+The dataset used is [range3/wiki40b-ja](https://huggingface.co/datasets/range3/wiki40b-ja).
 - Loss: 2.9790
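
The reported loss can be converted to a perplexity for easier comparison. This is a minimal sketch that assumes the value is a mean per-token cross-entropy in nats (the usual convention for causal language model training with the Trainer); the card itself does not state the unit, and the variable names are illustrative.

```python
import math

# Assumption: 2.9790 is the mean cross-entropy per token, measured in nats.
eval_loss = 2.9790

# Perplexity is the exponential of the mean per-token cross-entropy.
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.2f}")
```

Under that assumption the loss corresponds to a perplexity of roughly 19.7, measured over the model's own tokenizer vocabulary.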
 ## Model description
+GitHub: [BitNet-b158](https://github.com/Hajime-Y/BitNet-b158)
+More information about this model can be found on the following page:
+[Implementation of BitNet & BitNet b158](https://note.com/hatti8/n/nc6890e79a19a)
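
For readers unfamiliar with the BitNet b1.58 idea referenced above, the following is a rough sketch of absmean ternary weight quantization as described in the BitNet b1.58 paper. It is an illustration only: the function name `absmean_ternary_quantize` is hypothetical, the code is plain Python rather than PyTorch, and it is not taken from the `mybitnet` library, whose implementation may differ.

```python
from typing import List, Tuple


def absmean_ternary_quantize(
    weights: List[List[float]], eps: float = 1e-5
) -> Tuple[List[List[int]], float]:
    """Quantize a weight matrix to {-1, 0, +1} with an absmean scale.

    Hypothetical helper for illustration; not the mybitnet API.
    """
    # Scale = mean absolute value of all weights (eps avoids division by zero).
    flat = [abs(w) for row in weights for w in row]
    scale = sum(flat) / len(flat) + eps

    # Divide by the scale, round to the nearest integer, clip to [-1, 1].
    quantized = [
        [max(-1, min(1, round(w / scale))) for w in row] for row in weights
    ]
    return quantized, scale


example = [[0.9, -0.05], [-1.2, 0.4]]
ternary, scale = absmean_ternary_quantize(example)
print(ternary, scale)
```

Each weight then takes one of three values, which is where the "1.58 bits" (log2 of 3) in the name comes from; the scale is kept so that outputs can be rescaled after the matrix multiplication.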
+## How to use
+1. Install the libraries (the `!` prefix is for notebook cells; omit it in a regular shell)
+```
+!pip install mybitnet
+!pip install -U accelerate transformers==4.38.2
+!pip install torch
+```
+2. Load the tokenizer and model
+```
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+model_name = "HachiML/myBit-Llama2-jp-127M-4"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+# trust_remote_code=True allows the custom model code in the repository to be loaded.
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+    trust_remote_code=True,
+)
+print(model)  # show the module structure
+```
+3. Run inference
+```
+prompt = "昔々あるところに、"  # "Once upon a time, in a certain place,"
+input_ids = tokenizer.encode(
+    prompt,
+    return_tensors="pt"
+)
+tokens = model.generate(
+    input_ids.to(device=model.device),
+    max_new_tokens=128,
+)
+out = tokenizer.decode(tokens[0], skip_special_tokens=True)
+print(out)
+```
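
The `generate` call above passes only `max_new_tokens`. Assuming the repository's generation config leaves sampling disabled (not confirmed by this card), decoding is greedy: at each step the highest-scoring next token is appended. The toy sketch below illustrates that loop with a hypothetical lookup table standing in for the model; `greedy_decode` and `toy_scores` are illustrative names, not part of `transformers`.

```python
from typing import Callable, Dict, List


def greedy_decode(
    next_token_scores: Callable[[List[int]], Dict[int, float]],
    prompt_ids: List[int],
    max_new_tokens: int,
    eos_id: int,
) -> List[int]:
    """Append the highest-scoring token until EOS or the token budget is hit."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        scores = next_token_scores(ids)
        next_id = max(scores, key=scores.get)  # argmax over candidate tokens
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids


# Toy stand-in for a language model: scores depend only on the last token.
TABLE = {
    0: {1: 0.9, 2: 0.1},
    1: {2: 0.8, 3: 0.2},
    2: {3: 0.7, 0: 0.3},
    3: {3: 1.0},
}


def toy_scores(ids: List[int]) -> Dict[int, float]:
    return TABLE[ids[-1]]


full = greedy_decode(toy_scores, [0], max_new_tokens=128, eos_id=3)
truncated = greedy_decode(toy_scores, [0], max_new_tokens=2, eos_id=3)
print(full, truncated)
```

Generation stops either at the end-of-sequence token or when `max_new_tokens` is exhausted, which is why the 128-token budget in the snippet above is an upper bound rather than a fixed output length.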
 ## Intended uses & limitations
 ## Training and evaluation data
+ - [range3/wiki40b-ja](https://huggingface.co/datasets/range3/wiki40b-ja)
 ## Training procedure