dahara1 committed · Commit e1c6b2d · verified · 1 Parent(s): be4db5b

Update README.md

Files changed (1):
  1. README.md +113 -0
README.md CHANGED
@@ -644,6 +644,119 @@ Muzan: "Is there anything else you want to say?"
  Wakuraba: "This guy is going to be killed too. Everything depends on this guy's mood. I'm going to die too."
  ```

+ ## SpeedUp Sample
+
+ unslothを使う事で精度をわずかに犠牲にして実行速度を上げる事ができます。
+ Using unsloth increases execution speed at the cost of a slight loss in accuracy.
+
+ ```
+ pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
+ pip install transformers==4.43.3
+ pip install bitsandbytes==0.43.3
+ pip install accelerate==0.33.0
+ pip install peft==0.12.0
+ pip install flash-attn --no-build-isolation
+ pip install --upgrade pip
+ # Pick the unsloth build that matches your GPU: the first works on any CUDA 12.1 setup,
+ # the second adds optimizations for Ampere or newer GPUs (RTX 30xx / A100 and later).
+ python -m pip install "unsloth[cu121-torch230] @ git+https://github.com/unslothai/unsloth.git"
+ pip install "unsloth[cu121-ampere-torch230] @ git+https://github.com/unslothai/unsloth.git"
+ ```
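+
+ Before running the sample it may help to verify that the CUDA build of PyTorch is active and that flash-attn imports cleanly. This quick optional check is not part of the original instructions:
+
+ ```
+ python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
+ python -c "import flash_attn; print('flash-attn OK')"
+ ```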
+
+ ```
+ import os
+ import time
+
+ import torch
+ # import unsloth before transformers so its optimizations are applied
+ from unsloth import FastLanguageModel
+ from transformers import TextStreamer
+
+ os.environ["TOKENIZERS_PARALLELISM"] = "false"
+
+ max_seq_length = 2048
+ load_in_4bit = True
+ dtype = torch.bfloat16
+
+ adp_name = "webbigdata/C3TR-Adapter"
+ model_name = "unsloth/gemma-2-9b-it"  # base model used by the adapter (for reference)
+
+ # Load the adapter together with its 4bit base model via unsloth
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     adp_name,
+     max_seq_length = max_seq_length,
+     dtype = dtype,
+     load_in_4bit = load_in_4bit,
+ )
+
+ # Switch unsloth into its optimized inference mode
+ FastLanguageModel.for_inference(model)
+
+
+ class CountingStreamer(TextStreamer):
+     """TextStreamer that also counts the tokens pushed to it by generate().
+
+     The count includes the prompt chunk; pass skip_prompt=True to
+     TextStreamer if you want to exclude it.
+     """
+
+     def __init__(self, tokenizer):
+         super().__init__(tokenizer)
+         self.tokenizer = tokenizer
+         self.token_count = 0
+
+     def put(self, text):
+         # generate() streams token-id tensors; handle lists/strings defensively
+         if isinstance(text, torch.Tensor):
+             self.token_count += text.shape[-1]
+         elif isinstance(text, list):
+             self.token_count += len(text)
+         elif isinstance(text, str):
+             self.token_count += len(self.tokenizer.encode(text, add_special_tokens=False))
+         else:
+             raise TypeError(f"Unexpected type for text: {type(text)}")
+         super().put(text)
+
+
+ def trans(instruction, input):
+     system = """You are a highly skilled professional Japanese-English and English-Japanese translator. Translate the given text accurately, taking into account the context and specific instructions provided. Steps may include hints enclosed in square brackets [] with the key and value separated by a colon:. Only when the subject is specified in the Japanese sentence, the subject will be added when translating into English. If no additional instructions or context are provided, use your expertise to consider what the most appropriate context is and provide a natural translation that aligns with that context. When translating, strive to faithfully reflect the meaning and tone of the original text, pay attention to cultural nuances and differences in language usage, and ensure that the translation is grammatically correct and easy to read. After completing the translation, review it once more to check for errors or unnatural expressions. For technical terms and proper nouns, either leave them in the original language or use appropriate translations as necessary. Take a deep breath, calm down, and start translating."""
+     prompt = f"""{system}
+
+ <start_of_turn>### Instruction:
+ {instruction}
+
+ ### Input:
+ {input}
+ <end_of_turn>
+ <start_of_turn>### Response:
+ """
+
+     inputs = tokenizer(prompt, return_tensors="pt",
+                        padding=True, max_length=2400, truncation=True).to("cuda")
+
+     counting_streamer = CountingStreamer(tokenizer)
+     start_time = time.time()
+
+     _ = model.generate(**inputs, streamer = counting_streamer, max_new_tokens=2400,
+                        #min_length=1000,
+                        early_stopping=False)
+
+     end_time = time.time()
+     elapsed_time = end_time - start_time
+     generated_tokens = counting_streamer.token_count
+
+     print(f"generated_tokens: {generated_tokens}")
+     print(f"elapsed_time: {elapsed_time}")
+
+     tokens_per_second = generated_tokens / elapsed_time if elapsed_time > 0 else 0
+     print(f"トークン生成速度: {tokens_per_second:.2f} トークン/秒")  # token generation speed (tokens/sec)
+     return tokens_per_second
+
+
+ tokens_per_second = trans("Translate English to Japanese.\nWhen translating, please use the following hints:\n[writing_style: journalistic]",
+ """Tech war: China narrows AI gap with US despite chip restrictions
+
+ China is narrowing the artificial intelligence (AI) gap with the US through rapid progress in deploying applications and state-backed adoption of the technology, despite the lack of access to advanced chips, according to industry experts and analysts.
+ """)
+ ```
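+
+ Running this script streams the Japanese translation to the console as it is generated, then prints the token count, the elapsed time, and the resulting tokens-per-second figure, which you can compare against the same prompt run with the non-unsloth sample shown earlier in this README.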
+
+
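+ The note below warns against merging this adapter into the base weights. If you prefer to serve the model with plain transformers + peft instead of unsloth, here is a minimal sketch of loading the adapter without merging (standard peft API; the loading choices are illustrative, not taken from this README):
+
+ ```
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+
+ # Load the 4bit base model; keep the adapter as a separate module.
+ base = AutoModelForCausalLM.from_pretrained(
+     "unsloth/gemma-2-9b-it-bnb-4bit",
+     device_map="auto",
+ )
+ tokenizer = AutoTokenizer.from_pretrained("unsloth/gemma-2-9b-it-bnb-4bit")
+
+ # Attach the adapter. Do NOT call merge_and_unload() or save a merged copy:
+ # merging is exactly what the note below warns against.
+ model = PeftModel.from_pretrained(base, "webbigdata/C3TR-Adapter")
+ ```
+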
  ## 留意事項 Attention
 
  このアダプターをモデルとマージして保存すると性能が下がってしまう不具合が存在するため、**ベースモデル(unsloth/gemma-2-9b-it-bnb-4bit)とアダプターをマージして保存しないでください**
  There is a known issue where merging this adapter into the model and saving the result degrades performance, so **do not merge the adapter into the base model (unsloth/gemma-2-9b-it-bnb-4bit) and save it**.