 ---
 license: llama2
 ---
+# youri-2x7b_v0.2
+This model is a Mixture of Experts (MoE) merge of the following two models:
+- [rinna/youri-7b-instruction](https://huggingface.co/rinna/youri-7b-instruction)
+- [rinna/youri-7b-chat](https://huggingface.co/rinna/youri-7b-chat)
+## 🧩 Configuration
+The model was built with a custom version of the [mergekit](https://github.com/cg123/mergekit) library (mixtral branch) using the following configuration:
+```yaml
+base_model: rinna/youri-7b-chat
+gate_mode: hidden # one of "hidden", "cheap_embed", or "random"
+dtype: bfloat16 # output dtype (float32, float16, or bfloat16)
+experts:
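+  - source_model: rinna/youri-7b-chat
+    positive_prompts:
+      # EN: "Given a question and answer choices as input, select the answer from the choices."
+      - "質問と回答の選択肢を入力として受け取り、選択肢から回答を選択してください。"
+      # EN: "Classify the relation between the premise and the hypothesis as entailment, contradiction, or neutral."
+      - "前提と仮説の関係を含意、矛盾、中立の中から回答してください。"
+      # EN: "Classify the following text into either the positive or the negative sentiment class."
+      - "以下のテキストを、ポジティブまたはネガティブの感情クラスのいずれかに分類してください。"
+      # EN: "Work out the answer to the given problem step by step."
+      - "与えられた問題に対して、ステップごとに答えを導き出してください。"
+  - source_model: rinna/youri-7b-instruction
+    positive_prompts:
+      # EN: "Extract a one-word answer to the question from the title and passage. Answer with a noun."
+      - "質問に対する回答を題名と文章から一言で抽出してください。回答は名詞で答えてください。"
+      # EN: "Summarize the given news article."
+      - "与えられたニュース記事を要約してください。"
+      # EN: "Answer whether the given sentence is grammatical."
+      - "与えられた文が文法的であるかを回答してください。"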
+```
+The `positive_prompts` in the configuration above are taken from the instructions of the benchmarks on which each source model performs best.
+For per-model benchmark results, see [rinna's LM Benchmark](https://rinnakk.github.io/research/benchmarks/lm/index.html).
+The results show where each individual model performs particularly well and can guide how the merged model is applied to different natural language processing tasks.
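To make the role of `positive_prompts` more concrete, here is a toy sketch of prompt-based gating. With `gate_mode: hidden`, mergekit derives the MoE gate parameters from hidden-state representations of the prompts; the snippet below only illustrates that general idea and does not reproduce mergekit's actual computation. The 3-dimensional vectors are made-up stand-ins for real hidden states, and the helper names (`build_gate`, `route`) are hypothetical.

```python
import math

def mean_vector(vectors):
    """Average a list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def build_gate(prompt_states_per_expert):
    """One gate row per expert: the mean hidden state of its positive prompts."""
    return [mean_vector(states) for states in prompt_states_per_expert]

def route(gate, hidden_state):
    """Score a token's hidden state against every gate row, then normalize."""
    scores = [sum(g * h for g, h in zip(row, hidden_state)) for row in gate]
    return softmax(scores)

# Made-up vectors standing in for the hidden states of each expert's prompts.
chat_prompt_states = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
instruction_prompt_states = [[0.0, 1.0, 0.0], [0.0, 0.8, 0.2]]

gate = build_gate([chat_prompt_states, instruction_prompt_states])

# A token whose hidden state resembles the chat expert's prompts
# receives a larger routing weight for that expert.
weights = route(gate, [0.9, 0.1, 0.0])
print(weights)
```

In this toy setup, inputs that resemble an expert's positive prompts score higher against that expert's gate row, which is why the prompts were chosen from tasks each source model handles well.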
+## 💻 Usage
+Here's a [Colab notebook](https://colab.research.google.com/drive/1k6C_oJfEKUq0mtuWKisvoeMHxTcIxWRa?usp=sharing) to run the model in 4-bit precision on a free T4 GPU.
+```python
+# Notebook (Colab/Jupyter) cell syntax; in a plain shell, drop the leading "!"
+!pip install -q --upgrade transformers einops accelerate bitsandbytes
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model_name = "HachiML/youri-2x7b_v0.2"
+torch.set_default_device("cuda")
+# Load the model and tokenizer
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype="auto",
+    load_in_4bit=True,
+    trust_remote_code=True
+)
+tokenizer = AutoTokenizer.from_pretrained(
+    model_name,
+    trust_remote_code=True
+)
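+# Create input
+# Instruction (English: "Translate the following Japanese into English.")
+instruction = "次の日本語を英語に翻訳してください。"
+# Text to translate: a Japanese definition of "large language model"
+input_text = "大規模言語モデル（だいきぼげんごモデル、英: large language model、LLM）は、多数のパラメータ（数千万から数十億）を持つ人工ニューラルネットワークで構成されるコンピュータ言語モデルで、膨大なラベルなしテキストを使用して自己教師あり学習または半教師あり学習によって訓練が行われる。"
+# The Japanese header reads: "Below is a combination of an instruction describing a task
+# and an input with context. Write a response that appropriately satisfies the request."
+prompt = f"""
+以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。
+### 指示:
+{instruction}
+### 入力:
+{input_text}
+### 応答:
+"""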
+# Tokenize the input string
+token_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
+# Generate text using the model
+with torch.no_grad():
+    output_ids = model.generate(
+        token_ids.to(model.device),
+        max_new_tokens=200,
+        do_sample=True,
+        temperature=0.5,
+        pad_token_id=tokenizer.pad_token_id,
+        bos_token_id=tokenizer.bos_token_id,
+        eos_token_id=tokenizer.eos_token_id
+    )
+# Decode and print the output
+output = tokenizer.decode(output_ids.tolist()[0])
+print(output)
+```
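The decoded output above contains the prompt followed by the generated text. As a minimal sketch, the prompt template can be wrapped in a helper and the response extracted by splitting on the `### 応答:` marker. The function names `build_prompt` and `extract_response` are hypothetical, and `"</s>"` as the end-of-sequence token is an assumption; check `tokenizer.eos_token` for the actual value.

```python
RESPONSE_MARKER = "### 応答:"

def build_prompt(instruction, input_text):
    """Fill the instruction/input template used in the usage example."""
    return (
        "\n以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。"
        "要求を適切に満たす応答を書きなさい。\n"
        f"### 指示:\n{instruction}\n"
        f"### 入力:\n{input_text}\n"
        f"{RESPONSE_MARKER}\n"
    )

def extract_response(decoded, eos_token="</s>"):
    """Return only the text after the response marker, without the EOS token.

    eos_token defaults to "</s>", which is an assumption about the tokenizer.
    """
    response = decoded.split(RESPONSE_MARKER, 1)[-1]
    return response.replace(eos_token, "").strip()

prompt = build_prompt("次の日本語を英語に翻訳してください。", "こんにちは。")

# Stand-in for the string returned by tokenizer.decode(...): prompt + generation.
decoded = prompt + "Hello.</s>"
print(extract_response(decoded))  # prints "Hello."
```

Splitting on the first occurrence of the marker keeps the helper simple; it assumes the decoded text reproduces the marker exactly as it appears in the prompt.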