Update README.md
README.md
@@ -8,31 +8,34 @@ language:
pipeline_tag: translation
library_name: transformers
---
# **NanoTranslator-XS**

English | [简体中文](README_zh-CN.md)

## Introduction
This is the **small** model of NanoTranslator, which currently supports **English to Chinese** translation only.

The ONNX version of the model is also available in the repository.

| Size | P. | Arch. | Act. | V. | H. | I. | L. | A.H. | K.H. | Tie |
| :--: | :--: | :---: | :----: | :---: | :--: | :--: | :--: | :--: | :--: | :--: |
| XL | 100 | LLaMA | SwiGLU | 16000 | 768 | 4096 | 8 | 24 | 8 | True |
| L | 78 | LLaMA | GeGLU | 16000 | 768 | 4096 | 6 | 24 | 8 | True |
| M2 | 22 | Qwen2 | GeGLU | 4000 | 432 | 2304 | 6 | 24 | 8 | True |
| M | 22 | LLaMA | SwiGLU | 8000 | 256 | 1408 | 16 | 16 | 4 | True |
| S | 9 | LLaMA | SwiGLU | 4000 | 168 | 896 | 16 | 12 | 4 | True |
| XS | 2 | LLaMA | SwiGLU | 2000 | 96 | 512 | 12 | 12 | 4 | True |

- **P.** - parameters (in millions)
- **V.** - vocab size
- **H.** - hidden size
- **I.** - intermediate size
- **L.** - num layers
- **A.H.** - num attention heads
- **K.H.** - num kv heads
- **Tie** - tie word embeddings
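
To make the column abbreviations concrete, the sketch below shows roughly how the **XS** row could be expressed as a `transformers` `LlamaConfig`. This is an illustration only, not the repository's actual `config.json`: settings the table does not list (rope parameters, maximum context length, special token ids, etc.) are omitted or left at their defaults.

```python
# Rough illustration of the XS row above as a LlamaConfig (assumed, not the real config.json).
from transformers import LlamaConfig

xs_config = LlamaConfig(
    vocab_size=2000,           # V.   - vocab size
    hidden_size=96,            # H.   - hidden size
    intermediate_size=512,     # I.   - intermediate size
    num_hidden_layers=12,      # L.   - num layers
    num_attention_heads=12,    # A.H. - num attention heads
    num_key_value_heads=4,     # K.H. - num kv heads
    hidden_act="silu",         # Act. - SwiGLU uses a SiLU gate
    tie_word_embeddings=True,  # Tie  - tie word embeddings
)
print(xs_config)
```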
@@ -50,7 +53,7 @@ Prompt format as follows:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = 'Mxode/NanoTranslator-XS'

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```
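
The diff stops at the model-loading lines; the rest of the example (building the prompt and generating the translation) lies outside this hunk. As a hedged sketch only, generation with the objects loaded above would look roughly like the following, where the placeholder prompt must be replaced by the README's actual prompt format ("Prompt format as follows:"), which this excerpt does not show.

```python
# Hypothetical continuation, not part of this diff: reuses `tokenizer`, `model`,
# and `torch` from the snippet above.
text = "The weather is nice today."
prompt = text  # placeholder; build the real prompt with the README's template

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt tokens.
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(response)
```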
@@ -87,7 +90,7 @@ print(response)

It has been measured that inference with the ONNX model is **2-10 times faster** than inference directly with the transformers model.

You need to switch to the [onnx branch](https://huggingface.co/Mxode/NanoTranslator-XS/tree/onnx) manually and download the files to a local directory.
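
As a rough, hedged sketch (assuming the files on the onnx branch follow the layout expected by `optimum`'s ONNX Runtime wrapper; the reference docs below remain the authoritative guide), downloading the branch and loading it locally could look like this:

```python
# Sketch only, not from the README. Assumes the onnx branch can be loaded directly
# by optimum's ONNX Runtime wrapper and that tokenizer files are included there.
from huggingface_hub import snapshot_download
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

# Download the onnx branch to a local directory.
local_dir = snapshot_download("Mxode/NanoTranslator-XS", revision="onnx")

tokenizer = AutoTokenizer.from_pretrained(local_dir)
ort_model = ORTModelForCausalLM.from_pretrained(local_dir)
```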
reference docs: