Update README.md
README.md
@@ -8,31 +8,34 @@ language:
pipeline_tag: translation
library_name: transformers
---
# **NanoTranslator-XS**

English | [简体中文](README_zh-CN.md)

## Introduction
This is the **small** model of NanoTranslator, which currently supports **English to Chinese** translation only.

The ONNX version of the model is also available in the repository.

| Size | P. | Arch. | Act. | V. | H. | I. | L. | A.H. | K.H. | Tie |
| :--: | :--: | :---: | :----: | :---: | :--: | :--: | :--: | :--: | :--: | :--: |
| XL | 100 | LLaMA | SwiGLU | 16000 | 768 | 4096 | 8 | 24 | 8 | True |
| L | 78 | LLaMA | GeGLU | 16000 | 768 | 4096 | 6 | 24 | 8 | True |
| M2 | 22 | Qwen2 | GeGLU | 4000 | 432 | 2304 | 6 | 24 | 8 | True |
| M | 22 | LLaMA | SwiGLU | 8000 | 256 | 1408 | 16 | 16 | 4 | True |
| S | 9 | LLaMA | SwiGLU | 4000 | 168 | 896 | 16 | 12 | 4 | True |
| XS | 2 | LLaMA | SwiGLU | 2000 | 96 | 512 | 12 | 12 | 4 | True |

- **P.** - parameters (in millions)
- **V.** - vocab size
- **H.** - hidden size
- **I.** - intermediate size
- **L.** - num layers
- **A.H.** - num attention heads
- **K.H.** - num kv heads
- **Tie** - tie word embeddings
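
To make the column abbreviations concrete, the sketch below shows roughly how the **XS** row could be expressed as a `transformers` `LlamaConfig`. This is an illustration only, not the repository's actual `config.json`: settings the table does not list (rope parameters, maximum context length, special token ids, etc.) are omitted or left at their defaults.

```python
# Rough illustration of the XS row above as a LlamaConfig (assumed, not the real config.json).
from transformers import LlamaConfig

xs_config = LlamaConfig(
    vocab_size=2000,           # V.   - vocab size
    hidden_size=96,            # H.   - hidden size
    intermediate_size=512,     # I.   - intermediate size
    num_hidden_layers=12,      # L.   - num layers
    num_attention_heads=12,    # A.H. - num attention heads
    num_key_value_heads=4,     # K.H. - num kv heads
    hidden_act="silu",         # Act. - SwiGLU uses a SiLU gate
    tie_word_embeddings=True,  # Tie  - tie word embeddings
)
print(xs_config)
```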
@@ -50,7 +53,7 @@ Prompt format as follows:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = 'Mxode/NanoTranslator-XS'

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```
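
The diff stops at the model-loading lines; the rest of the example (building the prompt and generating the translation) lies outside this hunk. As a hedged sketch only, generation with the objects loaded above would look roughly like the following, where the placeholder prompt must be replaced by the README's actual prompt format ("Prompt format as follows:"), which this excerpt does not show.

```python
# Hypothetical continuation, not part of this diff: reuses `tokenizer`, `model`,
# and `torch` from the snippet above.
text = "The weather is nice today."
prompt = text  # placeholder; build the real prompt with the README's template

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt tokens.
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(response)
```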
@@ -87,7 +90,7 @@ print(response)

It has been measured that inference with the ONNX model is **2-10 times faster** than inference directly with the transformers model.

You need to switch to the [onnx branch](https://huggingface.co/Mxode/NanoTranslator-XS/tree/onnx) manually and download the files to a local directory.
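
As a rough, hedged sketch (assuming the files on the onnx branch follow the layout expected by `optimum`'s ONNX Runtime wrapper; the reference docs below remain the authoritative guide), downloading the branch and loading it locally could look like this:

```python
# Sketch only, not from the README. Assumes the onnx branch can be loaded directly
# by optimum's ONNX Runtime wrapper and that tokenizer files are included there.
from huggingface_hub import snapshot_download
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

# Download the onnx branch to a local directory.
local_dir = snapshot_download("Mxode/NanoTranslator-XS", revision="onnx")

tokenizer = AutoTokenizer.from_pretrained(local_dir)
ort_model = ORTModelForCausalLM.from_pretrained(local_dir)
```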
reference docs: