shibing624 committed
Commit f42a51d · verified · 1 Parent(s): 2cb063b

Update README.md

  1. README.md +64 -0
README.md CHANGED
@@ -121,6 +121,70 @@ print("Sentence embeddings:")
  print(sentence_embeddings)
  ```

+ ## Model speed up
+
+
+ | Model | ATEC | BQ | LCQMC | PAWSX | STSB |
+ |---|---|---|---|---|---|
+ | shibing624/text2vec-base-chinese (fp32, baseline) | 0.31928 | 0.42672 | 0.70157 | 0.17214 | 0.79296 |
+ | shibing624/text2vec-base-chinese (onnx-O4, [#29](https://huggingface.co/shibing624/text2vec-base-chinese/discussions/29)) | 0.31928 | 0.42672 | 0.70157 | 0.17214 | 0.79296 |
+ | shibing624/text2vec-base-chinese (ov, [#27](https://huggingface.co/shibing624/text2vec-base-chinese/discussions/27)) | 0.31928 | 0.42672 | 0.70157 | 0.17214 | 0.79296 |
+ | shibing624/text2vec-base-chinese (ov-qint8, [#30](https://huggingface.co/shibing624/text2vec-base-chinese/discussions/30)) | 0.30778 (-3.60%) | 0.43474 (+1.88%) | 0.69620 (-0.77%) | 0.16662 (-3.20%) | 0.79396 (+0.13%) |
+
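The percentages in the ov-qint8 row appear to be relative changes against the fp32 baseline. The sketch below reproduces that arithmetic, assuming the usual definition (new - baseline) / baseline. The helper name `relative_change` is hypothetical. Because it works from the rounded scores shown in the table, the last digit can differ slightly from the published figures.

```python
# Scores copied from the table above (already rounded to 5 decimals).
BASELINE = {"ATEC": 0.31928, "BQ": 0.42672, "LCQMC": 0.70157, "PAWSX": 0.17214, "STSB": 0.79296}
OV_QINT8 = {"ATEC": 0.30778, "BQ": 0.43474, "LCQMC": 0.69620, "PAWSX": 0.16662, "STSB": 0.79396}


def relative_change(baseline: float, new: float) -> float:
    """Return the change from baseline to new, as a percentage of the baseline."""
    return (new - baseline) / baseline * 100.0


# Print one signed percentage per benchmark task.
for task, base in BASELINE.items():
    print(f"{task}: {relative_change(base, OV_QINT8[task]):+.2f}%")
```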
+ In short:
+ 1. ✅ shibing624/text2vec-base-chinese (onnx-O4): ONNX optimization at level [O4](https://huggingface.co/docs/optimum/en/onnxruntime/usage_guides/optimization) leaves the scores unchanged and gives a [~2x speedup](https://sbert.net/docs/sentence_transformer/usage/efficiency.html#benchmarks) on GPU.
+ 2. ✅ shibing624/text2vec-base-chinese (ov): OpenVINO leaves the scores unchanged and gives a 1.12x speedup on CPU.
+ 3. 🟡 shibing624/text2vec-base-chinese (ov-qint8): int8 quantization with OpenVINO, using [Chinese STSB](https://huggingface.co/datasets/PhilipMay/stsb_multi_mt) as the quantization dataset, costs a little accuracy on some tasks (up to -3.60% on ATEC) and gains slightly on others (up to +1.88% on BQ). It gives a [4.78x speedup](https://sbert.net/docs/sentence_transformer/usage/efficiency.html#benchmarks) on CPU.
+
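The linked speedup figures point to the Sentence Transformers benchmarks, so results on other hardware may differ. A rough timing harness such as the following could be used to check them locally. This is a sketch under stated assumptions: `time_encode` and `speedup` are hypothetical helper names, and `encode_fn` stands in for any callable such as `model.encode`.

```python
import time
from typing import Callable, Sequence


def time_encode(encode_fn: Callable[[Sequence[str]], object],
                sentences: Sequence[str],
                repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over `repeats` calls."""
    encode_fn(sentences)  # warm-up call, excluded from the timing
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        encode_fn(sentences)
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Speedup factor of the candidate over the baseline (>1 means faster)."""
    return baseline_seconds / candidate_seconds
```

For example, `speedup(time_encode(fp32_model.encode, sentences), time_encode(ov_model.encode, sentences))`, where `fp32_model` and `ov_model` are models you have loaded yourself and `sentences` is a representative batch.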
+ - Usage: shibing624/text2vec-base-chinese (onnx-O4), for GPU
+ ```
+ from sentence_transformers import SentenceTransformer
+
+ model = SentenceTransformer(
+     "shibing624/text2vec-base-chinese",
+     backend="onnx",
+     model_kwargs={"file_name": "model_O4.onnx"},
+ )
+ embeddings = model.encode(["怎么开通银行卡", "如何更换花呗绑定银行卡", "花呗更改绑定银行卡"])
+ print(embeddings.shape)
+
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities)
+ ```
+
+
+ - Usage: shibing624/text2vec-base-chinese (ov), for CPU
+ ```
+ from sentence_transformers import SentenceTransformer
+
+ model = SentenceTransformer(
+     "shibing624/text2vec-base-chinese",
+     backend="openvino",
+ )
+
+ embeddings = model.encode(["怎么开通银行卡", "如何更换花呗绑定银行卡", "花呗更改绑定银行卡"])
+ print(embeddings.shape)
+
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities)
+ ```
+
+ - Usage: shibing624/text2vec-base-chinese (ov-qint8), for CPU (note: the snippet as committed loads an int8 ONNX file through the `onnx` backend)
+ ```
+ from sentence_transformers import SentenceTransformer
+
+ model = SentenceTransformer(
+     "shibing624/text2vec-base-chinese",
+     backend="onnx",
+     model_kwargs={"file_name": "model_qint8_avx512_vnni.onnx"},
+ )
+ embeddings = model.encode(["怎么开通银行卡", "如何更换花呗绑定银行卡", "花呗更改绑定银行卡"])
+ print(embeddings.shape)
+
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities)
+ ```
+
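Unless the model configuration sets another similarity function, `model.similarity` computes pairwise cosine similarity. The following pure-Python sketch illustrates that computation on toy vectors. `cosine_similarity_matrix` is a hypothetical helper name, and the 2-dimensional inputs only stand in for real sentence embeddings.

```python
import math
from typing import List, Sequence


def cosine_similarity_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return the matrix of pairwise cosine similarities between row vectors."""
    # Euclidean norm of each vector, computed once and reused for every pair.
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [sum(a * b for a, b in zip(u, v)) / (nu * nv) for v, nv in zip(vectors, norms)]
        for u, nu in zip(vectors, norms)
    ]


# Toy 2-dimensional vectors standing in for real sentence embeddings.
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in cosine_similarity_matrix(toy):
    print([round(x, 3) for x in row])
```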

  ## Full Model Architecture
  ```