---
license: mit
language:
- zh
pipeline_tag: sentence-similarity
---

# PromCSE(sup)
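
A minimal usage sketch (not part of the original card): it loads the checkpoint with `transformers` and scores one Chinese sentence pair with cosine similarity. The model id is taken from the Model List below; the [CLS] pooling and the example sentences are assumptions and may differ from the checkpoint's intended setup.

```python
# Hedged usage sketch: the model id comes from the Model List table below;
# [CLS] pooling is an assumption, not confirmed by the original README.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "hellonlp/promcse-bert-large-zh"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

sentences = ["今天天气真好", "今天天气怎么样"]  # example Chinese sentence pair
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    # Take the [CLS] token's hidden state as the sentence embedding.
    embeddings = model(**inputs).last_hidden_state[:, 0]

score = torch.nn.functional.cosine_similarity(embeddings[0:1], embeddings[1:2]).item()
print(f"cosine similarity: {score:.4f}")
```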

## Data List
The following datasets are all in Chinese.

| Data | size(train) | size(valid) | size(test) |
|:----------------------:|:----------:|:----------:|:----------:|
| [STS-B](https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/10yfKfTtcmLQ70-jzHIln1A%3Fpwd%3Dgf8y) | 5231 | 1458 | 1361 |
| [ATEC](https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/1gmnyz9emqOXwaHhSM9CCUA%3Fpwd%3Db17c) | 62477 | 20000 | 20000 |
| [BQ](https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/1M-e01yyy5NacVPrph9fbaQ%3Fpwd%3Dtis9) | 100000 | 10000 | 10000 |
| [LCQMC](https://pan.baidu.com/s/16DfE7fHrCkk4e8a2j3SYUg?pwd=bc8w) | 238766 | 8802 | 12500 |
| [PAWSX](https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/1ox0tJY3ZNbevHDeAqDBOPQ%3Fpwd%3Dmgjn) | 49401 | 2000 | 2000 |
| [SNLI](https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/1NOgA7JwWghiauwGAUvcm7w%3Fpwd%3Ds75v) | 146828 | 2699 | 2618 |
| [MNLI](https://link.zhihu.com/?target=https%3A//pan.baidu.com/s/1xjZKtWk3MAbJ6HX4pvXJ-A%3Fpwd%3D2kte) | 122547 | 2932 | 2397 |

## Model List
The evaluation datasets are in Chinese, and we used the same language model, **RoBERTa Large**, for every method. In addition, because the test sets of some datasets are small and could yield noisy accuracy estimates, we evaluate on the train, valid, and test splits together and report the final result as a **weighted average (w-avg)**; a sketch of this computation follows the table below.

| Model | STS-B(w-avg) | ATEC | BQ | LCQMC | PAWSX | Avg. |
|:-----------------------:|:------------:|:----:|:----:|:-----:|:-----:|:----:|
| [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh) | 78.61 | - | - | - | - | - |
| [BAAI/bge-large-zh-v1.5](https://huggingface.co/BAAI/bge-large-zh-v1.5) | 79.07 | - | - | - | - | - |
| [hellonlp/simcse-large-zh](https://huggingface.co/hellonlp/simcse-roberta-large-zh) | 81.32 | - | - | - | - | - |
| [hellonlp/promcse-large-zh](https://huggingface.co/hellonlp/promcse-bert-large-zh) | xx | - | - | - | - | - |
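
To make the **w-avg** protocol concrete, here is a minimal sketch, assuming the per-split score is a Spearman correlation and that splits are weighted by their number of examples; the function names and the placeholder scores are hypothetical, not from the original evaluation code.

```python
# Hedged sketch of the size-weighted average (w-avg) over splits.
from scipy.stats import spearmanr

def split_score(gold, pred):
    # Spearman correlation for one split, in percentage points.
    return spearmanr(gold, pred).correlation * 100

def weighted_avg(scores_and_sizes):
    # Size-weighted mean over the train/valid/test splits.
    total = sum(n for _, n in scores_and_sizes)
    return sum(s * n for s, n in scores_and_sizes) / total

# STS-B split sizes (5231/1458/1361) come from the Data List table;
# the per-split scores below are placeholders, not reported results.
print(weighted_avg([(81.0, 5231), (79.5, 1458), (78.8, 1361)]))
```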