Raincleared committed · Commit 867f2d4 (verified) · 1 parent: a35e6e7

Update README.md

Files changed (1)
  1. README.md (+6, -5)
README.md CHANGED
```diff
@@ -10,12 +10,13 @@ license: apache-2.0
 ---
 
 
-# ProSparse-MiniCPM-1B-sft
+# MiniCPM-S-1B-sft
 
 - Original model: [MiniCPM-1B-sft-bf16](https://huggingface.co/openbmb/MiniCPM-1B-sft-bf16)
 - Model creator and fine-tuned by: [ModelBest](https://modelbest.cn/), [OpenBMB](https://huggingface.co/openbmb), and [THUNLP](https://nlp.csai.tsinghua.edu.cn/)
 - Paper: [link](https://arxiv.org/pdf/2402.13516.pdf)
-- Adapted LLaMA version: [ProSparse-MiniCPM-1B-sft-llama-format](https://huggingface.co/openbmb/ProSparse-MiniCPM-1B-sft-llama-format/)
+- Adapted LLaMA version: [MiniCPM-S-1B-sft-llama-format](https://huggingface.co/openbmb/MiniCPM-S-1B-sft-llama-format/)
+- Note: `MiniCPM-S-1B` is denoted as `ProSparse-1B` in the paper.
 
 ### Introduction
 
@@ -76,10 +77,10 @@ The evaluation results on the above benchmarks demonstrate the advantage of ProSparse
 | **ProSparse-13B**\* | 87.97 | **45.07** | 29.03 | 69.75 | 67.54 | 25.40 | 54.78 | 40.20 | 28.76 |
 | **ProSparse-13B** | **88.80** | 44.90 | 28.42 | 69.76 | 66.91 | 26.31 | 54.35 | 39.90 | 28.67 |
 | MiniCPM-1B | - | 44.44 | 36.85 | 63.67 | 60.90 | 35.48 | 50.44 | 35.03 | 28.71 |
-| **ProSparse-1B**\* | 86.25 | **44.72** | 41.38 | 64.55 | 60.69 | 34.72 | 49.36 | 34.04 | 28.27 |
-| **ProSparse-1B** | **87.89** | **44.72** | 42.04 | 64.37 | 60.73 | 34.57 | 49.51 | 34.08 | 27.77 |
+| **MiniCPM-S-1B**\* | 86.25 | **44.72** | 41.38 | 64.55 | 60.69 | 34.72 | 49.36 | 34.04 | 28.27 |
+| **MiniCPM-S-1B** | **87.89** | **44.72** | 42.04 | 64.37 | 60.73 | 34.57 | 49.51 | 34.08 | 27.77 |
 
-**Notes**: "Original" refers to the original Swish-activated LLaMA2 versions. ReluLLaMA-7B and ReluLLaMA-13B are available at [7B](https://huggingface.co/SparseLLM/ReluLLaMA-7B) and [13B](https://huggingface.co/SparseLLM/ReluLLaMA-13B) respectively. MiniCPM-1B is available at [1B](https://huggingface.co/openbmb/MiniCPM-1B-sft-bf16). "ProSparse-7B\*", "ProSparse-13B\*", and "ProSparse-1B\*" denote the ProSparse versions without activation threshold shifting.
+**Notes**: "Original" refers to the original Swish-activated LLaMA2 versions. ReluLLaMA-7B and ReluLLaMA-13B are available at [7B](https://huggingface.co/SparseLLM/ReluLLaMA-7B) and [13B](https://huggingface.co/SparseLLM/ReluLLaMA-13B) respectively. MiniCPM-1B is available at [1B](https://huggingface.co/openbmb/MiniCPM-1B-sft-bf16). "ProSparse-7B\*", "ProSparse-13B\*", and "MiniCPM-S-1B\*" denote the ProSparse versions without activation threshold shifting.
 
 ### Evaluation Issues with LM-Eval
 
```
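
For readers arriving at this commit from the model page, a minimal sketch of loading the adapted LLaMA-format checkpoint linked in the diff with Hugging Face `transformers`. The repo ID is taken from the link above; the dtype and the `trust_remote_code` flag are assumptions for illustration, not requirements documented in this README.

```python
# Minimal loading sketch. Assumptions (not stated in the diff): transformers and
# torch are installed, bfloat16 is available on the target device, and the repo
# resolves through the standard Auto* classes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openbmb/MiniCPM-S-1B-sft-llama-format"  # repo ID from the link in the diff

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; adjust for your hardware
    trust_remote_code=True,      # defensive; the LLaMA-format export may not need it
)

prompt = "What does activation sparsity mean for inference speed?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```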
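
As background for the asterisked rows in the table: per the linked ProSparse paper, "activation threshold shifting" replaces the plain ReLU with a thresholded (FATReLU-style) activation to squeeze out extra sparsity, and the "\*" variants omit that step. Below is an illustrative sketch only; the threshold value is made up and is not the one used in the released checkpoints.

```python
import torch

def thresholded_relu(x: torch.Tensor, threshold: float) -> torch.Tensor:
    # FATReLU-style activation: inputs below the (positive) threshold are zeroed,
    # inputs at or above it pass through unchanged. Raising the threshold trades
    # a little accuracy for higher activation sparsity.
    return torch.where(x >= threshold, x, torch.zeros_like(x))

x = torch.randn(8)
print(thresholded_relu(x, threshold=0.0))  # threshold 0 reduces to plain ReLU
print(thresholded_relu(x, threshold=0.1))  # hypothetical shifted threshold: sparser output
```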