You can also visit the [homepage](https://code-reward-model.github.io/).

The model is trained based on [Llama3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct).

# Prompt Format

```
Below is a question and its corresponding code answer. Please write test cases to check the correctness of the code answer. You need to use the unittest library in Python and create a test class for testing.

### question
{question}

### code solution
{code in function format}

Please add detailed comments to the test cases you write. You do not need to test the function's ability to throw exceptions.
```
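For illustration, a minimal sketch of filling this template in Python. The variable names, the example task, and the `{code}` placeholder spelling are assumptions made for this sketch, not part of the repository:

```python
# Sketch: fill the prompt template above for one (question, solution) pair.
# The template text mirrors the README; the example task is illustrative.
PROMPT_TEMPLATE = """Below is a question and its corresponding code answer. Please write test cases to check the correctness of the code answer. You need to use the unittest library in Python and create a test class for testing.

### question
{question}

### code solution
{code}

Please add detailed comments to the test cases you write. You do not need to test the function's ability to throw exceptions."""

question = "Write a function add(a, b) that returns the sum of two integers."
solution = "def add(a, b):\n    return a + b"

# str.format substitutes both placeholders in one call.
prompt = PROMPT_TEMPLATE.format(question=question, code=solution)
print(prompt)
```

Here `{code}` stands in for the README's `{code in function format}` placeholder, renamed so that `str.format` can substitute it directly.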
# Performance
## Best-of-N

The results below use Llama3.1-8B as the policy model. The top two performances are marked in **bold** and <u>underline</u>.

| Llama3.1-70B | <u>78.30</u> | <u>78.76</u> | <u>17.19</u> | <u>25.97</u> |
| *CodeRM-8B (Ours)* | **80.46** | **81.27** | **16.48** | **22.71** |

# Citation
If you find our model helpful, please cite the original paper: