louisbrulenaudet committed · Commit 79e733d · verified · 1 Parent(s): 2ef64c9

Update README.md

  1. README.md +13 -14
README.md CHANGED
@@ -3,8 +3,6 @@ tags:
  - merge
  - mergekit
  - lazymergekit
- - mlabonne/OmniBeagle-7B
- - WizardLM/WizardMath-7B-V1.1
  - Maths
  base_model:
  - mlabonne/OmniBeagle-7B
@@ -25,6 +23,19 @@ Pearl-7B-slerp is a merge of the following models using [LazyMergekit](https://c
  * [mlabonne/OmniBeagle-7B](https://huggingface.co/mlabonne/OmniBeagle-7B)
  * [WizardLM/WizardMath-7B-V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1)
 
+ ### Evaluation
+
+ The evaluation was performed using the HuggingFace Open LLM Leaderboard.
+
+ | Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | #Params (B) |
+ |-------------------------------------------|------------|-------|-----------|-------|------------|------------|-------|--------------|
+ | **louisbrulenaudet/Pearl-7B-slerp** |**72.75** | 68.00 | 87.16 | 64.04 | 62.35 | 81.29 |**73.62**| 7.24 |
+ | mistralai/Mixtral-8x7B-Instruct-v0.1 | 72.62 | 70.22 | 87.63 | 71.16 | 64.58 | 81.37 | 60.73 | 46.7 |
+ | microsoft/phi-2 | 61.33 | 61.09 | 75.11 | 58.11 | 44.47 | 74.35 | 54.81 | 2.78 |
+ | microsoft/Orca-2-13b | 58.64 | 60.67 | 79.81 | 60.37 | 56.41 | 76.64 | 17.97 | 13 |
+ | mistralai/Mistral-7B-Instruct-v0.1 | 54.96 | 54.52 | 75.63 | 55.38 | 56.28 | 73.72 | 14.25 | 7.24 |
+ | meta-llama/Llama-2-7b-hf | 50.97 | 53.07 | 78.59 | 46.87 | 38.76 | 74.03 | 14.48 | 6.74 |
+
  Spherical Linear Interpolation (SLERP) serves as a technique for seamlessly interpolating between two vectors while maintaining a constant rate of change and upholding the geometric properties of the spherical space in which these vectors exist.
 
  Opting for SLERP over traditional linear interpolation is motivated by various considerations. Linear interpolation in high-dimensional spaces may result in a reduction in the magnitude of the interpolated vector, diminishing the scale of weights. Additionally, in many cases, the alteration in the weights' direction conveys more meaningful information, such as feature learning and representation, compared to the magnitude of change.
@@ -37,18 +48,6 @@ The implementation of SLERP involves the following steps:
 
  In essence, SLERP provides a robust mechanism for interpolating vectors, offering advantages in preserving directional information and mitigating issues associated with linear interpolation in high-dimensional spaces.
 
- ## Evaluation
-
- The evaluation was performed using the HuggingFace Open LLM Leaderboard.
-
- | Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | #Params (B) |
- |-------------------------------------------|------------|-------|-----------|-------|------------|------------|-------|--------------|
- | **louisbrulenaudet/Pearl-7B-slerp** |**72.75** | 68.00 | 87.16 | 64.04 | 62.35 | 81.29 |**73.62**| 7.24 |
- | mistralai/Mixtral-8x7B-Instruct-v0.1 | 72.62 | 70.22 | 87.63 | 71.16 | 64.58 | 81.37 | 60.73 | 46.7 |
- | microsoft/phi-2 | 61.33 | 61.09 | 75.11 | 58.11 | 44.47 | 74.35 | 54.81 | 2.78 |
- | microsoft/Orca-2-13b | 58.64 | 60.67 | 79.81 | 60.37 | 56.41 | 76.64 | 17.97 | 13 |
- | mistralai/Mistral-7B-Instruct-v0.1 | 54.96 | 54.52 | 75.63 | 55.38 | 56.28 | 73.72 | 14.25 | 7.24 |
- | meta-llama/Llama-2-7b-hf | 50.97 | 53.07 | 78.59 | 46.87 | 38.76 | 74.03 | 14.48 | 6.74 |
 
  ## Configuration
 
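
The README text touched by this commit describes SLERP only in prose, and its step list lies outside the lines shown in the diff. As a rough illustration of the idea, the following pure-Python sketch compares linear interpolation with SLERP on two unit vectors. The names `norm`, `lerp`, `slerp`, and the `eps` threshold are hypothetical choices for this example; this is not the code mergekit or LazyMergekit runs, and a real merge applies the same idea to full weight tensors.

```python
import math


def norm(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def lerp(t, v0, v1):
    """Plain linear interpolation between v0 and v1."""
    return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between v0 and v1 (illustrative sketch)."""
    n0, n1 = norm(v0), norm(v1)
    # A zero vector has no direction, so fall back to linear interpolation.
    if n0 == 0.0 or n1 == 0.0:
        return lerp(t, v0, v1)
    # Cosine of the angle between the two directions, clamped for acos.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    # Nearly colinear inputs: sin(theta) is close to 0, so use lerp instead.
    if abs(cos_theta) > 1.0 - eps:
        return lerp(t, v0, v1)
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    # Weights that move along the arc at a constant angular rate.
    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


v0, v1 = [1.0, 0.0], [0.0, 1.0]
print(norm(lerp(0.5, v0, v1)))   # about 0.707: the midpoint has shrunk
print(norm(slerp(0.5, v0, v1)))  # about 1.0: the midpoint stays on the unit circle
```

For these two orthogonal unit vectors, the linear midpoint loses roughly 29% of its length while the SLERP midpoint keeps unit length, which is the magnitude-reduction effect the README paragraph refers to.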