appvoid committed on
Commit 580f1f3 · verified · 1 Parent(s): f88837f

Update README.md

Files changed (1)
  1. README.md +37 -33
README.md CHANGED
@@ -1,43 +1,47 @@
  ---
- base_model:
- - appvoid/arco-2
- - appvoid/arco-exp-12
- - appvoid/arco-2-reasoning-20k
- - appvoid/text-arco
- library_name: transformers
- tags:
- - mergekit
- - merge
-
  ---
- # merge

- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

- ## Merge Details
- ### Merge Method

- This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [appvoid/arco-exp-12](https://huggingface.co/appvoid/arco-exp-12) as a base.

- ### Models Merged

- The following models were included in the merge:
- * [appvoid/arco-2](https://huggingface.co/appvoid/arco-2)
- * [appvoid/arco-2-reasoning-20k](https://huggingface.co/appvoid/arco-2-reasoning-20k)
- * [appvoid/text-arco](https://huggingface.co/appvoid/text-arco)

- ### Configuration

- The following YAML configuration was used to produce this model:

- ```yaml
- models:
- - model: appvoid/arco-2-reasoning-20k
- - model: appvoid/arco-2
- - model: appvoid/text-arco
- merge_method: model_stock
- base_model: appvoid/arco-exp-12
- normalize: false
- int8_mask: true
- dtype: float16
- ```

  ---
+ license: apache-2.0
  ---


+ <style>
+ @import url('https://fonts.googleapis.com/css2?family=Vollkorn:ital,wght@0,400..900;1,400..900&display=swap');
+ img{
+ user-select: none;
+ transition: all 0.2s ease;
+ border-radius: .5rem;
+ }
+ img:hover{
+ transform: rotate(2deg);
+ filter: invert(100%);
+ }
+ </style>
+
+ <div style="background-color: transparent; border-radius: .5rem; padding: 2rem; font-family: monospace; font-size: .85rem; text-align: justify;">
+
+ ![cubby](https://huggingface.co/appvoid/cubby/resolve/main/cubby.webp)
+
+ This is the latest iteration in an effort to make arco as good on arc as it can be. So far it has improved a little.
+
+ #### prompt
+
+ intentionally, no prompt is set.
+

+ #### benchmarks

+ zero-shot results from state-of-the-art small language models

+ | Parameters | Model    | MMLU      | ARC-C     | HellaSwag | PIQA      | Winogrande | Average   |
+ |------------|----------|-----------|-----------|-----------|-----------|------------|-----------|
+ | 0.5b       | danube 3 | 24.81     | 36.18     | 60.46     | 73.78     | 61.01      | 51.25     |
+ | 0.5b       | arco     | **26.17** | 37.29     | 62.88     | 74.37     | **62.27**  | 52.60     |
+ | 0.5b       | arco 2   | 25.51     | 38.82     | 63.02     | **74.70** | 61.25      | **52.66** |
+ | 0.5b       | arco 2º  | 25.47     | **38.99** | **63.03** | **74.70** | 61.01      | 52.64     |
+ #### supporters

+ <a href="https://ko-fi.com/appvoid" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 34px !important; margin-top: -4px;width: 128px !important; filter: contrast(2) grayscale(100%) brightness(100%);" ></a>

+ ### trivia

+ arco seems to keep improving on the same 3 benchmarks, although hellaswag appears to have reached its limit.
+ </div>
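
The Average column in the added benchmarks table looks like the unweighted mean of the five scores (MMLU, ARC-C, HellaSwag, PIQA, Winogrande). That is an assumption, since the README does not say how it was computed. A minimal sketch to check it, with the numbers copied from the table and the names `scores`, `reported`, and `mean` chosen here for illustration only:

```python
# Scores copied from the benchmarks table, in column order:
# MMLU, ARC-C, HellaSwag, PIQA, Winogrande.
scores = {
    "danube 3": [24.81, 36.18, 60.46, 73.78, 61.01],
    "arco": [26.17, 37.29, 62.88, 74.37, 62.27],
    "arco 2": [25.51, 38.82, 63.02, 74.70, 61.25],
    "arco 2º": [25.47, 38.99, 63.03, 74.70, 61.01],
}

# Average column as reported in the table.
reported = {
    "danube 3": 51.25,
    "arco": 52.60,
    "arco 2": 52.66,
    "arco 2º": 52.64,
}


def mean(values):
    """Unweighted arithmetic mean (assumed to be how Average was derived)."""
    return sum(values) / len(values)


for model, values in scores.items():
    print(f"{model}: computed {mean(values):.2f}, reported {reported[model]:.2f}")
```

Under that assumption the computed means agree with the reported averages to two decimal places, so the ranking in the Average column follows directly from the five listed benchmarks.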