---
license: apache-2.0
---

<style>
  /* @import must come before any other rule, otherwise browsers ignore it */
  @import url('https://fonts.googleapis.com/css2?family=Vollkorn:ital,wght@0,400..900;1,400..900&display=swap');

  img {
    user-select: none;
    transition: all 0.2s ease;
    border-radius: .5rem;
  }

  img:hover {
    transform: rotate(2deg);
    filter: invert(100%);
  }
</style>

<div style="background-color: transparent; border-radius: .5rem; padding: 2rem; font-family: monospace; font-size: .85rem; text-align: justify;">

![cubby](https://huggingface.co/appvoid/cubby/resolve/main/cubby.webp)

This is the latest iteration in an effort to make arco as good on ARC as it can be. So far it has improved a little.

#### prompt

there is intentionally no prompt set.

#### benchmarks

zero-shot results from state-of-the-art small language models

| Parameters | Model    | MMLU      | ARC-C     | HellaSwag | PIQA      | Winogrande | Average   |
|------------|----------|-----------|-----------|-----------|-----------|------------|-----------|
| 0.5b       | danube 3 | 24.81     | 36.18     | 60.46     | 73.78     | 61.01      | 51.25     |
| 0.5b       | arco     | **26.17** | 37.29     | 62.88     | 74.37     | **62.27**  | 52.60     |
| 0.5b       | arco 2   | 25.51     | 38.82     | 63.02     | **74.70** | 61.25      | **52.66** |
| 0.5b       | arco 2º  | 25.47     | **38.99** | **63.03** | **74.70** | 61.01      | 52.64     |
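
The Average column appears to be the unweighted mean of the five benchmark scores in each row. The sketch below recomputes it from the numbers in the table as a quick sanity check. The names `SCORES`, `AVERAGES` and `average` are illustrative only, and the unweighted mean is an assumption that happens to match the published values.

```python
# Scores copied from the benchmark table above; column order is
# MMLU, ARC-C, HellaSwag, PIQA, Winogrande.
SCORES = {
    "danube 3": [24.81, 36.18, 60.46, 73.78, 61.01],
    "arco": [26.17, 37.29, 62.88, 74.37, 62.27],
    "arco 2": [25.51, 38.82, 63.02, 74.70, 61.25],
    "arco 2º": [25.47, 38.99, 63.03, 74.70, 61.01],
}


def average(scores):
    """Unweighted mean of the per-benchmark scores, rounded to two decimals.

    Assumption: the table's Average column is computed this way.
    """
    return round(sum(scores) / len(scores), 2)


# Recomputed averages, keyed by model name.
AVERAGES = {model: average(scores) for model, scores in SCORES.items()}

if __name__ == "__main__":
    for model, avg in AVERAGES.items():
        print(f"{model:<10}{avg:.2f}")
```

Comparing `AVERAGES` against the Average column is a cheap way to catch a mistyped score when a new row is added.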

#### supporters

<a href="https://ko-fi.com/appvoid" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 34px !important; margin-top: -4px;width: 128px !important; filter: contrast(2) grayscale(100%) brightness(100%);" ></a>

#### trivia

arco keeps improving on the same 3 benchmarks (ARC-C, HellaSwag and PIQA), though it seems to have reached its limit.

</div>