---
base_model:
- h2oai/h2o-danube3-500m-base
- appvoid/arco
library_name: transformers
tags:
- mergekit
- merge
---
# arco lite
arco lite is a passthrough merge of arco with a danube output layer, built to keep generality. Even though its performance decreased relative to arco, it's still competitive with qwen2 on most benchmarks; MMLU is the only benchmark where qwen2 wins, and it's the reason qwen2 comes out ahead on average. Note that arco-lite is still un-trained; I expect it to improve after some training iterations.
#### benchmarks
Zero-shot evaluations. As you can see, it reasons better than qwen2 but lacks world knowledge, so don't use it for tasks that need factual output.
| Parameters | Model     | MMLU      | ARC       | HellaSwag | PIQA      | Winogrande | Average   |
|------------|-----------|-----------|-----------|-----------|-----------|------------|-----------|
| 488m       | arco-lite | 23.22     | **33.45** | **56.55** | **69.70** | **59.19**  | 48.46     |
| 494m       | qwen2     | **44.13** | 28.92     | 49.05     | 69.31     | 56.99      | **49.68** |
#### Configuration
The following YAML configuration was used to produce this model:
```yaml
slices:
- sources:
- model: appvoid/arco
layer_range: [0, 14]
- sources:
- model: h2oai/h2o-danube3-500m-base
layer_range: [15, 16]
merge_method: passthrough
dtype: float16
```
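A passthrough merge simply concatenates the selected layer ranges from each source model into a new stack. As a minimal sketch (assuming mergekit's half-open `layer_range` convention, where the end index is exclusive), the config above can be walked through like this:

```python
# Sketch of how the passthrough config above assembles the merged layer stack.
# Assumption: layer_range is half-open ([start, end) — end index exclusive).

slices = [
    {"model": "appvoid/arco", "layer_range": (0, 14)},
    {"model": "h2oai/h2o-danube3-500m-base", "layer_range": (15, 16)},
]

# Passthrough: no weight averaging, just stack the chosen layers in order.
merged_layers = []
for s in slices:
    start, end = s["layer_range"]
    for i in range(start, end):
        merged_layers.append((s["model"], i))

print(len(merged_layers))  # number of transformer layers in the merged model
```

Under that assumption, the merge takes the first 14 layers from arco and a single danube layer on top, which is where the reduced parameter count comes from.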