---
base_model:
- h2oai/h2o-danube3-500m-base
- appvoid/arco
library_name: transformers
tags:
- mergekit
- merge

---

# arco lite

arco lite is a passthrough merge of arco with danube layers to keep generality. Even though the merge decreased performance slightly, it's still competitive with qwen2 on most benchmarks; MMLU is the only reason qwen2 is better on average. Note that arco-lite is still un-trained after the merge; I'm expecting it to improve after some training iterations.

#### benchmarks

Zero-shot evaluations. As you can see, it reasons better than qwen2 but has less world knowledge, so don't use it for tasks that need factual output.

| Parameters | Model     | MMLU      | ARC       | HellaSwag | PIQA      | Winogrande | Average   |
|------------|-----------|-----------|-----------|-----------|-----------|------------|-----------|
| 488m       | arco-lite | 23.22     | **33.45** | **56.55** | **69.70** | **59.19**  | 48.42     |
| 494m       | qwen2     | **44.13** | 28.92     | 49.05     | 69.31     | 56.99      | **49.68** |
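
As a sanity check, the Average column can be recomputed as the plain mean of the five benchmark scores (values hard-coded from the table above):

```python
# Recompute the Average column from the per-benchmark scores in the table.
scores = {
    "arco-lite": [23.22, 33.45, 56.55, 69.70, 59.19],
    "qwen2":     [44.13, 28.92, 49.05, 69.31, 56.99],
}

averages = {name: round(sum(vals) / len(vals), 2) for name, vals in scores.items()}
print(averages)  # {'arco-lite': 48.42, 'qwen2': 49.68}
```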


#### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
    - model: appvoid/arco
      layer_range: [0, 14]
  - sources:
    - model: h2oai/h2o-danube3-500m-base
      layer_range: [15, 16]

merge_method: passthrough
dtype: float16

```
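
If you want to reproduce the merge, a minimal sketch using mergekit's `mergekit-yaml` CLI looks like this (the config filename and output directory are placeholders):

```shell
# install mergekit (assumes the current PyPI package name)
pip install mergekit

# save the YAML above as arco-lite.yaml, then run the merge;
# the merged model is written to ./arco-lite
mergekit-yaml arco-lite.yaml ./arco-lite --copy-tokenizer
```

Note that `passthrough` simply stacks the listed layer slices in order without averaging any weights, which is why the result keeps arco's behavior on most layers.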