Commit 2e299be by mikesong724 (1 parent: eb6a982)
Create README.md
README.md
ADDED
@@ -0,0 +1,29 @@
DeBERTa trained from scratch

Continued training from https://huggingface.co/mikesong724/deberta-wiki-2006

Source data: https://dumps.wikimedia.org/archive/2010/

Tools used: https://github.com/mikesong724/Point-in-Time-Language-Model

2010 wiki archive: 6.1 GB, trained for 18 epochs = 108 GB, plus 65 GB from the 2006 model

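The data-volume figure above is the archive size multiplied by the epoch count. Below is a minimal Python sketch of that arithmetic, using only the numbers stated in this README. The helper name `data_seen_gb` is hypothetical, and reading "108GB" as a rounded value is an assumption: 6.1 GB × 18 gives 109.8 GB, while 6 GB × 18 gives exactly 108 GB.

```python
def data_seen_gb(archive_gb: float, epochs: int) -> float:
    """Total data volume processed: archive size times number of epochs."""
    return archive_gb * epochs


# Figures taken from the README.
wiki_2010_exact = data_seen_gb(6.1, 18)  # product of the stated size and epoch count
wiki_2010_stated = 108                   # value as written in the README (GB)
wiki_2006_stated = 65                    # carried over from the 2006 checkpoint (GB)

# Cumulative volume using the README's own figures.
cumulative_stated = wiki_2010_stated + wiki_2006_stated

print(f"2010 run (computed from 6.1 GB x 18): {wiki_2010_exact:.1f} GB")
print(f"Cumulative (stated figures): {cumulative_stated} GB")
```

The gap between the computed and stated values is small enough that it most likely reflects rounding of the archive size, but the README does not say so explicitly.
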
GLUE benchmark

cola (3e): matthews corr: 0.3640

sst2 (3e): acc: 0.9106

mrpc (5e): F1: 0.8505, acc: 0.7794

stsb (3e): pearson: 0.8339, spearman: 0.8312

qqp (3e): acc: 0.8965, F1: 0.8604

mnli (3e): acc_mm: 0.8023

qnli (3e): acc: 0.8889

rte (3e): acc: 0.5271

wnli (5e): acc: 0.3380
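
The classification scores above use three standard metrics: accuracy, F1, and the Matthews correlation coefficient. As a reminder of how they are defined for binary labels, here is a minimal pure-Python sketch. It is illustrative only and is not the evaluation code used to produce the numbers in this README; the function names and the toy labels are hypothetical.

```python
import math


def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) counts for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, _, fp, fn = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corr(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when it is undefined."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels, made up purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

print("acc:", accuracy(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("matthews corr:", matthews_corr(y_true, y_pred))
```

Matthews correlation ranges from -1 to 1, with 0 corresponding to chance-level prediction, so it is not directly comparable to the accuracy and F1 values, which range from 0 to 1.
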