LavaPlanet committed
Commit · 95b5ce4
1 Parent(s): fa26635
Update README.md
README.md
CHANGED
@@ -1,6 +1,16 @@
 ---
 license: llama2
 ---
-
-Another EXL2 version of https://huggingface.co/alpindale/goliath-120b this one being at 2.37BPW.
+Another EXL2 version of AlpinDale's https://huggingface.co/alpindale/goliath-120b this one being at 2.37BPW.
 Pippa llama2 Chat was used as the calibration dataset.
+
+Assuming Windows overhead, the following figures should be more or less close enough for estimation of your own use.
+•
+2.37BPW @ 4096 ctx
+Empty ctx
+GPU split: 16/24
+GPU1: 17.4/24GB
+GPU2: 19.5/24GB
+11~ tk/s
+3000+ ctx
+8~-12 tk/s
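The VRAM figures added in this change can be sanity-checked with a rough back-of-the-envelope estimate. The sketch below is only an illustration, not part of the commit: it assumes the parameter count is the nominal 120 billion suggested by the model name (the true count may differ), treats "GB" as 10^9 bytes, and ignores the KV cache, activations, and Windows overhead. The function name `weight_gb` is hypothetical.

```python
def weight_gb(n_params: float, bpw: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes).

    n_params: number of model parameters
    bpw: average bits per weight of the quantization
    """
    return n_params * bpw / 8 / 1e9


# Assumption: nominal parameter count taken from the "120b" in the model name.
NOMINAL_PARAMS = 120e9
BPW = 2.37  # bits per weight, as stated in the README

estimated = weight_gb(NOMINAL_PARAMS, BPW)

# Reported per-GPU usage from the README (empty context).
reported = 17.4 + 19.5

print(f"estimated weights only: {estimated:.2f} GB")
print(f"reported total usage:   {reported:.2f} GB")
print(f"difference (cache, buffers, overhead): {reported - estimated:.2f} GB")
```

Under these assumptions the weights alone account for most of the reported usage across the two 24 GB cards, which is consistent with the README's suggestion that the numbers are usable as an estimate rather than an exact measurement.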