Update README.md
README.md CHANGED
@@ -30,6 +30,11 @@ library_name: transformers
 tags:
 - code
 - art
+---
+#### ❗ This model gives up when the input reaches a critical mass of about tree fiddy thousand tokens
+
+I have dun goofed and not tested the [base model](https://huggingface.co/h2oai/h2o-danube-1.8b-chat) enough (and possibly goofed in other ways too), but I'm already training the new one based on [h2oai/h2o-danube2-1.8b-chat](https://huggingface.co/h2oai/h2o-danube2-1.8b-chat). Perhaps S² attn or RoPE scaling will work and make a hella big context window possible? We'll see.
+
 ---
 
 This is [NinjaMouse](https://huggingface.co/trollek/NinjaMouse-2.4B-32L-danube) extended even further. Instead of Cosmopedia I used different coding datasets.
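For context on the two techniques the commit mentions: S² attn is the shifted sparse attention scheme from LongLoRA, a training-time change that would have to go into the fine-tuning setup, while RoPE scaling stretches the rotary position embeddings so the model can attend beyond its trained context length. As a minimal, untested sketch (not part of this commit): recent transformers releases expose linear RoPE position interpolation through the model config. The 2.0 factor below is an illustrative assumption, and whether the danube2 (Mistral-style) config honours `rope_scaling` depends on the installed transformers version.

```python
# Minimal sketch, not from the commit: trying linear RoPE scaling
# (position interpolation) on the new base model. The 2.0 factor is an
# illustrative assumption; support for `rope_scaling` on Mistral-style
# configs depends on the installed transformers version.
from transformers import AutoConfig, AutoModelForCausalLM

base = "h2oai/h2o-danube2-1.8b-chat"

config = AutoConfig.from_pretrained(base)
# Stretch the trained context window by 2x via position interpolation.
config.rope_scaling = {"type": "linear", "factor": 2.0}

model = AutoModelForCausalLM.from_pretrained(base, config=config)
```

Note that interpolation alone usually degrades quality somewhat unless the model is also fine-tuned at the longer length, which is presumably why the commit frames both options as open questions.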