nomadicsynth committed
Update README.md

README.md CHANGED
@@ -24,7 +24,7 @@ Just doing it to see what happens.
 
 It'll take about 40 to 45 hours to train on two Nvidia RTX 3060 12GB.
 
-It uses ChatML for the chat template, but I
+It uses ChatML for the chat template, but I messed up the template in the dataset,
 using '<|im_start|>human' instead of '<|im_start|>user'. ¯\_(ツ)_/¯
 So, here's the bits:
 
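The role quirk described in this hunk ('<|im_start|>human' where standard ChatML uses '<|im_start|>user') can be sketched as a small prompt builder. The special tokens come from the README; the function name and message schema are illustrative, not the author's code:

```python
# Build a prompt in this model's modified ChatML, which uses
# '<|im_start|>human' where standard ChatML uses '<|im_start|>user'.
def build_chatml_prompt(messages):
    parts = []
    for msg in messages:
        # Rename the standard 'user' role to this model's 'human' role.
        role = "human" if msg["role"] == "user" else msg["role"]
        parts.append(f"<|im_start|>{role}\n{msg['content']}<|im_end|>")
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
```

A prompt built any other way (e.g. with the stock ChatML 'user' tag) would be out of distribution for this model.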
@@ -56,57 +56,30 @@ So, here's the bits:
 - **Shared by:** RoboApocalypse
 - **Model type:** Mistral
 - **Language(s) (NLP):** English, maybe others I dunno
-- **License:** OpenRAIL
+- **License:** OpenRAIL
 
 ### Model Sources
 
 Exclusively available right here on HuggingFace!
 
 - **Repository:** https://huggingface.co/neoncortex/mini-mistral-openhermes-2.5-chatml-test
-- **Paper:** LoL
-- **Demo:** Just download it in Oobabooga and use the modified chatML template above. Maybe I'll throw together a Space or something.
 
 ## Uses
 
-
+None
 
 ### Out-of-Scope Use
 
 This model won't work well for pretty much everything, probably.
 
-## How to Get Started with the Model
-
-Use the code below to get started with the model.
-
-[More Information Needed]
-
-## Training Details
-
-### Training Data
-
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
-[More Information Needed]
-
-### Training Procedure
-
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
 #### Preprocessing
 
-
+Format the OpenHermes 2.5 dataset with ChatML.
 
 #### Training Hyperparameters
 
 - **Training regime:** bf16 mixed precision
 
-#### Speeds, Sizes, Times
-
-epochs: 9
-steps: 140976
-batches per device: 6
-1.04it/s
-
 ## Evaluation
 
 I tried to run evals but the eval suite just laughed at me.
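The one-line preprocessing note added in this hunk ("Format the OpenHermes 2.5 dataset with ChatML") is terse; a rough sketch of what that formatting could look like is below. The `conversations`/`from`/`value` field names follow the ShareGPT-style layout OpenHermes 2.5 uses, and keeping the `human` role unmapped reproduces the template quirk the card describes, but this is a guess at the pipeline, not the author's script:

```python
# Convert one OpenHermes-style record (ShareGPT 'conversations' layout)
# into a single ChatML training string. Per the model card, the 'human'
# role is kept as-is instead of being renamed to the standard 'user'.
ROLE_MAP = {"system": "system", "human": "human", "gpt": "assistant"}

def to_chatml(record):
    lines = []
    for turn in record["conversations"]:
        role = ROLE_MAP[turn["from"]]
        lines.append(f"<|im_start|>{role}\n{turn['value']}<|im_end|>")
    return "\n".join(lines) + "\n"

sample = {"conversations": [
    {"from": "human", "value": "What is 2 + 2?"},
    {"from": "gpt", "value": "4"},
]}
text = to_chatml(sample)
```

Mapping each record through a function like this (e.g. with `datasets.map`) would yield the plain-text ChatML corpus the Preprocessing section implies.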
@@ -117,11 +90,9 @@ Don't be rude.
 
 ## Environmental Impact
 
-- **Hardware Type:**
-- **Hours used:** ~45 x 2
-- **
-- **Compute Region:** myob
-- **Carbon Emitted:** Yes, definitely
+- **Hardware Type:** 2 x NVIDIA RTX 3060 12GB
+- **Hours used:** ~45 x 2.
+- **Carbon Emitted:** [TBA]
 
 ### Compute Infrastructure
 
@@ -134,11 +105,3 @@ I trained it on my PC with no side on it because I like to watch the GPUs do the
 #### Software
 
 The wonderful free stuff at HuggingFace [https://huggingface.co](https://huggingface.co): transformers, datasets, trl
-
-## Model Card Authors
-
-RoboApocalypse, unless you're offended by something, in which case it was hacked by hackers.
-
-## Model Card Contact
-
-If you want to send me insults come find me on Reddit I guess.
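The card names its stack (transformers, datasets, trl) and its training regime (bf16 mixed precision), and the old "Speeds, Sizes, Times" section gave epochs and per-device batch size. A sketch of how those pieces could fit together follows; the model/tokenizer path is a placeholder, the exact trl API varies by version, and none of this is the author's actual training script:

```python
# Hypothetical trl SFT setup matching the card's stated stack and
# bf16 regime. Batch size and epochs echo the old card's numbers;
# everything else is an illustrative guess.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, TrainingArguments
from trl import SFTTrainer

dataset = load_dataset("teknium/OpenHermes-2.5", split="train")
model = AutoModelForCausalLM.from_pretrained("path/to/mini-mistral-config")

args = TrainingArguments(
    output_dir="mini-mistral-openhermes-2.5-chatml-test",
    bf16=True,                      # "Training regime: bf16 mixed precision"
    per_device_train_batch_size=6,  # "batches per device: 6" (old card)
    num_train_epochs=9,             # "epochs: 9" (old card)
)

trainer = SFTTrainer(model=model, args=args, train_dataset=dataset)
trainer.train()
```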