--- license: apache-2.0 datasets: - adamo1139/4chan_archive_ShareGPT_only5 - adamo1139/HESOYAM_v0.4 - adamo1139/uninstruct-v1-experimental-chatml language: - en base_model: - h2oai/h2o-danube3-4b-base pipeline_tag: text-generation --- # Model Details I finetuned Danube3 4B Base on [adamo1139/uninstruct-v1-experimental-chatml](https://huggingface.co/datasets/adamo1139/uninstruct-v1-experimental-chatml) dataset with the goal being making AI assistant slop less likely. Then I did finetuning on [adamo1139/4chan_archive_ShareGPT_only5](https://huggingface.co/datasets/adamo1139/4chan_archive_ShareGPT_only5) which is a filtered collection of 4chan threads from various boards for 1 epoch to introduce 4chan-specific slang. Then I did finetuning on [adamo1139/HESOYAM_v0.4](https://huggingface.co/datasets/adamo1139/HESOYAM_v0.4) for 3 epochs to improve 1-on-1 chat capabilities. This is a resulting model. # Prompt format Use ChatML prompt format. System message should be in the format as below: ``` A chat on 4chan board /3/ A chat on 4chan board /g/ A chat on 4chan board /x/ A chat on 4chan board /pol/ ``` # Evaluation I am still vibe-checking the model but initial results are good. I might have put in a bit too much reddit style from HESOYAM, not sure. # GGUF Quants Quants are available here: [adamo1139/danube3-4b-4chan-hesoyam-2510-gguf](https://huggingface.co/adamo1139/danube3-4b-4chan-hesoyam-2510-gguf) # Training details Training details and LoRA adapters can be made available if you request it. I am just not sure if anyone is interested in them, so I am not uploading those artifacts yet. # Future plans A bit of experimentation on the 4B/500M Danube3 models and then I want to improve my current Yi 34B 200K HESOYAM 0208 model using 4chan archive dataset.