---
language:
- en
- ko
license: other
license_name: exaone
license_link: LICENSE
tags:
- text-generation-inference
- transformers
- trl
- sft
- reasoning
- lg-ai
- exaone
- exaone-3.5
- o1
base_model: LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct
datasets:
- KingNish/reasoning-base-20k
---
|
|
|
# Model Description
|
|
|
An uncensored EXAONE 3.5 reasoning model trained on reasoning data. Now with a full epoch!

It was trained using improved training code and delivers improved performance.

Here is the inference code you should use:
|
```py
# DEBUGGING IN PROGRESS, check later
```
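Until that code is published, the sketch below follows the standard `transformers` usage documented for the EXAONE 3.5 base model. The repository id is a placeholder (this card does not state the final one), and the prompt and generation settings are illustrative only:

```py
# A minimal sketch following the EXAONE 3.5 base model's documented usage.
# The repo id below is a PLACEHOLDER; substitute this model's actual id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lunahr/your-finetuned-exaone-repo"  # placeholder, not confirmed by this card

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # EXAONE ships custom modeling code
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a helpful reasoning assistant."},
    {"role": "user", "content": "How many r's are in 'strawberry'?"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
)

output = model.generate(
    input_ids.to(model.device),
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=512,
    do_sample=False,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```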
|
|
|
- **Trained by:** [Piotr Zalewski](https://huggingface.co/lunahr)
- **License:** exaone
- **Finetuned from model:** [LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct](https://huggingface.co/LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct)
- **Dataset used:** [KingNish/reasoning-base-20k](https://huggingface.co/datasets/KingNish/reasoning-base-20k)
|
|
|
This EXAONE model was trained faster than with [Unsloth](https://github.com/unslothai/unsloth), using [custom training code](https://www.kaggle.com/code/piotr25691/distributed-hf-training-with-2xt4).
|
|
|
Visit https://www.kaggle.com/code/piotr25691/distributed-hf-training-with-2xt4 to find out how you can finetune your models using both of the Kaggle-provided GPUs.
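For orientation only, the sketch below shows what such an SFT run can look like with recent TRL. It is not the author's actual training code; the dataset column names, chat formatting, and hyperparameters are all assumptions. It would be launched across both Kaggle T4s with `accelerate launch --num_processes 2 train.py`:

```py
# NOT the author's training code: a minimal TRL SFT sketch under stated assumptions.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

base = "LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, trust_remote_code=True)

def to_text(example):
    # The "user"/"reasoning"/"assistant" column names are assumptions about
    # the reasoning-base-20k schema; adjust them to the dataset's real fields.
    messages = [
        {"role": "user", "content": example["user"]},
        {"role": "assistant",
         "content": f"{example['reasoning']}\n\n{example['assistant']}"},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = load_dataset("KingNish/reasoning-base-20k", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,  # SFTTrainer reads the "text" column by default
    args=SFTConfig(
        output_dir="exaone-reasoning-sft",  # hypothetical output name
        num_train_epochs=1,                 # "a full epoch", per this card
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        fp16=True,                          # T4s do not support bf16
        logging_steps=10,
    ),
)
trainer.train()
```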