sabersaleh/Llama2-7B-KTO


---
license: mit
datasets:
- tatsu-lab/alpaca
base_model:
- meta-llama/Llama-2-7b
---

This model was aligned on the AlpacaFarm dataset using the Kahneman-Tversky Optimization (KTO) loss. Training started from the supervised fine-tuned (SFT) version of LLaMA 2 7B and ran for a single epoch. For more information on the dataset, see the AlpacaFarm documentation (https://github.com/tatsu-lab/alpaca_farm).
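
For readers unfamiliar with KTO, the per-example loss can be sketched as follows. This is a minimal illustration of the objective as commonly described, not the training code used for this model. The function name `kto_loss` and the values of `beta`, `lambda_d`, and `lambda_u` are assumptions, since this card does not state the hyperparameters or the implementation used.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def kto_loss(policy_logps, ref_logps, desirable, kl_estimate,
             beta=0.1, lambda_d=1.0, lambda_u=1.0):
    """Mean KTO loss over a batch (illustrative sketch only).

    policy_logps / ref_logps: summed log-probabilities of each completion
        under the policy being trained and the frozen SFT reference model.
    desirable: booleans, True if the completion is labeled desirable.
    kl_estimate: batch-level estimate of KL(policy || reference), used as
        the reference point; clamped at zero.
    beta, lambda_d, lambda_u: assumed hyperparameter values, not taken
        from this model card.
    """
    if not policy_logps:
        raise ValueError("empty batch")
    z0 = max(0.0, kl_estimate)
    losses = []
    for pol, ref, good in zip(policy_logps, ref_logps, desirable):
        reward = pol - ref  # implicit reward: log-ratio of policy to reference
        if good:
            # Desirable outputs gain value as the reward rises above z0.
            losses.append(lambda_d - lambda_d * sigmoid(beta * (reward - z0)))
        else:
            # Undesirable outputs gain value as the reward falls below z0.
            losses.append(lambda_u - lambda_u * sigmoid(beta * (z0 - reward)))
    return sum(losses) / len(losses)
```

Unlike pairwise preference losses, this objective only needs a binary desirable/undesirable label per completion, so each example contributes to the loss independently.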