Post
1650
๐ Let me introduce the work I've done over the past three months: ๐๐น๐ฎ๐บ๐ฎ-๐ฏ.๐ฎ-๐ง๐ฎ๐ถ๐๐ฎ๐ป-๐ฏ๐ and ๐๐น๐ฎ๐บ๐ฎ-๐ฏ.๐ฎ-๐ง๐ฎ๐ถ๐๐ฎ๐ป-๐ฏ๐-๐๐ป๐๐๐ฟ๐๐ฐ๐, now open-sourced on ๐ค Hugging Face.
๐น๐ถ๐ฎ๐ป๐ด๐ต๐๐๐ป/๐๐น๐ฎ๐บ๐ฎ-๐ฏ.๐ฎ-๐ง๐ฎ๐ถ๐๐ฎ๐ป-๐ฏ๐: This model is built on top of ๐บ๐ฒ๐๐ฎ-๐น๐น๐ฎ๐บ๐ฎ/๐๐น๐ฎ๐บ๐ฎ-๐ฏ.๐ฎ-๐ฏ๐ with continual pretraining. The training dataset consists of a mixture of Traditional Chinese and multilingual texts in specific proportions, including 20B tokens of Traditional Chinese text.
๐น๐ถ๐ฎ๐ป๐ด๐ต๐๐๐ป/๐๐น๐ฎ๐บ๐ฎ-๐ฏ.๐ฎ-๐ง๐ฎ๐ถ๐๐ฎ๐ป-๐ฏ๐-๐๐ป๐๐๐ฟ๐๐ฐ๐: This is a fine-tuned conversational model based on the foundation model.
This Llama-3.2-Taiwan open-source project is currently a one-person effort (yes, I did everything from text preparation โ so exhausting!). If you're interested, feel free to join the Discord server for discussions.
๐ ฑ๐ ด๐ ฝ๐ ฒ๐ ท๐ ผ๐ ฐ๐๐ บ๐ ธ๐ ฝ๐ ถ
The evaluation was conducted using ikala/tmmluplus, though the README page does not yet reflect the latest results. The performance is close to the previous versions, indicating that further improvements might require adding more specialized knowledge in the datasets.
๐ ฐ ๐ ฒ๐ ฐ๐ ป๐ ป ๐ ต๐ พ๐ ๐๐๐ ฟ๐ ฟ๐ พ๐๐
If anyone is willing to provide compute resources, it would be greatly appreciated to help this project continue and grow. ๐ช
---
๐๏ธ Foundation model: lianghsun/Llama-3.2-Taiwan-3B
๐ค Instruction model: lianghsun/Llama-3.2-Taiwan-3B-Instruct
โก GGUF: lianghsun/Llama-3.2-Taiwan-3B-Instruct-GGUF
๐น๐ถ๐ฎ๐ป๐ด๐ต๐๐๐ป/๐๐น๐ฎ๐บ๐ฎ-๐ฏ.๐ฎ-๐ง๐ฎ๐ถ๐๐ฎ๐ป-๐ฏ๐: This model is built on top of ๐บ๐ฒ๐๐ฎ-๐น๐น๐ฎ๐บ๐ฎ/๐๐น๐ฎ๐บ๐ฎ-๐ฏ.๐ฎ-๐ฏ๐ with continual pretraining. The training dataset consists of a mixture of Traditional Chinese and multilingual texts in specific proportions, including 20B tokens of Traditional Chinese text.
๐น๐ถ๐ฎ๐ป๐ด๐ต๐๐๐ป/๐๐น๐ฎ๐บ๐ฎ-๐ฏ.๐ฎ-๐ง๐ฎ๐ถ๐๐ฎ๐ป-๐ฏ๐-๐๐ป๐๐๐ฟ๐๐ฐ๐: This is a fine-tuned conversational model based on the foundation model.
This Llama-3.2-Taiwan open-source project is currently a one-person effort (yes, I did everything from text preparation โ so exhausting!). If you're interested, feel free to join the Discord server for discussions.
๐ ฑ๐ ด๐ ฝ๐ ฒ๐ ท๐ ผ๐ ฐ๐๐ บ๐ ธ๐ ฝ๐ ถ
The evaluation was conducted using ikala/tmmluplus, though the README page does not yet reflect the latest results. The performance is close to the previous versions, indicating that further improvements might require adding more specialized knowledge in the datasets.
๐ ฐ ๐ ฒ๐ ฐ๐ ป๐ ป ๐ ต๐ พ๐ ๐๐๐ ฟ๐ ฟ๐ พ๐๐
If anyone is willing to provide compute resources, it would be greatly appreciated to help this project continue and grow. ๐ช
---
๐๏ธ Foundation model: lianghsun/Llama-3.2-Taiwan-3B
๐ค Instruction model: lianghsun/Llama-3.2-Taiwan-3B-Instruct
โก GGUF: lianghsun/Llama-3.2-Taiwan-3B-Instruct-GGUF