---
license: llama3.2
language:
- en
- ja
- de
- fr
- it
- pt
- hi
- es
- th
library_name: transformers
pipeline_tag: text-generation
base_model: meta-llama/Llama-3.2-3B
datasets:
- ryota39/izumi-lab-dpo-45k
- Aratako/Magpie-Tanuki-8B-97k
- kunishou/databricks-dolly-15k-ja
- kunishou/oasst1-89k-ja
tags:
- llama3.2
---
![chibi-img](./chibi.png)
## Preface
Small-parameter large language models (LLMs) matter because they balance performance and efficiency. As LLMs grow more sophisticated, the trade-off between model size and computational cost becomes critical. A smaller model uses less memory, runs inference faster, and consumes less energy, while retaining much of the accuracy and contextual understanding of larger models. This makes such models particularly valuable where processing power and storage are limited, such as on mobile devices, in edge computing, or in low-latency environments.
## Llama 3.2 Chibi 3B
This experimental model is the result of continual pre-training of [Meta's Llama 3.2 3B](https://huggingface.co/meta-llama/Llama-3.2-3B) on a small mixture of Japanese datasets.
## Architecture
[Llama 3.2 3B](https://huggingface.co/meta-llama/Llama-3.2-3B)
## Training
The model was trained on the following mixture of datasets:
- [ryota39/izumi-lab-dpo-45k](https://huggingface.co/datasets/ryota39/izumi-lab-dpo-45k)
- [Aratako/Magpie-Tanuki-8B-97k](https://huggingface.co/datasets/Aratako/Magpie-Tanuki-8B-97k)
- [kunishou/databricks-dolly-15k-ja](https://huggingface.co/datasets/kunishou/databricks-dolly-15k-ja)
- [kunishou/oasst1-89k-ja](https://huggingface.co/datasets/kunishou/oasst1-89k-ja)
## Contributors
- [Hammaam](https://huggingface.co/AELLM)
## How to use
With `transformers >= 4.43.0`, you can run inference using the Transformers `pipeline` abstraction or by using the Auto classes with the `generate()` function.
Make sure your installation is up to date by running `pip install --upgrade transformers`.
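```python
import torch
from transformers import pipeline

model_id = "AELLM/Llama-3.2-Chibi-3B"

# Load the model in bfloat16 and let the library place it on the available device(s).
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Complete a Japanese prompt ("The key to life is").
print(pipe("人生の鍵は"))
```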
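The Auto classes route mentioned above can be sketched as follows. This is a minimal sketch, not an official recipe: the helper names `build_generation_kwargs` and `generate_text` are hypothetical, and the sampling values (`temperature`, `top_p`, `max_new_tokens`) are illustrative assumptions rather than settings recommended by the model authors.

```python
MODEL_ID = "AELLM/Llama-3.2-Chibi-3B"


def build_generation_kwargs(max_new_tokens=64, temperature=0.7, top_p=0.9):
    """Hypothetical helper: assumed sampling settings, not taken from the model card."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }


def generate_text(prompt, **overrides):
    """Hypothetical helper: load the model with the Auto classes and complete `prompt`."""
    # Imported here so that defining the helpers does not load the heavy libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )

    # Tokenize the prompt and move the tensors to the device holding the model.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Inference only, so gradients are not needed.
    with torch.no_grad():
        output_ids = model.generate(**inputs, **build_generation_kwargs(**overrides))

    # The decoded text contains the prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("人生の鍵は")` downloads the weights on first use and returns the prompt followed by the sampled continuation. Keyword arguments such as `max_new_tokens=32` override the assumed defaults.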
## License
Refer to the [Llama 3.2 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE).
## References
```bibtex
@inproceedings{zheng2024llamafactory,
title={LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models},
author={Yaowei Zheng and Richong Zhang and Junhao Zhang and Yanhan Ye and Zheyan Luo and Zhangchi Feng and Yongqiang Ma},
booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)},
address={Bangkok, Thailand},
publisher={Association for Computational Linguistics},
year={2024},
url={http://arxiv.org/abs/2403.13372}
}
```