nicholasKluge/TeenyTinyLlama-460m-Chat-awq

Text Generation

instruction tuned

text generation

text-generation-inference

4-bit precision

nicholasKluge committed on Jan 21, 2024

Commit 91f1762 · verified · 1 Parent(s): 748c8f7

Update README.md

Files changed (1)

README.md +1 -2

@@ -45,8 +45,7 @@ co2_eq_emissions:
 ---
 # TeenyTinyLlama-460m-Chat-awq
-**Note: This model is a quantized version of [TeenyTinyLlama-460m-Chat](https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m-Chat). Quantization was performed using [AutoAWQ](https://github.com/casper-hansen/AutoAWQ), allowing this version to be 80% lighter with almost no performance loss. A GPU is required to run the AWQ-quantized models.**
 TeenyTinyLlama is a pair of small foundational models trained in Brazilian Portuguese.
 This repository contains a version of [TeenyTinyLlama-460m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m) (`TeenyTinyLlama-460m-Chat`) fine-tuned on the [Instruct-Aira Dataset version 2.0](https://huggingface.co/datasets/nicholasKluge/instruct-aira-dataset-v2).

 ---
 # TeenyTinyLlama-460m-Chat-awq
+**Note: This model is a quantized version of [TeenyTinyLlama-460m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m). Quantization was performed using [AutoAWQ](https://github.com/casper-hansen/AutoAWQ), allowing this version to be 80% lighter, 20% faster, and with almost no performance loss. A GPU is required to run the AWQ-quantized models.**
 TeenyTinyLlama is a pair of small foundational models trained in Brazilian Portuguese.
 This repository contains a version of [TeenyTinyLlama-460m](https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m) (`TeenyTinyLlama-460m-Chat`) fine-tuned on the [Instruct-Aira Dataset version 2.0](https://huggingface.co/datasets/nicholasKluge/instruct-aira-dataset-v2).
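
The note in this diff describes the 4-bit AWQ version as much lighter than the original model. As a rough back-of-the-envelope sketch of why lower bit widths shrink a model, the snippet below estimates the storage needed for the weights alone. The parameter count (460 million, read off the model name), the 16-bit baseline, and the helper `weight_bytes` are assumptions made for illustration; the calculation ignores quantization scales and zero points, any layers left unquantized, and file-format overhead, so it is not expected to reproduce the 80% figure quoted in the README.

```python
def weight_bytes(n_params: int, bits_per_weight: int) -> float:
    """Approximate storage for the weights alone, in bytes (hypothetical helper)."""
    return n_params * bits_per_weight / 8


# Assumption: parameter count inferred from the "460m" in the model name.
N_PARAMS = 460_000_000

fp16 = weight_bytes(N_PARAMS, 16)  # assumed 16-bit baseline
int4 = weight_bytes(N_PARAMS, 4)   # idealized 4-bit weights, no metadata
reduction = 1 - int4 / fp16

print(f"16-bit weights: {fp16 / 1e6:.0f} MB")
print(f"4-bit weights:  {int4 / 1e6:.0f} MB")
print(f"reduction:      {reduction:.0%}")
```

The actual size difference on disk depends on the baseline precision and on which tensors AutoAWQ leaves unquantized, so the figures printed here are only an idealized estimate.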