|
---

license: mit

---
|
|
|
# Model description
|
|
|
LLAMA2-stablebeluga-Q4_0/Q8_0 GGML is a quantized build of Stable Beluga, a language model fine-tuned by Stability AI on top of Meta AI's LLaMA 2. The weights were first converted to F32 and then quantized to 4-bit (Q4_0) and 8-bit (Q8_0) GGML formats. Quantization makes the model more efficient in terms of memory and computational requirements, without significantly compromising its language understanding and generation capabilities.
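As a rough sketch, the conversion and quantization steps can be reproduced with llama.cpp's own tooling. The script name, flags, and file names below are assumptions and may differ between llama.cpp revisions:

```shell
# Convert the original PyTorch checkpoint to an F32 GGML file.
# (convert.py ships with llama.cpp; the paths here are placeholders.)
python convert.py /path/to/stablebeluga --outtype f32 --outfile stablebeluga-f32.ggml.bin

# Quantize the F32 file into the 4-bit (Q4_0) and 8-bit (Q8_0) variants.
./quantize stablebeluga-f32.ggml.bin stablebeluga-q4_0.ggml.bin q4_0
./quantize stablebeluga-f32.ggml.bin stablebeluga-q8_0.ggml.bin q8_0
```

Quantizing from an F32 intermediate (rather than F16) avoids an extra rounding step before the 4-bit or 8-bit conversion.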
|
|
|
# Intended uses & limitations
|
|
|
## How to use
|
|
|
This model can be used with llama.cpp (or compatible runtimes) for a variety of natural language understanding and generation tasks, including but not limited to text completion, text generation, conversation modeling, and semantic similarity estimation.
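For example, one-shot generation with llama.cpp's `main` binary might look like the following. The model file name is a placeholder for whichever quantized file you downloaded, and the `### User:`/`### Assistant:` prompt format is the one documented for Stable Beluga:

```shell
# Generate up to 256 tokens with the 4-bit model.
# Replace the model path with your local copy of the Q4_0 or Q8_0 file.
./main -m stablebeluga-q4_0.ggml.bin -n 256 \
  -p "### User:\nExplain quantization in one paragraph.\n\n### Assistant:\n"
```

The Q4_0 file is the better choice when memory is tight; Q8_0 trades roughly double the file size for output closer to the unquantized model.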
|
|
|
## Limitations and bias
|
|
|
While this model is designed to understand and generate human-like text, it has a few limitations:
|
|
|
1. It might generate incorrect or nonsensical responses if the input prompt is ambiguous or lacks sufficient context.

2. It reflects the data it was trained on and may therefore reproduce biases present in that data.

3. Despite the conversion and quantization, it may still require substantial computational resources for large-scale tasks.
|
|
|
# Training data
|
|
|
The LLAMA-2-Q4_0/Q8_0 GGML files were not trained further; they package the same weights as Stability AI's Stable Beluga, which was fine-tuned from the original LLaMA 2. For details on the fine-tuning data, please refer to the Stable Beluga 2 model card.
|
|
|
# Evaluations
|
|
|
Performance is expected to be close to that of the original LLAMA2-stablebeluga, with a small quality drop introduced by quantization (larger for Q4_0 than for Q8_0). More specific evaluation results will be added as they become available.