---
license: llama2
language:
- lt
datasets:
- uonlp/CulturaX
---
# Model Card for Lt-Llama2-13b
Lt-Llama2 is a family of pretrained and fine-tuned generative text models for Lithuanian. This is the repository for the **foundational 13B model**. Links to other models can be found at the bottom of this page.
## Model Details
### Model Description
Lt-Llama2 marks the first open-source initiative dedicated to developing large language models (LLMs) specialized in Lithuanian. Neurotechnology has created and publicly released a collection of Lithuanian LLMs, available both as foundational models and as instruction-tuned variants.
- **Developed by:** Neurotechnology
- **Language(s):** Lithuanian
- **License:** Llama 2 Community License Agreement
- **Continually pretrained from:** [Llama-2-13b](https://huggingface.co/meta-llama/Llama-2-13b-hf)
### Model Sources
- **Paper:** https://arxiv.org/abs/2408.12963
## Intended Use
### Intended Use Cases
Lt-Llama2 is intended for research use in Lithuanian. The base models can be adapted for a variety of natural language tasks, while the instruction-tuned models are geared towards assistant-like conversational interactions (see the sketch below).
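As a hypothetical illustration of the assistant-style use case, the sketch below loads the 13B instruct variant. It assumes the instruct repository ships a chat template usable via `apply_chat_template`; consult the instruct model card for the exact prompt format.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hypothetical assistant-style usage of the instruct variant;
# assumes the repository defines a chat template.
model_id = "neurotechnology/Lt-Llama-2-13b-instruct-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# "Tell me about Vilnius."
messages = [{"role": "user", "content": "Papasakok apie Vilnių."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```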
### Prohibited use
Using the model in ways that breach the license, violate applicable laws or regulations, or involve languages other than Lithuanian is prohibited.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the foundational 13B model
tokenizer = AutoTokenizer.from_pretrained("neurotechnology/Lt-Llama-2-13b-hf")
model = AutoModelForCausalLM.from_pretrained("neurotechnology/Lt-Llama-2-13b-hf")

# "Once upon a time there lived a grandfather and a grandmother"
input_text = "Kartą gyveno senelis ir senelė "
input_ids = tokenizer(input_text, return_tensors="pt")

# Generate up to 100 new tokens of continuation
outputs = model.generate(**input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
```
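For GPU inference, the model can also be loaded in half precision with automatic device placement. The snippet below is a minimal sketch assuming `torch` and `accelerate` are installed; the prompt and sampling parameters are illustrative, not values recommended by the authors.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "neurotechnology/Lt-Llama-2-13b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Half-precision weights with automatic device placement (requires accelerate)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# "The capital of Lithuania is"
prompt = "Lietuvos sostinė yra"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Illustrative sampling settings; tune for your use case
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```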
## Benchmarks
| Model | Average | ARC | MMLU |Winogrande|HellaSwag | GSM8k |TruthfulQA|
|--------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|
| Llama-2-13b | 30.53 | 28.66 | **31.34** | 50.90 | 28.91 | **5.91** | **37.48** |
| *Lt-Llama2-13b* | ***36.42*** | ***54.50*** | *26.01* | ***61.72*** | ***40.61*** | *0.45* | *35.23* |
## Lt-Llama2 Model Family
| Model | Link |
|--------------------|:--------:|
|Lt-Llama2-7b | [link](https://huggingface.co/neurotechnology/Lt-Llama-2-7b-hf) |
|Lt-Llama2-7b-instruct| [link](https://huggingface.co/neurotechnology/Lt-Llama-2-7b-instruct-hf) |
|*Lt-Llama2-13b* | [link](https://huggingface.co/neurotechnology/Lt-Llama-2-13b-hf) |
|Lt-Llama2-13b-instruct| [link](https://huggingface.co/neurotechnology/Lt-Llama-2-13b-instruct-hf) |
## Citation
```bibtex
@misc{nakvosas2024openllama2modellithuanian,
title={Open Llama2 Model for the Lithuanian Language},
author={Artūras Nakvosas and Povilas Daniušis and Vytas Mulevičius},
year={2024},
eprint={2408.12963},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2408.12963},
}
```