
	---
	license: llama2
	language:
	- lt
	datasets:
	- uonlp/CulturaX
	---
	# Model Card for Lt-Llama-2-13b-hf

	<!-- Provide a quick summary of what the model is/does. -->

	Lt-Llama2 is a family of pretrained and fine-tuned generative text models for Lithuanian. This is the repository for the foundational 13B model. Links to the other models can be found at the bottom of this page.

	## Model Details

	### Model Description

	<!-- Provide a longer summary of what this model is. -->
	This release by Neurotechnology marks the first open-source initiative dedicated to developing a large language model (LLM) specialized in Lithuanian. The company has created and publicly released a collection of Lithuanian LLMs, available both as foundational models and as instruction-tuned variants.


	- Developed by: Neurotechnology
	- Language(s): Lithuanian
	- License: Llama 2 Community License Agreement
	- Continually pretrained from: [Llama-2-13b](https://huggingface.co/meta-llama/Llama-2-13b-hf)

	### Model Sources

	<!-- Provide the basic links for the model. -->

	- Paper: https://arxiv.org/abs/2408.12963

	## Intended Use

	### Intended Use Cases

	Lt-Llama2 is intended for research on Lithuanian-language text. The base models can be adapted to a range of natural language tasks, while the instruction models are geared towards assistant-like conversational interactions.

	### Prohibited Use

	<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

	Using the model in ways that breach the license, violate any applicable laws or regulations, or involve languages other than Lithuanian is prohibited.



	## How to Get Started with the Model

	Use the code below to get started with the model.

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_id = "neurotechnology/Lt-Llama-2-13b-hf"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(model_id)

	# Lithuanian prompt: "Once upon a time there lived an old man and an old woman"
	input_text = "Kartą gyveno senelis ir senelė "
	# The tokenizer returns a dict holding input_ids and attention_mask.
	inputs = tokenizer(input_text, return_tensors="pt")
	outputs = model.generate(**inputs, max_new_tokens=100)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```
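
A 13B model may not fit on a single GPU when loaded in the default 32-bit precision. The following is a minimal sketch of one common workaround, loading the weights in half precision. It is not part of the original instructions. The helper names `estimate_weight_memory_gib` and `load_half_precision` are hypothetical, the parameter count is taken as a nominal 13e9 (an assumption based on the model name), and `device_map="auto"` assumes the `accelerate` package is installed.

```python
def estimate_weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Rough memory needed for the weights alone, in GiB.

    Ignores activations, the KV cache, and framework overhead.
    """
    return num_params * bytes_per_param / 1024**3


def load_half_precision(repo_id: str = "neurotechnology/Lt-Llama-2-13b-hf"):
    """Load tokenizer and model with float16 weights (hypothetical helper)."""
    # Imported lazily so the estimate above can be used without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.float16,  # 2 bytes per parameter instead of 4
        device_map="auto",          # assumption: requires `accelerate`
    )
    return tokenizer, model


if __name__ == "__main__":
    nominal_params = 13e9  # assumption: nominal size implied by "13b"
    for label, nbytes in (("float32", 4), ("float16", 2)):
        gib = estimate_weight_memory_gib(nominal_params, nbytes)
        print(f"{label}: about {gib:.1f} GiB for weights")
```

Under these assumptions the weights alone need roughly 48 GiB in float32 and roughly 24 GiB in float16, so half precision is often the difference between fitting on one accelerator and not.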

	## Benchmarks

	| Model           | Average | ARC     | MMLU  | Winogrande | HellaSwag | GSM8k | TruthfulQA |
	|-----------------|:-------:|:-------:|:-----:|:----------:|:---------:|:-----:|:----------:|
	| Llama-2-13b     | 30.53   | 28.66   | 31.34 | 50.90      | 28.91     | 5.91  | 37.48      |
	| Llama2-13b-Base | *36.42* | *54.50* | 26.01 | *61.72*    | *40.61*   | 0.45  | 35.23      |
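
The Average column appears to be the unweighted mean of the six benchmark scores. The short check below assumes that interpretation and recomputes it from the values in the table. The names `SCORES` and `average` are illustrative only.

```python
# Benchmark order: ARC, MMLU, Winogrande, HellaSwag, GSM8k, TruthfulQA
SCORES = {
    "Llama-2-13b": [28.66, 31.34, 50.90, 28.91, 5.91, 37.48],
    "Llama2-13b-Base": [54.50, 26.01, 61.72, 40.61, 0.45, 35.23],
}


def average(scores: list[float]) -> float:
    """Unweighted mean, rounded to two decimals like the table."""
    return round(sum(scores) / len(scores), 2)


if __name__ == "__main__":
    for model_name, scores in SCORES.items():
        print(f"{model_name}: {average(scores):.2f}")
```

Both recomputed means match the Average values reported in the table (30.53 and 36.42).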



	## Lt-Llama2 Model Family

	| Model                  | Link |
	|------------------------|:----:|
	| Lt-Llama2-7b           | [link](https://huggingface.co/neurotechnology/Lt-Llama-2-7b-hf) |
	| Lt-Llama2-7b-instruct  | [link](https://huggingface.co/neurotechnology/Lt-Llama-2-7b-instruct-hf) |
	| Lt-Llama2-13b          | [link](https://huggingface.co/neurotechnology/Lt-Llama-2-13b-hf) |
	| Lt-Llama2-13b-instruct | [link](https://huggingface.co/neurotechnology/Lt-Llama-2-13b-instruct-hf) |
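
The four repositories listed above follow a regular naming pattern, so a repository ID can be built from the model size and variant. The sketch below is only a convenience inferred from the links in the table. `repo_id` and `HUB_ORG` are hypothetical names, not part of any library.

```python
HUB_ORG = "neurotechnology"
KNOWN_SIZES = {"7b", "13b"}  # sizes that appear in the table above


def repo_id(size: str, instruct: bool = False) -> str:
    """Build the Hub repository ID for a given size and variant."""
    if size not in KNOWN_SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {sorted(KNOWN_SIZES)}")
    suffix = "-instruct" if instruct else ""
    return f"{HUB_ORG}/Lt-Llama-2-{size}{suffix}-hf"


if __name__ == "__main__":
    for size in ("7b", "13b"):
        for instruct in (False, True):
            print(repo_id(size, instruct))
```

The resulting string can be passed as the first argument to `from_pretrained`, as in the getting-started snippet.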


	## Citation

	```bibtex
	@misc{nakvosas2024openllama2modellithuanian,
	title={Open Llama2 Model for the Lithuanian Language},
	author={Artūras Nakvosas and Povilas Daniušis and Vytas Mulevičius},
	year={2024},
	eprint={2408.12963},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2408.12963},
	}
	```
