Model Card for RoQLlama

Model Details

Model Description

RoQLlama is a lightweight Romanian-adapted large language model with 7 billion parameters, quantized to 4 bits using the quantized LoRA (QLoRA) training technique (a 4-bit loading sketch follows the list below).

  • Language: Romanian
  • License: Llama2 Community License Agreement
  • Finetuned from model: Meta's Llama2 7B
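
Since the model was trained with QLoRA, the base weights can also be loaded directly in 4-bit precision through bitsandbytes. A minimal sketch, assuming bitsandbytes and accelerate are installed; the NF4/fp16 settings below are common QLoRA defaults, not values taken from the paper:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Assumed QLoRA-style quantization: 4-bit NF4 storage with fp16 compute.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "andreidima/Llama-2-7b-Romanian-qlora",
    quantization_config=bnb_config,
    device_map="auto",
)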

Model Sources

  • Paper: https://arxiv.org/abs/2410.04269

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "andreidima/Llama-2-7b-Romanian-qlora"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

# Romanian QA prompt. In English: "I answer questions based on the context.
# Context: In the year 1600, Mihai Viteazul achieved the first union of the
# Romanian Principalities: Wallachia, Transylvania, and Moldavia. This union
# was an important moment in the history of Romania.
# Question: In what year did Mihai Viteazul achieve the first union of the
# Romanian Principalities?
# Answer: "
input_text = """Eu răspund la întrebări pe baza contextului.
Context: În anul 1600, Mihai Viteazul a realizat prima unire a Țărilor Române: Țara Românească, Transilvania și Moldova. Această unire a fost un moment important în istoria României.
Întrebare: În ce an a realizat Mihai Viteazul prima unire a Țărilor Române?
Răspuns: """
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    eos_token_id=[13],  # 13 is the Llama 2 token ID for the newline character, so generation stops at the end of the answer line
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
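
The decode call above returns the full sequence, prompt included. A small follow-up sketch, reusing the variables from the snippet above, keeps only the generated answer:

# Drop the prompt tokens and decode only the newly generated answer.
prompt_length = inputs["input_ids"].shape[1]
answer = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(answer.strip())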

Note: Adding a space at the end of the prompt has been observed to significantly improve the model's output quality.
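
To make the tip mechanical rather than manual, a tiny hypothetical helper (the name ensure_trailing_space is ours, not from the model card) can normalize any prompt; note that the example prompt above already ends with "Răspuns: ", space included:

# Hypothetical helper: guarantee the prompt ends with exactly one space,
# per the trailing-space note above.
def ensure_trailing_space(prompt: str) -> str:
    return prompt.rstrip() + " "

input_text = ensure_trailing_space(input_text)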

Training Details and Evaluation

Please refer to the paper for details on the model's training and evaluation.

Citation

BibTeX:

@inproceedings{dima2024roqllama,
    title={RoQLlama: A Lightweight Romanian Adapted Language Model},
    author={George-Andrei Dima and Andrei-Marius Avram and Cristian-George Crăciun and Dumitru-Clementin Cercel},
    booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},
    year={2024},
    url={https://arxiv.org/abs/2410.04269},
}

APA:

Dima, G. A., Avram, A. M., Crăciun, C. G., & Cercel, D. C. (2024). RoQLlama: A lightweight Romanian adapted language model. In Findings of the Association for Computational Linguistics: EMNLP 2024.
