billm-llama-7b-conll03-ner

Papers: https://arxiv.org/abs/2310.01208, https://arxiv.org/abs/2311.05296

This model is a fine-tuned version of NousResearch/Llama-2-7b-hf for named entity recognition (NER) on CoNLL-2003, trained with BiLLM.

It achieves the following results on the evaluation set:

  • Loss: 0.1664
  • Precision: 0.9243
  • Recall: 0.9395
  • F1: 0.9319
  • Accuracy: 0.9860
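As a quick consistency check, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded figures above; the function name `f1_score` is illustrative and not part of BiLLM.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported above (already rounded to 4 decimals).
f1 = f1_score(0.9243, 0.9395)
print(f"F1 recomputed from rounded P/R: {f1:.4f}")
```

The recomputed value agrees with the reported 0.9319 to within the rounding of the inputs.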

Inference

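Install BiLLM from the command line:

python -m pip install -U billm==0.1.1

Then load the PEFT adapter on top of the base model and run the token-classification pipeline:
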
from transformers import AutoTokenizer, pipeline
from peft import PeftModel, PeftConfig
from billm import LlamaForTokenClassification


# CoNLL-2003 label set in BIO format
label2id = {'O': 0, 'B-PER': 1, 'I-PER': 2, 'B-ORG': 3, 'I-ORG': 4, 'B-LOC': 5, 'I-LOC': 6, 'B-MISC': 7, 'I-MISC': 8}
id2label = {v: k for k, v in label2id.items()}
model_id = 'WhereIsAI/billm-llama-7b-conll03-ner'
tokenizer = AutoTokenizer.from_pretrained(model_id)
# The adapter config records which base model to load (a Llama-2 checkpoint)
peft_config = PeftConfig.from_pretrained(model_id)
model = LlamaForTokenClassification.from_pretrained(
    peft_config.base_model_name_or_path,
    num_labels=len(label2id), id2label=id2label, label2id=label2id
)
model = PeftModel.from_pretrained(model, model_id)
# merge_and_unload is necessary for inference
model = model.merge_and_unload()

token_classifier = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
sentence = "I live in Hong Kong. I am a student at Hong Kong PolyU."
tokens = token_classifier(sentence)
print(tokens)
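With aggregation_strategy="simple", the Transformers token-classification pipeline returns a list of dicts with the keys entity_group, score, word, start, and end. The sketch below groups such predictions by entity type using the character offsets. The helper `group_entities` and the `sample` predictions are hypothetical placeholders written by hand for illustration; they are not actual output from this model.

```python
from collections import defaultdict


def group_entities(predictions, sentence, min_score=0.0):
    """Group pipeline predictions by entity type, slicing the original text."""
    grouped = defaultdict(list)
    for pred in predictions:
        if pred["score"] < min_score:
            continue  # drop low-confidence spans
        grouped[pred["entity_group"]].append(sentence[pred["start"]:pred["end"]])
    return dict(grouped)


sentence = "I live in Hong Kong. I am a student at Hong Kong PolyU."
# Hypothetical predictions in the pipeline's output format (not real model output).
sample = [
    {"entity_group": "LOC", "score": 0.99, "word": "Hong Kong", "start": 10, "end": 19},
    {"entity_group": "ORG", "score": 0.98, "word": "Hong Kong PolyU", "start": 39, "end": 54},
]
print(group_entities(sample, sentence))
```

Slicing the input sentence by start and end avoids relying on the word field, whose spacing depends on how the tokenizer decodes subword pieces.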

Training Details

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
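The linear scheduler decays the learning rate from its initial value to zero over the course of training, after an optional warmup. The sketch below illustrates the schedule under two assumptions that the card does not state: zero warmup steps, and a total of 17,560 optimizer steps (the final step count reported in the training results). The function `linear_lr` is illustrative only.

```python
def linear_lr(step, base_lr=2e-4, total_steps=17560, warmup_steps=0):
    """Learning rate at a given step under linear warmup and linear decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr (unused when warmup_steps == 0)
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midpoint, and end of training.
for step in (0, 8780, 17560):
    print(step, f"{linear_lr(step):.6f}")
```

If training used warmup, the early part of the curve would differ; the decay phase keeps the same shape.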

Training results

Training Loss  Epoch  Step   Validation Loss  Precision  Recall  F1      Accuracy
0.048          1.0    1756   0.0971           0.8935     0.9082  0.9008  0.9813
0.0217         2.0    3512   0.0963           0.9182     0.9301  0.9241  0.9852
0.0113         3.0    5268   0.1081           0.9265     0.9348  0.9306  0.9858
0.0038         4.0    7024   0.1477           0.9216     0.9379  0.9297  0.9858
0.0016         5.0    8780   0.1617           0.9199     0.9370  0.9284  0.9855
0.0007         6.0    10536  0.1618           0.9235     0.9390  0.9312  0.9859
0.0005         7.0    12292  0.1644           0.9245     0.9395  0.9319  0.9860
0.0004         8.0    14048  0.1662           0.9248     0.9393  0.9320  0.9861
0.0003         9.0    15804  0.1664           0.9248     0.9395  0.9321  0.9861
0.0003         10.0   17560  0.1664           0.9243     0.9395  0.9319  0.9860
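In the results above, validation loss is lowest early in training while F1 keeps improving for several more epochs. A minimal sketch for locating both points, with the values copied from the table (the tuple layout is an illustrative choice):

```python
# (epoch, validation_loss, f1) copied from the training results table
results = [
    (1, 0.0971, 0.9008),
    (2, 0.0963, 0.9241),
    (3, 0.1081, 0.9306),
    (4, 0.1477, 0.9297),
    (5, 0.1617, 0.9284),
    (6, 0.1618, 0.9312),
    (7, 0.1644, 0.9319),
    (8, 0.1662, 0.9320),
    (9, 0.1664, 0.9321),
    (10, 0.1664, 0.9319),
]

best_by_loss = min(results, key=lambda row: row[1])
best_by_f1 = max(results, key=lambda row: row[2])
print("lowest validation loss at epoch", best_by_loss[0])
print("highest F1 at epoch", best_by_f1[0])
```

The headline metrics reported at the top of this card match the final (epoch 10) row.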

Framework versions

  • PEFT 0.9.0
  • Transformers 4.38.2
  • Pytorch 2.0.1
  • Datasets 2.16.0
  • Tokenizers 0.15.0
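To compare a local environment against these versions, something like the following sketch may help. It uses only the standard library; the distribution names (peft, transformers, torch, datasets, tokenizers) are the usual PyPI names, and the helper `check_versions` is illustrative.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by PyPI distribution name.
EXPECTED = {
    "peft": "0.9.0",
    "transformers": "4.38.2",
    "torch": "2.0.1",
    "datasets": "2.16.0",
    "tokenizers": "0.15.0",
}


def check_versions(expected):
    """Map each package to (installed_version_or_None, expected_version)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (installed, wanted)
    return report


for name, (installed, wanted) in check_versions(EXPECTED).items():
    # Ignore local build tags such as "+cu117" when comparing.
    same = installed is not None and installed.split("+")[0] == wanted
    print(f"{name}: installed={installed} expected={wanted} {'ok' if same else 'differs'}")
```

Other versions may also work; the listed ones are simply those used for training.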

Citation

@inproceedings{li2024bellm,
    title = "BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings",
    author = "Li, Xianming and Li, Jing",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics",
    year = "2024",
    publisher = "Association for Computational Linguistics"
}

@article{li2023label,
  title={Label supervised llama finetuning},
  author={Li, Zongxi and Li, Xianming and Liu, Yuzhang and Xie, Haoran and Li, Jing and Wang, Fu-lee and Li, Qing and Zhong, Xiaoqin},
  journal={arXiv preprint arXiv:2310.01208},
  year={2023}
}