metadata
license: mit
datasets:
- brwac
- carolina-c4ai/corpus-carolina
language:
- pt
DeBERTinha XSmall (aka "debertinha-ptbr-xsmall")
NOTE
We have received reports of poor results on unbalanced datasets. A more robust training setup, such as scaling the loss and adding weight decay (in the range 1e-5 to 1e-3), seems to fix it.
Please refer to this notebook to check how performance on unbalanced datasets can be improved.
If you have any problems using the model, please contact us.
Thanks!
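As a rough illustration of the two fixes mentioned in the note, the sketch below scales the loss with per-class weights and adds weight decay through the optimizer. It is not the authors' notebook: inverse-frequency weighting is only one possible reading of "scaling the loss", and the label counts, the stand-in linear classifier, the hidden size of 384, and the learning rate are all assumptions made for the example.

```python
import torch
from torch import nn


def inverse_frequency_weights(labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Weight each class by total / (num_classes * count), so rare classes weigh more."""
    counts = torch.bincount(labels, minlength=num_classes).float()
    total = counts.sum()
    counts = counts.clamp(min=1.0)  # avoid division by zero for classes with no examples
    return total / (num_classes * counts)


# Hypothetical unbalanced training labels: 90 negatives, 10 positives.
train_labels = torch.tensor([0] * 90 + [1] * 10)
class_weights = inverse_frequency_weights(train_labels, num_classes=2)

# Stand-in for a classification head on top of the encoder output.
# hidden_size = 384 is an assumption about the XSmall architecture.
hidden_size = 384
classifier = nn.Linear(hidden_size, 2)

# Scaled loss: errors on the rare class contribute more to the gradient.
loss_fn = nn.CrossEntropyLoss(weight=class_weights)

# Weight decay within the range suggested in the note (1e-5 to 1e-3).
optimizer = torch.optim.AdamW(classifier.parameters(), lr=5e-5, weight_decay=1e-3)

# One illustrative training step on random features.
features = torch.randn(8, hidden_size)
targets = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1])
loss = loss_fn(classifier(features), targets)
loss.backward()
optimizer.step()
```

In a real fine-tuning run the same `loss_fn` and `weight_decay` settings would apply to the full model rather than to a standalone linear layer.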
Introduction
DeBERTinha is a pretrained DeBERTa model for Brazilian Portuguese.
Available models
Model | Arch. | #Params |
---|---|---|
sagui-nlp/debertinha-ptbr-xsmall | DeBERTa-V3-Xsmall | 40M |
Usage
from transformers import AutoModel, AutoModelForPreTraining, AutoTokenizer

# Load the pretraining checkpoint and its tokenizer from the Hugging Face Hub
model = AutoModelForPreTraining.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
tokenizer = AutoTokenizer.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
For embeddings
import torch
model = AutoModel.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
input_ids = tokenizer.encode('Tinha uma pedra no meio do caminho.', return_tensors='pt')
with torch.no_grad():
    outs = model(input_ids)
encoded = outs.last_hidden_state[0, 0]  # Take [CLS] special token representation
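The snippet above takes the [CLS] vector of a single sentence. When embedding a padded batch, a common alternative is to average the token vectors while ignoring padding. The sketch below shows that pooling step on dummy tensors; `masked_mean_pool` is a hypothetical helper, not part of the model or of `transformers`, and mean pooling is not something the original card recommends over [CLS].

```python
import torch


def masked_mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sentence, ignoring padded positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                    # (batch, hidden)
    token_counts = mask.sum(dim=1).clamp(min=1.0)                     # (batch, 1)
    return summed / token_counts


# Dummy encoder output: 2 sentences, 3 token positions, hidden size 2.
# The last position of the first sentence is padding and must be ignored.
hidden = torch.tensor([
    [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],
    [[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]],
])
mask = torch.tensor([[1, 1, 0], [1, 1, 1]])

pooled = masked_mean_pool(hidden, mask)
```

With the real model, `hidden` would be `model(**batch).last_hidden_state` and `mask` would be `batch['attention_mask']`, where `batch = tokenizer(sentences, padding=True, return_tensors='pt')`.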
Citation
If you use our work, please cite:
@misc{campiotti2023debertinha,
title={DeBERTinha: A Multistep Approach to Adapt DebertaV3 XSmall for Brazilian Portuguese Natural Language Processing Task},
author={Israel Campiotti and Matheus Rodrigues and Yuri Albuquerque and Rafael Azevedo and Alyson Andrade},
year={2023},
eprint={2309.16844},
archivePrefix={arXiv},
primaryClass={cs.CL}
}