metadata

license: mit
datasets:
  - brwac
  - carolina-c4ai/corpus-carolina
language:
  - pt

DeBERTinha XSmall (aka "debertinha-ptbr-xsmall")

NOTE

We have received feedback from users getting poor results on unbalanced datasets. A more robust training setup, such as scaling the loss and adding weight decay (1e-3 to 1e-5), seems to fix it.

Please refer to this notebook to see how performance on unbalanced datasets can be improved.
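As a rough illustration of what "scaling the loss" can mean, the sketch below computes inverse-frequency class weights and a class-weighted cross-entropy in plain Python. It is not taken from the notebook: the helper names `inverse_frequency_weights` and `weighted_cross_entropy`, the toy labels, and the toy logits are all assumptions made for this example.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Hypothetical helper: weight class c by n_samples / (num_classes * count_c)."""
    counts = Counter(labels)
    n = len(labels)
    # Classes that never appear get weight 0.0 to avoid dividing by zero.
    return [n / (num_classes * counts[c]) if counts[c] else 0.0
            for c in range(num_classes)]


def weighted_cross_entropy(logits, labels, class_weights):
    """Hypothetical helper: weighted mean of per-example cross-entropy."""
    total, weight_sum = 0.0, 0.0
    for row, y in zip(logits, labels):
        # Numerically stable log-sum-exp over the logits of one example.
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        nll = log_z - row[y]  # negative log-likelihood of the true class
        total += class_weights[y] * nll
        weight_sum += class_weights[y]
    return total / weight_sum


# Toy data (assumed): 80/20 class imbalance and a model that always favours class 0.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
logits = [[2.0, 0.0] for _ in labels]

weights = inverse_frequency_weights(labels, num_classes=2)
uniform_loss = weighted_cross_entropy(logits, labels, [1.0, 1.0])
scaled_loss = weighted_cross_entropy(logits, labels, weights)

# The scaled loss penalises the ignored minority class more heavily.
print(weights, uniform_loss, scaled_loss)
```

In a PyTorch fine-tuning script, weights like these can be passed as the `weight` argument of `torch.nn.CrossEntropyLoss`, and weight decay can be set through the `weight_decay` argument of `torch.optim.AdamW`, using a value in the range quoted above.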

If you have any problems using the model, please contact us.

Thanks!

Introduction

DeBERTinha is a pretrained DeBERTa model for Brazilian Portuguese.

Available models

Model	Arch.	#Params
`sagui-nlp/debertinha-ptbr-xsmall`	DeBERTa-V3-Xsmall	40M

Usage

from transformers import AutoModel, AutoModelForPreTraining, AutoTokenizer

# Load the pretraining checkpoint and its matching tokenizer from the Hugging Face Hub
model = AutoModelForPreTraining.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
tokenizer = AutoTokenizer.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')

For embeddings

import torch

model = AutoModel.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
input_ids = tokenizer.encode('Tinha uma pedra no meio do caminho.', return_tensors='pt')

with torch.no_grad():
    outs = model(input_ids)
    encoded = outs.last_hidden_state[0, 0]  # Take [CLS] special token representation

Citation

If you use our work, please cite:

@misc{campiotti2023debertinha,
      title={DeBERTinha: A Multistep Approach to Adapt DebertaV3 XSmall for Brazilian Portuguese Natural Language Processing Task}, 
      author={Israel Campiotti and Matheus Rodrigues and Yuri Albuquerque and Rafael Azevedo and Alyson Andrade},
      year={2023},
      eprint={2309.16844},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}