DeBERTinha XSmall (aka "debertinha-ptbr-xsmall")

NOTE

We have received feedback from people getting poor results on unbalanced datasets. A more robust training script, for example one that scales the loss and adds weight decay (1e-3 to 1e-5), seems to fix it.

Please refer to this notebook to check how performance on unbalanced datasets can be improved.
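
As a rough illustration of the two adjustments mentioned above, the sketch below scales the loss with per-class weights and adds weight decay through the optimizer. It is not the notebook's code. The inverse-frequency weighting, the helper name `inverse_frequency_weights`, the toy labels, the learning rate, and the linear layer standing in for a classification head are all assumptions made for the example.

```python
from collections import Counter

import torch
from torch import nn


def inverse_frequency_weights(labels, num_classes):
    """Weight each class by n_samples / (num_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    # Classes absent from `labels` get weight 0.0 to avoid division by zero.
    return torch.tensor(
        [total / (num_classes * counts[c]) if counts[c] else 0.0
         for c in range(num_classes)],
        dtype=torch.float,
    )


# Toy, made-up labels: 8 examples of class 0 and 2 examples of class 1.
labels = [0] * 8 + [1] * 2
class_weights = inverse_frequency_weights(labels, num_classes=2)

# Hypothetical stand-in for a classification head on top of the encoder.
head = nn.Linear(384, 2)

# Loss scaled per class, so mistakes on the rare class cost more.
loss_fn = nn.CrossEntropyLoss(weight=class_weights)
# Weight decay chosen from the 1e-3 to 1e-5 range quoted in the note.
optimizer = torch.optim.AdamW(head.parameters(), lr=5e-5, weight_decay=1e-3)

# One training step on random placeholder features.
features = torch.randn(len(labels), 384)
loss = loss_fn(head(features), torch.tensor(labels))
loss.backward()
optimizer.step()
```

In a real fine-tuning run the placeholder features would be replaced by the encoder output, and the weights would be computed from the training split only.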

If you have any problems using the model, please contact us.

Thanks!

Introduction

DeBERTinha is a pretrained DeBERTa model for Brazilian Portuguese.

Available models

| Model | Arch. | #Params |
|---|---|---|
| sagui-nlp/debertinha-ptbr-xsmall | DeBERTa-V3-Xsmall | 40M |

Usage

from transformers import AutoTokenizer
from transformers import AutoModelForPreTraining
from transformers import AutoModel

model = AutoModelForPreTraining.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
tokenizer = AutoTokenizer.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')

For embeddings

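import torch
from transformers import AutoModel, AutoTokenizer

# Load the tokenizer and the bare encoder (no pretraining head)
tokenizer = AutoTokenizer.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
model = AutoModel.from_pretrained('sagui-nlp/debertinha-ptbr-xsmall')
input_ids = tokenizer.encode('Tinha uma pedra no meio do caminho.', return_tensors='pt')

# Inference only, so gradients are not needed
with torch.no_grad():
    outs = model(input_ids)
    encoded = outs.last_hidden_state[0, 0]  # [CLS] token representation of the first (only) sequence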
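
The [CLS] vector is one way to get a sentence embedding. Another common option, which does not come from this model card, is to average the token vectors while ignoring padding. The sketch below shows that pooling step on dummy tensors; `mean_pool` is a hypothetical helper, and the tensor shapes are made up for the example.

```python
import torch


def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors per sequence, ignoring padded positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1.0)                          # avoid division by zero
    return summed / counts


# Dummy tensors standing in for model output: batch of 2, 3 tokens, hidden size 4.
hidden = torch.arange(24, dtype=torch.float).reshape(2, 3, 4)
attention_mask = torch.tensor([[1, 1, 1], [1, 1, 0]])  # second sequence has one padding token
pooled = mean_pool(hidden, attention_mask)
```

With the real model, `hidden` would be `outs.last_hidden_state` and `attention_mask` would come from calling the tokenizer on a batch with `padding=True` and `return_tensors='pt'`. Whether mean pooling or [CLS] works better depends on the downstream task.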

Citation

If you use our work, please cite:

@misc{campiotti2023debertinha,
      title={DeBERTinha: A Multistep Approach to Adapt DebertaV3 XSmall for Brazilian Portuguese Natural Language Processing Task}, 
      author={Israel Campiotti and Matheus Rodrigues and Yuri Albuquerque and Rafael Azevedo and Alyson Andrade},
      year={2023},
      eprint={2309.16844},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}