ELECTRA
Introduction
ELECTRA is a method for self-supervised language representation learning. It can be used to pre-train transformer networks using relatively little compute. ELECTRA models are trained to distinguish "real" input tokens from "fake" input tokens generated by another neural network, similar to the discriminator of a GAN. At small scale, ELECTRA achieves strong results even when trained on a single GPU. At large scale, ELECTRA achieves state-of-the-art results on the SQuAD 2.0 dataset.
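To make the "real" vs "fake" objective concrete, here is a minimal sketch of how training labels for the discriminator could be built. It is an illustration under stated assumptions, not the training code used for this model: the function name `corrupt_tokens`, the toy vocabulary, and the 15% replacement rate are all hypothetical, and a uniform random choice stands in for the small generator network that ELECTRA actually uses to propose replacements.

```python
import random


def corrupt_tokens(tokens, vocab, replace_prob=0.15, seed=0):
    """Build a toy replaced-token-detection example.

    Returns (corrupted_tokens, labels), where labels[i] is 1 if the token
    at position i differs from the original ("fake") and 0 otherwise ("real").
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < replace_prob:
            # Assumption: a random vocabulary item stands in for a sample
            # drawn from the generator network.
            new_tok = rng.choice(vocab)
        else:
            new_tok = tok
        corrupted.append(new_tok)
        # A proposed replacement that happens to equal the original token
        # still counts as "real".
        labels.append(int(new_tok != tok))
    return corrupted, labels


if __name__ == "__main__":
    sentence = ["the", "chef", "cooked", "the", "meal"]
    toy_vocab = ["the", "chef", "cooked", "ate", "meal", "a"]
    corrupted, labels = corrupt_tokens(sentence, toy_vocab, replace_prob=0.3)
    for original, seen, label in zip(sentence, corrupted, labels):
        print(f"{original:>8} -> {seen:<8} label={label}")
```

The discriminator is then trained to predict `labels` from `corrupted`, so every input position contributes to the loss rather than only the masked positions.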
Electra-base-vn was trained on more than 148 GB of text with a maximum sequence length of 512.
You can download the TensorFlow version at Electra base TF version.
Contact information
For personal communication related to this project, please contact Nha Nguyen Van ([email protected]).