movie-roberta-base / README.md
metadata
datasets:
  - imdb
  - cornell_movie_dialogue
language:
  - English
thumbnail: null
tags:
  - roberta
  - roberta-base
  - masked-language-modeling
  - masked-lm
license: cc-by-4.0

roberta-base for MLM

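from transformers import pipeline
model_name = "thatdramebaazguy/movie-roberta-base"
# Task identifiers are lowercase in transformers ("fill-mask", not "Fill-Mask").
fill_mask = pipeline(task="fill-mask", model=model_name, tokenizer=model_name, revision="v1.0")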
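
Once loaded, a fill-mask pipeline is called on text that contains the tokenizer's mask token and returns a ranked list of candidate fillers. The following is a minimal sketch, not the author's own evaluation code: the helper names `top_predictions` and `demo` and the example sentence are illustrative assumptions, and `demo` needs the `transformers` package plus network access to download the model.

```python
def top_predictions(fill_mask, text, k=3):
    """Return (token, score) pairs for the k best fillers of a single mask in `text`."""
    results = fill_mask(text, top_k=k)
    return [(r["token_str"].strip(), r["score"]) for r in results]


def demo():
    """Download the model and print predictions (hypothetical helper)."""
    from transformers import pipeline

    model_name = "thatdramebaazguy/movie-roberta-base"
    fill_mask = pipeline(
        task="fill-mask", model=model_name, tokenizer=model_name, revision="v1.0"
    )
    # Use the tokenizer's own mask token instead of hard-coding "<mask>".
    text = f"The movie was directed by a {fill_mask.tokenizer.mask_token} filmmaker."
    for token, score in top_predictions(fill_mask, text):
        print(f"{token}\t{score:.3f}")
```

Calling `demo()` prints one candidate token per line with its score. Nothing is downloaded until `demo()` is called.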

Overview

Language model: roberta-base
Language: English
Downstream-task: Fill-Mask
Training data: imdb, polarity movie data, cornell_movie_dialogue, 25mlens movie names
Eval data: imdb, polarity movie data, cornell_movie_dialogue, 25mlens movie names
Infrastructure: 4x Tesla V100
Code: See example

Hyperparameters

Num examples = 4767233
Num Epochs = 2
Instantaneous batch size per device = 20
Total train batch size (w. parallel, distributed & accumulation) = 80
Gradient Accumulation steps = 1
Total optimization steps = 119182
eval_loss = 1.6153
eval_samples = 20573
perplexity = 5.0296
learning_rate = 5e-05
n_gpu = 4
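
The batch-size and step counts above are mutually consistent, which a little arithmetic confirms. The sketch below assumes the usual convention that one optimization step consumes per-device batch × number of GPUs × accumulation steps examples, and that the final partial batch of each epoch still counts as a step. The variable names are illustrative, not taken from the training script.

```python
import math

num_examples = 4767233
num_epochs = 2
per_device_batch = 20
n_gpu = 4
grad_accum_steps = 1

# Examples consumed per optimization step across all devices.
total_batch = per_device_batch * n_gpu * grad_accum_steps

# The last, partially filled batch of an epoch still takes one step.
steps_per_epoch = math.ceil(num_examples / total_batch)
total_steps = steps_per_epoch * num_epochs

print(total_batch)  # 80
print(total_steps)  # 119182
```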

Performance

perplexity = 5.0296
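
Perplexity for a masked language model is conventionally reported as the exponential of the mean evaluation loss (cross-entropy in nats). Assuming that convention applies here, the figure can be recovered from the eval_loss listed under Hyperparameters:

```python
import math

eval_loss = 1.6153  # rounded value from the evaluation log

# Perplexity is exp(mean cross-entropy loss).
perplexity = math.exp(eval_loss)

# About 5.029; agrees with the reported 5.0296 up to the rounding of eval_loss.
print(f"{perplexity:.3f}")
```

The small mismatch in the last digit arises because eval_loss is only given to four decimal places.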

Some of my work: