---
datasets:
- imdb
- cornell_movie_dialogue
- polarity_movie_data
- 25mlens_movie_data
language:
- en
thumbnail:
tags:
- roberta
- roberta-base
- masked-language-modeling
- masked-lm
license: cc-by-4.0
---
# roberta-base for MLM
Objective: Adapt RoBERTa-base to the movie domain by continuing masked language modeling (MLM) pretraining on several movie datasets treated as plain text.
This Movie RoBERTa is intended for downstream movie-domain applications.
```
from transformers import pipeline

model_name = "thatdramebaazguy/movie-roberta-base"
fill_mask = pipeline(task="fill-mask", model=model_name, tokenizer=model_name, revision="v1.0")
```
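For example, a quick check with an illustrative masked sentence (the sentence itself is not from the training data):
```
predictions = fill_mask("The <mask> was directed by Christopher Nolan.")
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```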
## Overview
**Language model:** roberta-base
**Language:** English
**Downstream-task:** Fill-Mask
**Training data:** imdb, polarity movie data, cornell_movie_dialogue, 25mlens movie names
**Eval data:** imdb, polarity movie data, cornell_movie_dialogue, 25mlens movie names
**Infrastructure:** 4x Tesla V100
**Code:** See [example](https://github.com/adityaarunsinghal/Domain-Adaptation/blob/master/scripts/shell_scripts/train_movie_roberta.sh)
## Hyperparameters
```
Num examples = 4767233
Num epochs = 2
Instantaneous batch size per device = 20
Total train batch size (w. parallel, distributed & accumulation) = 80
Gradient accumulation steps = 1
Total optimization steps = 119182
learning_rate = 5e-05
n_gpu = 4
eval_loss = 1.6153
eval_samples = 20573
perplexity = 5.0296
```
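The linked shell script drives these settings; a rough Trainer-based equivalent is sketched below (dataset mixing is simplified to IMDB only and the output path is a placeholder, so this is not the exact original run):
```
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# IMDB as plain text; the original run also mixed in the other movie corpora listed above.
raw = load_dataset("imdb", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

# Standard 15% dynamic masking for MLM.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="movie-roberta-base",   # placeholder path
    num_train_epochs=2,
    per_device_train_batch_size=20,
    learning_rate=5e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```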
## Performance
perplexity = 5.0296 on the evaluation data listed above
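This is consistent with the eval_loss above, since perplexity here is simply the exponential of the evaluation cross-entropy loss:
```
import math
print(math.exp(1.6153))  # ≈ 5.0296
```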
Some of my work:
- [Domain-Adaptation Project](https://github.com/adityaarunsinghal/Domain-Adaptation/)
---