---
language: en
tags:
- text-classification
- tensorflow
- roberta
datasets:
- go_emotions
license: mit
---
Connect with me on LinkedIn
- [linkedin.com/in/arpanghoshal](https://www.linkedin.com/in/arpanghoshal)
## What is GoEmotions
GoEmotions is a dataset of 58,000 Reddit comments labelled with 28 emotion categories:
- admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise + neutral
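The dataset is hosted on the Hugging Face Hub, so it can be loaded directly. A minimal sketch using the `datasets` library (assuming it is installed):

```python
from datasets import load_dataset

# Load the GoEmotions dataset from the Hugging Face Hub
dataset = load_dataset("go_emotions")

# Each example carries the comment text and a list of label ids
sample = dataset["train"][0]
print(sample["text"], sample["labels"])

# Map label ids back to emotion names
label_names = dataset["train"].features["labels"].feature.names
print([label_names[i] for i in sample["labels"]])
```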
## What is RoBERTa
RoBERTa builds on BERT's masked language modelling strategy and modifies key training choices: it removes BERT's next-sentence prediction pretraining objective and trains with much larger mini-batches and learning rates. RoBERTa was also trained on an order of magnitude more data than BERT, for a longer period of time. This allows RoBERTa representations to generalize even better to downstream tasks than BERT's.
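To see the masking objective in action, here is a short sketch using the generic `roberta-base` checkpoint (not this fine-tuned model) with the `fill-mask` pipeline:

```python
from transformers import pipeline

# RoBERTa is pretrained to predict masked tokens; roberta-base uses <mask>
unmasker = pipeline("fill-mask", model="roberta-base")
predictions = unmasker("The movie was absolutely <mask>.")

for p in predictions:
    print(p["token_str"], round(p["score"], 3))
```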
## Hyperparameters
| Parameter | Value |
| ----------------- | :---: |
| Learning rate | 5e-5 |
| Epochs | 10 |
| Max Seq Length | 50 |
| Batch size | 16 |
| Warmup Proportion | 0.1 |
| Epsilon | 1e-8 |
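As a rough sketch, these settings correspond to the following TensorFlow optimizer setup using the `create_optimizer` helper from `transformers`. The step counts below are illustrative placeholders derived from the dataset size and batch size above, not published details of the original training run:

```python
from transformers import create_optimizer

# Illustrative step counts; the exact train/validation split was not published here
num_train_steps = 10 * (58000 // 16)           # epochs * steps_per_epoch
num_warmup_steps = int(0.1 * num_train_steps)  # warmup proportion of 0.1

optimizer, lr_schedule = create_optimizer(
    init_lr=5e-5,
    num_train_steps=num_train_steps,
    num_warmup_steps=num_warmup_steps,
    adam_epsilon=1e-8,
)
```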
## Results
Best `Macro F1` score: 49.30%
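Macro F1 averages the per-class F1 scores equally across all 28 labels, so rare emotions such as grief count as much as common ones. A minimal sketch of the metric with scikit-learn (toy labels, not the actual evaluation data):

```python
from sklearn.metrics import f1_score

# Macro F1 is the unweighted mean of per-class F1 scores
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 1]
print(f1_score(y_true, y_pred, average="macro"))
```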
## Usage
```python
from transformers import RobertaTokenizerFast, TFRobertaForSequenceClassification, pipeline

# Load the fine-tuned tokenizer and TensorFlow model from the Hub
tokenizer = RobertaTokenizerFast.from_pretrained("arpanghoshal/EmoRoBERTa")
model = TFRobertaForSequenceClassification.from_pretrained("arpanghoshal/EmoRoBERTa")

# Reuse the loaded model and tokenizer instead of downloading them a second time
emotion = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

emotion_labels = emotion("Thanks for using it.")
print(emotion_labels)
```
Output
```
[{'label': 'gratitude', 'score': 0.9964383244514465}]
```
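To get scores for all 28 labels rather than only the top one, the text-classification pipeline can be asked to return every class. The exact keyword depends on the installed `transformers` version:

```python
# top_k=None requests all class scores on newer transformers versions;
# on older versions, pass return_all_scores=True instead
all_scores = emotion("Thanks for using it.", top_k=None)
print(all_scores)
```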