---
language: "en"
tags:
- buy-intent
- sell-intent
- consumer-intent
widget:
- text: "Flutoprazepam (Restas) is a drug which is a benzodiazepine. It was patented in Japan by Sumitomo."
---
# Chemical vs Pharmaceutical Domain Document Classifier
A chemical-domain language model fine-tuned to classify text as chemical or pharmaceutical. The training data consists of 13K chemical and 14K pharmaceutical Wikipedia articles, broken into paragraphs.
| Train Loss | Validation Acc. | Test Acc.|
| ------------- |:-------------: | -----: |
| 0.17 | 0.928 | 0.927 |
# Dataset
The dataset, including its splits, is available on Kaggle: [https://www.kaggle.com/shahrukhkhan/pharma-vs-chemicals-domain-classification](https://www.kaggle.com/shahrukhkhan/pharma-vs-chemicals-domain-classification)
# Label Mappings
LABEL_0 => **"PHARMACEUTICAL"** <br/>
LABEL_1 => **"CHEMICAL"**
## Usage in Transformers
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("recobo/chemical-bert-uncased-pharmaceutical-chemical-classifier")
model = AutoModelForSequenceClassification.from_pretrained("recobo/chemical-bert-uncased-pharmaceutical-chemical-classifier")
```
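
The snippet above only loads the tokenizer and model. As a minimal sketch of how a prediction could be turned into one of the labels listed under Label Mappings, the example below runs a single paragraph through the model and picks the highest-scoring class. It assumes PyTorch is installed as the backend. The helper names `softmax`, `label_from_logits`, and `classify` are hypothetical and not part of the model or the Transformers library, and the `ID2LABEL` dictionary is written out by hand from this card rather than read from the model config.

```python
import math

# Label mapping copied by hand from the "Label Mappings" section of this card.
ID2LABEL = {0: "PHARMACEUTICAL", 1: "CHEMICAL"}

MODEL_NAME = "recobo/chemical-bert-uncased-pharmaceutical-chemical-classifier"


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_from_logits(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


def classify(text):
    """Hypothetical helper: load the model and classify one paragraph."""
    # Imported lazily so the pure-Python helpers above work without torch.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
    model.eval()  # disable dropout for inference

    # Truncate long paragraphs to the model's maximum input length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return label_from_logits(logits)
```

Calling `classify("Flutoprazepam (Restas) is a drug which is a benzodiazepine.")` returns a `(label, probability)` tuple. Reloading the model on every call is wasteful for batch use, so in practice the tokenizer and model would likely be loaded once and reused.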