---
language: "en"
tags:
- bert
- medical
- clinical
- assertion
- negation
- text-classification
widget:
- text: "Patient denies [entity] SOB [entity]."

---

# Clinical Assertion / Negation Classification BERT

## Model description

The Clinical Assertion and Negation Classification BERT was introduced in the paper [Assertion Detection in Clinical Notes: Medical Language Models to the Rescue?](https://aclanthology.org/2021.nlpmc-1.5/). The model helps structure information in clinical patient letters by classifying each medical condition mentioned in a letter as PRESENT, ABSENT or POSSIBLE.

The model is based on the [ClinicalBERT - Bio + Discharge Summary BERT Model](https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT) by Alsentzer et al. and fine-tuned on assertion data from the [2010 i2b2 challenge](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168320/).


### How to use the model

You can load the model via the transformers library:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Download the tokenizer and the fine-tuned classification model from the Hub
tokenizer = AutoTokenizer.from_pretrained("bvanaken/clinical-assertion-negation-bert")
model = AutoModelForSequenceClassification.from_pretrained("bvanaken/clinical-assertion-negation-bert")
```

The model expects a span or sentence containing exactly one marked entity, which it classifies as `PRESENT(0)`, `ABSENT(1)` or `POSSIBLE(2)`. Mark the entity by placing the special token `[entity]` directly before and after it.
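
If your entities come as character offsets (for example from an upstream NER step), the marked input can be built with a small helper. This is only a sketch: `mark_entity` and its parameters are hypothetical names that are not part of the model or the transformers library, and the spacing around the marker simply follows the examples in this card.

```python
def mark_entity(text: str, start: int, end: int, marker: str = "[entity]") -> str:
    """Wrap the character span text[start:end] in the marker token.

    Hypothetical helper, not shipped with the model. It assumes the
    offsets delimit exactly one entity mention.
    """
    if not 0 <= start < end <= len(text):
        raise ValueError("start/end must delimit a non-empty span inside text")
    # Surround the entity with the marker, separated by single spaces
    return f"{text[:start]}{marker} {text[start:end]} {marker}{text[end:]}"


sentence = "Patient denies SOB."
start = sentence.index("SOB")
marked = mark_entity(sentence, start, start + len("SOB"))
print(marked)  # Patient denies [entity] SOB [entity].
```

Call the helper once per entity, so that every input passed to the model contains a single marked mention.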

Example input and inference:
```python
import numpy as np

# The entity to classify is wrapped in [entity] markers
text = "The patient recovered during the night and now denies any [entity] shortness of breath [entity]."

tokenized_input = tokenizer(text, return_tensors="pt")
output = model(**tokenized_input)

# Index of the highest logit: 0 == PRESENT, 1 == ABSENT, 2 == POSSIBLE
predicted_label = np.argmax(output.logits.detach().numpy())  # 1 == ABSENT
```
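
To turn the raw logits into a readable label with class probabilities, a softmax and a lookup are enough. The sketch below uses only the standard library; `LABELS` and `decode_prediction` are hypothetical names, and the index order is taken from the mapping stated above (`PRESENT(0)`, `ABSENT(1)`, `POSSIBLE(2)`) rather than read from the model configuration. The logits in the example are made up for illustration.

```python
import math

# Index-to-label mapping as documented in this card (assumed, not read from the model config)
LABELS = ("PRESENT", "ABSENT", "POSSIBLE")


def decode_prediction(logits):
    """Return (label, probabilities) for one example's list of three logits."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    # Numerically stable softmax: subtract the maximum before exponentiating
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


# Made-up logits for illustration; with the model loaded you could pass
# output.logits[0].tolist() instead.
label, probs = decode_prediction([-1.0, 3.0, 0.5])
print(label)  # ABSENT
```

The probabilities are useful when you want to apply a confidence threshold instead of always taking the top class.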

### Cite

When working with the model, please cite our paper as follows:

```bibtex
@inproceedings{van-aken-2021-assertion,
    title = "Assertion Detection in Clinical Notes: Medical Language Models to the Rescue?",
    author = "van Aken, Betty  and
      Trajanovska, Ivana  and
      Siu, Amy  and
      Mayrdorfer, Manuel  and
      Budde, Klemens  and
      Loeser, Alexander",
    booktitle = "Proceedings of the Second Workshop on Natural Language Processing for Medical Conversations",
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.nlpmc-1.5",
    doi = "10.18653/v1/2021.nlpmc-1.5"
}
```