---
title: README
emoji: ๐Ÿƒ
colorFrom: gray
colorTo: purple
sdk: static
pinned: false
license: mit
---

# Model Description
BioDistilBERT-uncased is obtained by further training the [DistilBERT-uncased](https://huggingface.co/distilbert-base-uncased?text=The+goal+of+life+is+%5BMASK%5D.) model on the PubMed dataset in a continual learning fashion, for 200k training steps with a total batch size of 192.


# Initialisation
We initialise the model from the pre-trained [DistilBERT-uncased](https://huggingface.co/distilbert-base-uncased?text=The+goal+of+life+is+%5BMASK%5D.) checkpoint available on Hugging Face.
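
As a minimal sketch of what this initialisation could look like, the helper below loads the base checkpoint with a masked-language-modelling head using the `transformers` library. The helper name `load_base_checkpoint` is hypothetical, and the choice of `AutoModelForMaskedLM` is an assumption about the training objective; the card itself does not show the training code.

```python
# Model ID taken from the DistilBERT-uncased link above.
BASE_CHECKPOINT = "distilbert-base-uncased"


def load_base_checkpoint(model_id: str = BASE_CHECKPOINT):
    """Return (tokenizer, model) initialised from the pre-trained checkpoint.

    Hypothetical helper for illustration; not the authors' training script.
    """
    # Imported lazily so that defining the helper does not require the library.
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Assumption: continued pre-training uses a masked-language-modelling head.
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    return tokenizer, model
```

Calling `load_base_checkpoint()` downloads the weights on first use, so it needs network access and an installed `transformers` package with a PyTorch backend.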

# Architecture
The hidden dimension and the embedding size are both 768, and the vocabulary size is 30522. The model has 6 transformer layers, and the feed-forward expansion rate is 4 (an intermediate size of 3072). Overall, the model has around 65 million parameters.
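
The parameter figure can be roughly cross-checked from the numbers above. The sketch below is a back-of-the-envelope count, not an official one: it assumes 512 learned position embeddings, bias terms on every linear layer, and two layer norms per transformer block (none of which are stated in this card), and it excludes any language-modelling head.

```python
VOCAB_SIZE = 30522     # from the card
HIDDEN = 768           # from the card
NUM_LAYERS = 6         # from the card
FFN_EXPANSION = 4      # from the card
MAX_POSITIONS = 512    # assumption: not stated in the card


def linear(n_in: int, n_out: int) -> int:
    """Weights plus biases of a dense layer."""
    return n_in * n_out + n_out


def layer_norm(dim: int) -> int:
    """Scale and shift vectors of a layer norm."""
    return 2 * dim


def count_parameters() -> int:
    # Token and position embeddings, followed by one layer norm.
    embeddings = (VOCAB_SIZE + MAX_POSITIONS) * HIDDEN + layer_norm(HIDDEN)

    # Self-attention: query, key, value and output projections.
    attention = 4 * linear(HIDDEN, HIDDEN)
    # Feed-forward block: expand by FFN_EXPANSION, then project back.
    ffn_dim = FFN_EXPANSION * HIDDEN
    feed_forward = linear(HIDDEN, ffn_dim) + linear(ffn_dim, HIDDEN)
    # One layer norm after attention and one after the feed-forward block.
    block = attention + feed_forward + 2 * layer_norm(HIDDEN)

    return embeddings + NUM_LAYERS * block


total = count_parameters()
print(f"{total:,}")  # 66,362,880
```

Under these assumptions the estimate lands within a couple of million of the "around 65 million" figure, with the token embeddings alone accounting for roughly 23 million parameters.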

# Citation
If you use this model, please consider citing the following paper:

```bibtex
@article{rohanian2023effectiveness,
  title={On the effectiveness of compact biomedical transformers},
  author={Rohanian, Omid and Nouriborji, Mohammadmahdi and Kouchaki, Samaneh and Clifton, David A},
  journal={Bioinformatics},
  volume={39},
  number={3},
  pages={btad103},
  year={2023},
  publisher={Oxford University Press}
}
```