Ozan Oktay committed 896d61e (parent: 6fcad0b)

Update Readme -- Add information about BioViL Resnet50.

README.md CHANGED
````diff
@@ -27,16 +27,22 @@ First, we pretrain [**CXR-BERT-general**](https://huggingface.co/microsoft/Biome
 | CXR-BERT-general | [microsoft/BiomedVLP-CXR-BERT-general](https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-general) | PubMed & MIMIC | Pretrained for biomedical literature and clinical domains |
 | CXR-BERT-specialized (after multi-modal training) | [microsoft/BiomedVLP-CXR-BERT-specialized](https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-specialized) | PubMed & MIMIC | Pretrained for chest X-ray domain |
 
+## Image model
+
+**CXR-BERT-specialized** is jointly trained with a ResNet-50 image model in a multi-modal contrastive learning framework. Prior to multi-modal learning, the image model is pretrained on the same set of images in MIMIC-CXR using [SimCLR](https://arxiv.org/abs/2002.05709). The corresponding model definition and its loading functions can be accessed through our [HI-ML-Multimodal](https://github.com/microsoft/hi-ml/blob/main/hi-ml-multimodal/src/health_multimodal/image/model/model.py) GitHub repository. The joint image and text model, namely [BioViL](https://arxiv.org/abs/2204.09817), can be used in phrase-grounding applications, as shown in this Python notebook [example](https://mybinder.org/v2/gh/microsoft/hi-ml/HEAD?labpath=hi-ml-multimodal%2Fnotebooks%2Fphrase_grounding.ipynb). Additionally, please check the [MS-CXR benchmark](https://physionet.org/content/ms-cxr/0.1/) for a more systematic evaluation of joint image and text models on phrase-grounding tasks.
+
 ## Citation
 
+The corresponding manuscript has been accepted for presentation at the [**European Conference on Computer Vision (ECCV) 2022**](https://eccv2022.ecva.net/).
+
+```bibtex
 @misc{https://doi.org/10.48550/arxiv.2204.09817,
+    doi = {10.48550/ARXIV.2204.09817},
+    url = {https://arxiv.org/abs/2204.09817},
     author = {Boecking, Benedikt and Usuyama, Naoto and Bannur, Shruthi and Castro, Daniel C. and Schwaighofer, Anton and Hyland, Stephanie and Wetscherek, Maria and Naumann, Tristan and Nori, Aditya and Alvarez-Valle, Javier and Poon, Hoifung and Oktay, Ozan},
+    title = {Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing},
     publisher = {arXiv},
     year = {2022},
-    url = {https://arxiv.org/abs/2204.09817},
-    doi = {10.48550/ARXIV.2204.09817},
 }
 ```
@@ -127,9 +133,6 @@ This model was developed using English corpora, and thus can be considered Engli
 
 ## Further information
 
-Please refer to the corresponding paper, [Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing](https://arxiv.org/abs/2204.09817) for additional details on the model training and evaluation.
+Please refer to the corresponding paper, ["Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing", ECCV'22](https://arxiv.org/abs/2204.09817), for additional details on model training and evaluation.
 
-For additional inference pipelines with CXR-BERT, please refer to the [HI-ML-Multimodal GitHub](https://hi-ml.readthedocs.io/en/latest/multimodal.html) repository. The associated source files will soon be accessible through this link.
+For additional inference pipelines with CXR-BERT, please refer to the [HI-ML-Multimodal GitHub](https://hi-ml.readthedocs.io/en/latest/multimodal.html) repository.
````