---
license: mit
library_name: pytorch
tags:
- Medical Vision-Language Pre-Training
- BenchX
---

# MGCA-ResNet50 Checkpoint Model Card

A retrained MGCA-ResNet50 model for benchmarking medical vision-language pre-training methods within the BenchX framework.

## Model Details
- **Model Type**: MGCA-ResNet50
- **Architecture**: ResNet-50 image encoder and BioClinicalBERT text encoder
- **Original Paper**: [Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning](https://arxiv.org/abs/2210.06044)
- **Benchmark Paper**: [BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays](https://arxiv.org/abs/2410.21969)
- **Benchmark Framework**: https://github.com/yangzhou12/BenchX

## Intended Use
- **Primary Use Cases**:
  - Benchmarking performance for Medical Image Classification
  - Benchmarking performance for Medical Image Segmentation
  - Benchmarking performance for Medical Report Generation

## Pre-Training Data
- **Dataset**:
  - Data source(s): MIMIC-CXR
  - Types of medical images: Frontal chest X-rays
  - Text data type: Associated radiology reports

## Prerequisites
Please follow the [installation instructions](https://github.com/yangzhou12/BenchX/blob/release/README.md#installation) to install BenchX.

## Training & Evaluation

### 1. Classification
To fine-tune MGCA-ResNet50 for classification, run this command:
```
python bin/train.py config/classification//mgca_resnet50.yml
```

### 2. Segmentation
To fine-tune MGCA-ResNet50 for segmentation, run this command:
```
python mmsegmentation/tools/train.py config/benchmark//mgca_resnet50.yml
```

### 3. Report Generation
To fine-tune MGCA-ResNet50 for report generation, run this command:
```
python bin/train.py config/report_generation//mgca_resnet50.yml
```

### 4. Evaluation
To evaluate fine-tuned MGCA-ResNet50 models, run:
```
# For classification and report generation
python bin/test.py config///mgca_resnet50.yml validator.splits=[test] ckpt_dir=

# For segmentation
python mmsegmentation/tools/my_test.py mmsegmentation/config//mgca_resnet50.yml
```

## Citations
```bibtex
@article{wang2022multi,
  title={Multi-granularity cross-modal alignment for generalized medical visual representation learning},
  author={Wang, Fuying and Zhou, Yuyin and Wang, Shujun and Vardhanabhuti, Varut and Yu, Lequan},
  journal={Advances in NeurIPS},
  volume={35},
  pages={33536--33549},
  year={2022}
}
```
```bibtex
@inproceedings{zhou2024benchx,
  title={BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays},
  author={Yang Zhou and Tan Li Hui Faith and Yanyu Xu and Sicong Leng and Xinxing Xu and Yong Liu and Rick Siow Mong Goh},
  booktitle={Proceedings of NeurIPS},
  year={2024}
}
```