---
license: mit
datasets:
- cifar10
language:
- en
pipeline_tag: image-classification
---

# BERT Based Image Classifier

This model takes images from the CIFAR10 dataset, converts them into patch embeddings, and passes them, together with positional information and a class token, to a transformer. The first representation of the last hidden state is then used as the input to the MLP head, which acts as the classifier.

The full architecture is shown below, including the dimensions and the operations that occur at each stage. The BERT model consists of multiple hidden layers (encoder blocks).

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6319030647a84df2a5dd106c/wCHOhmRD0URoTWKEhUvcJ.png)

## Model Details

### Model Description

This model is intended to give a better understanding of how a transformer can be used instead of convolutions or RNNs to classify images, by learning a useful representation comparable to the feature maps produced by CNN convolutions.

- **Developed by:** Michael Peres
- **Model type:** BERT + MLP Classifier Head
- **Language(s) (NLP):** English

### Model Sources

- **Paper:** https://arxiv.org/abs/1810.04805

## Uses

Classifying images based on the CIFAR10 dataset. The model achieved an accuracy of 80%.

## How to Get Started with the Model

Run the model defined in the Python script file.

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** NVIDIA A100 80GB PCIe
- **Hours used:** 0.5

## Model Card Contact

- michaelperes1@gmail.com
- ec20433@qmul.ac.uk
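
## Architecture Sketch

The opening description of this card says that images are cut into patches and that a class token is placed at the front of the sequence fed to the transformer. The sketch below illustrates only that bookkeeping in plain Python. It is not the author's script: the patch size of 4 is an assumed value, and `patchify` and `build_sequence` are hypothetical helper names. The linear projection to the hidden size, the positional embeddings, the encoder blocks, and the MLP head are all omitted, so the class token here is just a placeholder vector in raw patch space.

```python
# Sketch only: PATCH_SIZE is an assumption, not a value taken from this model card.
IMAGE_SIZE = 32   # CIFAR10 images are 32x32 pixels
CHANNELS = 3      # RGB
PATCH_SIZE = 4    # hypothetical; the actual script may use a different value


def patchify(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the patch row by row, pixel by pixel, channel by channel.
            patch = [
                value
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for value in pixel
            ]
            patches.append(patch)
    return patches


def build_sequence(patches, class_token):
    """Prepend the class token so that it occupies position 0 of the sequence."""
    return [class_token] + patches


# Dummy image in which every pixel is [row_index, column_index, 0].
image = [[[r, c, 0] for c in range(IMAGE_SIZE)] for r in range(IMAGE_SIZE)]

patches = patchify(image, PATCH_SIZE)
patch_dim = PATCH_SIZE * PATCH_SIZE * CHANNELS
class_token = [0.0] * patch_dim  # placeholder for a learned vector
sequence = build_sequence(patches, class_token)

print(len(patches), patch_dim, len(sequence))
```

Under these assumptions a 32x32 image yields 64 patches of 48 values each, and the sequence has 65 entries once the class token is added. Position 0 is the one whose final hidden state would be passed to the classifier head.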