---
license: apache-2.0
datasets:
- bharat-raghunathan/indian-foods-dataset
metrics:
- accuracy
- precision
- recall
---

# Indian Food Classification with Vision Transformer (ViT)

## Overview

This model is a fine-tuned Vision Transformer (ViT) for classifying images of Indian foods. It was trained on the [Indian Foods Dataset](https://huggingface.co/datasets/bharat-raghunathan/indian-foods-dataset) from Hugging Face Datasets.

## Dataset

The Indian Foods Dataset contains 4,770 images across 15 classes of popular Indian dishes. The dataset is split into:

- Training: 3,047 images
- Validation: 762 images
- Testing: 961 images

## Model

The base model is the Vision Transformer `google/vit-base-patch16-224-in21k`. It was fine-tuned on the Indian Foods Dataset for 10 epochs using the AdamW optimizer with a learning rate of 2e-4.

## Evaluation

The model was evaluated on the test set and achieved the following metrics:

- Accuracy: 0.9667
- Precision: 0.9670
- Recall: 0.9667

## Usage

You can use this pre-trained model directly from Hugging Face
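
The snippet below is a minimal sketch of one way to run inference with the `transformers` library, not necessarily the loading code the author intended. This card does not state the model's repository id, so `MODEL_ID` is a hypothetical placeholder that must be replaced. The helper names `top_k` and `classify` are also illustrative, and the sketch assumes `torch`, `transformers`, and `Pillow` are installed.

```python
import math

# Hypothetical placeholder: this card does not state the repository id.
MODEL_ID = "your-username/your-indian-food-vit"


def top_k(logits, id2label, k=3):
    """Turn raw logits into softmax probabilities and return the k best (label, prob) pairs."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


def classify(image_path, model_id=MODEL_ID, k=3):
    """Load the fine-tuned ViT from the Hub and classify a single image file."""
    # Imported lazily so top_k() stays usable without these packages.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModelForImageClassification.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # id2label maps class indices to the dish names stored in the model config.
    return top_k(logits, model.config.id2label, k)
```

Calling `classify("path/to/dish.jpg")` (the path is an example) should return up to three `(label, probability)` pairs, most likely first, provided the model config carries the 15 class names.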