---
license: apache-2.0
datasets:
- bharat-raghunathan/indian-foods-dataset
metrics:
- accuracy
- precision
- recall
---

# Indian Food Classification with Vision Transformer (ViT)

## Overview

This model is a Vision Transformer (ViT) fine-tuned to classify images of Indian foods. It was trained on the [Indian Foods Dataset](https://huggingface.co/datasets/bharat-raghunathan/indian-foods-dataset) from Hugging Face Datasets.

## Dataset

The Indian Foods Dataset contains 4,770 images across 15 classes of popular Indian dishes, split as follows:

- Training: 3,047 images
- Validation: 762 images
- Testing: 961 images
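
As a quick sanity check, the split sizes above can be summed and converted to proportions. The short sketch below uses only the counts listed in this card; the variable names are illustrative. It shows that the three splits add up to the stated 4,770 images and correspond to roughly 64% / 16% / 20% of the data.

```python
# Split sizes as listed in this model card.
splits = {"train": 3047, "validation": 762, "test": 961}

total = sum(splits.values())
fractions = {name: count / total for name, count in splits.items()}

for name, count in splits.items():
    # Print each split with its share of the full dataset.
    print(f"{name:<10} {count:>5}  {fractions[name]:.1%}")
print(f"{'total':<10} {total:>5}")
```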

## Model

The base model is `google/vit-base-patch16-224-in21k`, a ViT-Base checkpoint with 16×16 patches and 224×224 input resolution, pre-trained on ImageNet-21k. It was fine-tuned on the Indian Foods Dataset for 10 epochs using the AdamW optimizer with a learning rate of 2e-4.
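
The setup described above could be expressed with the `transformers` library roughly as follows. This is a minimal sketch, not the original training script: the checkpoint name, class count, epoch count, learning rate, and optimizer come from this card, while `FineTuneConfig`, `build_model_and_args`, and the `output_dir` value are hypothetical names introduced for illustration. Batch size, scheduler, and augmentation are not stated in the card and are left at library defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    base_model: str = "google/vit-base-patch16-224-in21k"  # from the card
    num_labels: int = 15                   # number of classes, from the card
    num_train_epochs: int = 10             # from the card
    learning_rate: float = 2e-4            # from the card
    output_dir: str = "vit-indian-foods"   # hypothetical, not from the card


def build_model_and_args(cfg: FineTuneConfig = FineTuneConfig()):
    """Create the model and training arguments for fine-tuning (sketch)."""
    # Imported lazily so the config can be inspected without transformers installed.
    from transformers import TrainingArguments, ViTForImageClassification

    # Load the pre-trained backbone with a fresh 15-way classification head.
    model = ViTForImageClassification.from_pretrained(
        cfg.base_model, num_labels=cfg.num_labels
    )
    args = TrainingArguments(
        output_dir=cfg.output_dir,
        num_train_epochs=cfg.num_train_epochs,
        learning_rate=cfg.learning_rate,
        optim="adamw_torch",  # AdamW, as stated in the card
    )
    return model, args
```

The returned objects would then be passed to a `Trainer` together with the preprocessed training and validation splits.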

## Evaluation

Evaluated on the test set, the model achieved the following metrics:

- Accuracy: 0.9667
- Precision: 0.9670
- Recall: 0.9667
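
The card does not say how precision and recall were averaged across the 15 classes. One observation: support-weighted recall is always equal to accuracy, so the matching values above are consistent with weighted averaging, though that is an assumption rather than something the card confirms. The sketch below computes all three metrics in plain Python on toy integer labels (not real model outputs) to illustrate the relationship; `weighted_metrics` is a hypothetical helper name.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall)."""
    n = len(y_true)
    support = Counter(y_true)    # number of true examples per class
    predicted = Counter(y_pred)  # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    # Per-class precision, weighted by each class's share of the true labels.
    precision = sum(
        support[c] / n * (correct[c] / predicted[c] if predicted[c] else 0.0)
        for c in support
    )
    # Per-class recall with the same weights; this reduces to accuracy.
    recall = sum(support[c] / n * (correct[c] / support[c]) for c in support)
    return accuracy, precision, recall


# Toy labels for three classes, for illustration only.
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 1, 2]
acc, prec, rec = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```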

## Usage

You can use this pre-trained model directly from Hugging Face
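
One way to do this is through the `transformers` image-classification pipeline. The sketch below is illustrative rather than the author's own snippet: `classify_image` is a hypothetical helper, and the model ID and image path in the commented usage are placeholders to be replaced with this model's Hub repository ID and a real image.

```python
def classify_image(image_path: str, model_id: str, top_k: int = 3):
    """Return the top_k predicted labels with scores for one image."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    # Each prediction is a dict with "label" and "score" keys.
    return classifier(image_path, top_k=top_k)


# Example usage (placeholders, not real identifiers):
# for pred in classify_image("food.jpg", "your-username/your-model-id"):
#     print(f"{pred['label']}: {pred['score']:.3f}")
```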