vit-emotion-classification

This model is a fine-tuned version of google/vit-base-patch16-224-in21k on the FastJobs/Visual_Emotional_Analysis dataset. It achieves the following results on the evaluation set (these match the step-300 checkpoint in the training results):

  • Loss: 1.3802
  • Accuracy: 0.6125

Intended uses & limitations

Intended Uses

  • Emotion classification from visual inputs (images).

Limitations

  • May reflect biases from the training dataset.
  • Performance may degrade in domains outside the training data.
  • Not suitable for critical or sensitive decision-making tasks.

Training and evaluation data

This model was trained on the FastJobs/Visual_Emotional_Analysis dataset.

The dataset contains:

  • 800 images annotated with 8 emotion labels:
    • Anger
    • Contempt
    • Disgust
    • Fear
    • Happy
    • Neutral
    • Sad
    • Surprise

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
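
As a rough illustration of what these settings imply, the sketch below derives the number of optimizer steps per epoch from the 400 total steps reported in the training results, and evaluates a linear learning-rate decay at a few steps. It assumes zero warmup steps, no gradient accumulation, and a single device, none of which this card states. The helper names `steps_per_epoch` and `linear_lr` are hypothetical and not part of any library.

```python
LEARNING_RATE = 2e-4      # learning_rate from the list above
TRAIN_BATCH_SIZE = 16     # train_batch_size from the list above
NUM_EPOCHS = 10           # num_epochs from the list above
TOTAL_STEPS = 400         # final step reported in the training results


def steps_per_epoch(total_steps: int, num_epochs: int) -> int:
    """Optimizer steps per epoch, assuming no gradient accumulation."""
    return total_steps // num_epochs


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption: the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    per_epoch = steps_per_epoch(TOTAL_STEPS, NUM_EPOCHS)
    # Upper bound on training images; the last batch of an epoch may be partial.
    print("steps per epoch:", per_epoch)
    print("training images (at most):", per_epoch * TRAIN_BATCH_SIZE)
    for step in (0, 100, 200, 300, 400):
        print(step, linear_lr(step))
```

Under these assumptions, 40 steps per epoch at batch size 16 points to at most 640 training images, which would be consistent with an 80/20 split of the 800-image dataset. The split itself is not documented here.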

Training results

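| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---------------|-------|------|-----------------|----------|
| 0.8454        | 2.5   | 100  | 1.4373          | 0.4813   |
| 0.2022        | 5.0   | 200  | 1.4067          | 0.55     |
| 0.0474        | 7.5   | 300  | 1.3802          | 0.6125   |
| 0.0368        | 10.0  | 400  | 1.4388          | 0.5938   |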
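
The results above can be read programmatically. The sketch below picks the checkpoint with the highest accuracy and reports the gap between validation and training loss, which widens as training loss falls toward zero and may indicate overfitting. It also converts each accuracy into an implied count of correct predictions, assuming an evaluation set of 160 images. That size is not stated in this card; it is an inference that happens to reproduce all four reported accuracies. The helper names are hypothetical.

```python
# (training_loss, epoch, step, validation_loss, accuracy), copied from the results above
RESULTS = [
    (0.8454, 2.5, 100, 1.4373, 0.4813),
    (0.2022, 5.0, 200, 1.4067, 0.55),
    (0.0474, 7.5, 300, 1.3802, 0.6125),
    (0.0368, 10.0, 400, 1.4388, 0.5938),
]

ASSUMED_EVAL_SIZE = 160  # assumption: not stated in the card


def best_checkpoint(rows):
    """Return the row with the highest evaluation accuracy."""
    return max(rows, key=lambda row: row[4])


def implied_correct(accuracy: float, eval_size: int = ASSUMED_EVAL_SIZE) -> int:
    """Number of correct predictions implied by an accuracy value."""
    return round(accuracy * eval_size)


if __name__ == "__main__":
    best = best_checkpoint(RESULTS)
    print("best step:", best[2], "accuracy:", best[4])
    for train_loss, epoch, step, val_loss, acc in RESULTS:
        gap = val_loss - train_loss
        print(f"step {step}: loss gap {gap:.4f}, "
              f"~{implied_correct(acc)}/{ASSUMED_EVAL_SIZE} correct")
```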

How to use this model

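from transformers import AutoImageProcessor, ViTForImageClassification
import torch
from PIL import Image

# Optional: log in only if the repository requires authentication.
# from huggingface_hub import login
# login()  # prompts for a Hugging Face access token

MODEL_ID = "digo-prayudha/vit-emotion-classification"

# Load the input image and force three RGB channels.
image = Image.open("image.jpg").convert("RGB")

# Load the preprocessing configuration and the fine-tuned weights.
image_processor = AutoImageProcessor.from_pretrained(MODEL_ID)
model = ViTForImageClassification.from_pretrained(MODEL_ID)

# Resize and normalize the image into a batch of PyTorch tensors.
inputs = image_processor(image, return_tensors="pt")

# Run inference without tracking gradients.
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring class index to its label name.
predicted_label = logits.argmax(-1).item()
print(model.config.id2label[predicted_label])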
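
Given the modest evaluation accuracy, it may be more informative to inspect the top few classes with their probabilities than a single label. The sketch below is one way to do that in plain Python. `softmax` and `top_k_labels` are hypothetical helpers, and the label order and logits in the demo are made up for illustration; the model's actual mapping lives in `model.config.id2label`.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical label order and made-up logits, for illustration only.
    example_id2label = {
        0: "anger", 1: "contempt", 2: "disgust", 3: "fear",
        4: "happy", 5: "neutral", 6: "sad", 7: "surprise",
    }
    example_logits = [0.2, -1.0, 0.1, 0.5, 2.3, 1.1, -0.4, 0.9]
    for label, prob in top_k_labels(example_logits, example_id2label):
        print(f"{label}: {prob:.3f}")
```

With the model loaded as above, the same helper could be called as `top_k_labels(logits[0].tolist(), model.config.id2label)`.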

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size

  • Parameters: 85.8M
  • Tensor type: F32
  • Weights format: Safetensors