HuggingSara
/

model_soups

Image Classification

HuggingSara committed on Mar 1, 2023

Commit

016cc32

·

1 Parent(s): 0f4b1a2

Create README.md

Files changed (1)

README.md +37 -0

	@@ -0,0 +1,37 @@

+---
+datasets:
+- competitions/aiornot
+language:
+- en
+metrics:
+- accuracy
+pipeline_tag: image-classification
+---
+# Model Card for Model Soups on the AiorNot Dataset
+## Model Details
+### Model Description
+Code implementation of the paper [Model soups: averaging weights of multiple fine-tuned models
+improves accuracy without increasing inference time](https://arxiv.org/abs/2203.05482).
+In recent years, fine-tuning large models has proven to be an excellent strategy for achieving high performance on downstream tasks.
+The conventional recipe is to fine-tune models with different hyperparameters and select the one that achieves the highest accuracy. However, Wortsman *et al.* showed that averaging the weights of multiple models fine-tuned with different hyperparameter configurations can actually improve accuracy and robustness.
+I read this paper recently and was intrigued by this simple yet powerful idea (it achieved a SOTA result of 90.94% on ImageNet), so I decided this was an opportunity to get my hands dirty, dive into the code, and... try the soup!
+I started from the official [code implementation](https://github.com/mlfoundations/model-soups) with CLIP ViT-B/32 and fine-tuned only 5 of their models on AiorNot. I used a simple strategy with minimal modifications: mainly, I fine-tuned the models for 8 epochs with a batch size of 56 samples.
+The tricky part was that I had to modify the baseline to use it with our custom dataset.
+To implement this notebook I modified the version by [Cade Gordon](https://cadegordon.io/).
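The weight averaging described above can be sketched in a few lines. This is a minimal illustration of the simplest variant in the paper (a "uniform soup"), not the code from the notebook: it assumes every checkpoint shares the same architecture, and it uses plain Python lists as a hypothetical stand-in for framework tensors. The names `uniform_soup`, `ckpt_a`, and `ckpt_b` are made up for this example.

```python
from typing import Dict, List

# Hypothetical stand-in for a framework state dict:
# parameter name -> flattened list of weights.
StateDict = Dict[str, List[float]]


def uniform_soup(state_dicts: List[StateDict]) -> StateDict:
    """Average every parameter element-wise across all checkpoints."""
    if not state_dicts:
        raise ValueError("need at least one checkpoint")

    names = state_dicts[0].keys()
    for sd in state_dicts[1:]:
        # Averaging only makes sense for checkpoints of the same architecture.
        if sd.keys() != names:
            raise ValueError("checkpoints must share the same parameter names")

    n = len(state_dicts)
    soup: StateDict = {}
    for name in names:
        # Group the i-th weight of each checkpoint, then take the mean.
        # Assumes each parameter has the same length in every checkpoint.
        columns = zip(*(sd[name] for sd in state_dicts))
        soup[name] = [sum(col) / n for col in columns]
    return soup


# Toy checkpoints standing in for two fine-tuned models.
ckpt_a = {"head.weight": [1.0, 2.0], "head.bias": [0.0]}
ckpt_b = {"head.weight": [3.0, 4.0], "head.bias": [1.0]}

soup = uniform_soup([ckpt_a, ckpt_b])
print(soup)
```

The result is a single set of weights, so inference costs the same as for one model. With real checkpoints the same loop would run over tensors from each model's state dict. The paper also describes a greedy variant that only keeps a checkpoint in the soup if it improves held-out accuracy.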
+- **Developed by:** HuggingSara
+- **Model type:** Computer Vision
+- **Language:** Python