|
---
datasets:
- competitions/aiornot
language:
- en
metrics:
- accuracy
pipeline_tag: image-classification
---
|
|
|
# Model Card for Model Soups on the AIorNot Dataset
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
Code implementation of the paper [Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time](https://arxiv.org/abs/2203.05482).
|
|
|
In recent years, fine-tuning large models has proven to be an excellent strategy for achieving high performance on downstream tasks.

The conventional recipe is to fine-tune several models with different hyperparameters and select the one achieving the highest validation accuracy. However, Wortsman *et al.* showed that averaging the weights of multiple models fine-tuned with different hyperparameter configurations can actually improve accuracy and robustness.
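To make the idea concrete, here is a minimal sketch (not the official implementation) of the two recipes from the paper: the *uniform soup* simply averages matching parameters across checkpoints, while the *greedy soup* adds checkpoints one at a time, best first, and keeps each only if held-out accuracy improves. `eval_fn` is a hypothetical helper that returns validation accuracy.

```python
import torch

def uniform_soup(state_dicts):
    """Average matching parameters across fine-tuned checkpoints."""
    return {
        key: torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
        for key in state_dicts[0]
    }

def greedy_soup(state_dicts, val_accs, model, eval_fn):
    """Add checkpoints (best first); keep each only if held-out accuracy improves."""
    # Sort checkpoints by their individual validation accuracy, best first.
    order = sorted(range(len(state_dicts)), key=lambda i: val_accs[i], reverse=True)
    soup = [state_dicts[order[0]]]
    model.load_state_dict(uniform_soup(soup))
    best_acc = eval_fn(model)  # hypothetical: returns validation accuracy
    for i in order[1:]:
        candidate = soup + [state_dicts[i]]
        model.load_state_dict(uniform_soup(candidate))
        acc = eval_fn(model)
        if acc >= best_acc:  # this ingredient helps: keep it in the soup
            soup, best_acc = candidate, acc
    return uniform_soup(soup)
```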
|
|
|
I read this paper recently and was intrigued by this powerful yet simple idea (it achieves a SOTA result of 90.94% on ImageNet), so I decided this could be an opportunity to get my hands dirty, dive into the code, and... try the soup!
|
|
|
I started from the official [code implementation](https://github.com/mlfoundations/model-soups) with CLIP ViT-B/32 and fine-tuned only 5 models on AIorNot. I kept the strategy simple, with minimal modifications: mainly, I fine-tuned each model for 8 epochs with a batch size of 56.
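For reference, here is a minimal sketch of the per-ingredient fine-tuning loop, assuming `model` (CLIP ViT-B/32 with a classification head) and `train_dataset` are already built. Only the epoch count and batch size come from my runs; the learning rate shown is just illustrative, since it is one of the hyperparameters that varies across soup ingredients.

```python
import torch
from torch.utils.data import DataLoader

EPOCHS, BATCH_SIZE = 8, 56  # settings used for each soup ingredient

loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)  # lr varies per ingredient
loss_fn = torch.nn.CrossEntropyLoss()

model.train()
for epoch in range(EPOCHS):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()

# Save one checkpoint per hyperparameter configuration to use as a soup ingredient.
torch.save(model.state_dict(), "ingredient.pt")
```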
|
|
|
The tricky part was that I had to modify the baseline to work with this custom dataset.
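Concretely, that mostly meant exposing the new data as `(image, label)` pairs in the format the baseline's training loop expects. The sketch below is hypothetical (the class name and item layout are placeholders); `preprocess` is CLIP's image transform.

```python
from PIL import Image
from torch.utils.data import Dataset

class AiorNotDataset(Dataset):
    """Hypothetical wrapper serving (image_tensor, label) pairs."""

    def __init__(self, items, preprocess):
        self.items = items            # list of (image_path, label) pairs
        self.preprocess = preprocess  # CLIP's image transform

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        path, label = self.items[idx]
        return self.preprocess(Image.open(path).convert("RGB")), label
```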
|
|
|
|
|
- **Developed by:** HuggingSara |
|
- **Model type:** Computer Vision |
|
- **Language:** Python
|
|
|
|
|
|
|
|