---
base_model:
- openai/clip-vit-large-patch14
datasets:
- tanganke/dtd
metrics:
- accuracy
---

# Model Card

## Model Details

- Architecture: ViT-Large with patch size 14
- Training Data: DTD dataset

## Training Details

The model was fine-tuned with the Adam optimizer at a constant learning rate of 1e-5 for 4000 steps (batch size 32). Only the vision encoder is fine-tuned.

## Evaluation Results

Accuracy on DTD:

- pre-trained: 0.554787278175354
- fine-tuned: 0.8547872304916382
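
As a quick sanity check on the figures reported in this card, the following minimal sketch collects the stated hyperparameters in one place and derives two numbers from them: the total number of training samples processed and the accuracy gain from fine-tuning. The `FineTuneConfig` class and the `accuracy_gain` helper are hypothetical names introduced only for illustration; they are not part of the training code used for this model.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the hyperparameters stated in this card."""

    base_model: str = "openai/clip-vit-large-patch14"
    dataset: str = "tanganke/dtd"
    optimizer: str = "adam"
    learning_rate: float = 1e-5  # constant, no schedule
    num_steps: int = 4000
    batch_size: int = 32

    @property
    def samples_seen(self) -> int:
        # Total samples processed during training, counting repeats.
        return self.num_steps * self.batch_size


def accuracy_gain(pretrained: float, finetuned: float) -> Tuple[float, float]:
    """Return (absolute gain, relative gain) between two accuracy values."""
    absolute = finetuned - pretrained
    return absolute, absolute / pretrained


config = FineTuneConfig()
absolute, relative = accuracy_gain(0.554787278175354, 0.8547872304916382)

print(f"samples seen: {config.samples_seen}")
print(f"absolute gain: {absolute:.4f}")
print(f"relative gain: {relative:.2%}")
```

With the values above, 4000 steps at batch size 32 correspond to 128000 samples processed, and fine-tuning raises accuracy by roughly 0.30 in absolute terms.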