shail-2512's Collections
MultiModal (Any-to-Any)
ALMs (Audio Language Models)
LLMs
TTS
Coder
Reasoning (LRMs)
Image Generation
VLMs
3D
Video Generation
Speech Recognition
Dataset to fine-tune Embeddings
Reranking Models
Embedding Models
VLMs
Updated Dec 2, 2024
HuggingFaceTB/SmolVLM-Instruct • Image-Text-to-Text • Updated Dec 2, 2024 • 47.4k • 330
microsoft/OmniParser • Image-Text-to-Text • Updated Dec 2, 2024 • 1.22k • 1.53k
vidore/colsmolvlm-alpha • Updated Dec 3, 2024 • 8.44k • 46
meta-llama/Llama-3.2-11B-Vision-Instruct • Image-Text-to-Text • Updated Dec 4, 2024 • 2.53M • 1.25k
Qwen/Qwen2-VL-7B-Instruct • Image-Text-to-Text • Updated 11 days ago • 1.72M • 1.08k
mistral-community/pixtral-12b • Image-Text-to-Text • Updated 1 day ago • 34k • 84
HuggingFaceM4/Idefics3-8B-Llama3 • Image-Text-to-Text • Updated Dec 2, 2024 • 25.4k • 261
allenai/Molmo-7B-O-0924 • Image-Text-to-Text • Updated Nov 15, 2024 • 5.86k • 151