Burning ray

adarksky

AI & ML interests

None yet

Recent Activity

liked a model 2 days ago
deepseek-ai/DeepSeek-R1
updated a model 7 days ago
hexgrad/Kokoro-82M
new activity 7 days ago
hexgrad/Kokoro-82M: Update kokoro.py

Organizations

fast.ai community · Hugging Face Discord Community

adarksky's activity

New activity in hexgrad/Kokoro-82M 7 days ago

Update kokoro.py

#43 opened 7 days ago by adarksky
reacted to merve's post with 🔥 about 2 months ago
small but mighty 🔥
you can fine-tune SmolVLM on an L4 with a batch size of 4 and it will only take 16.4 GB of VRAM 🫰🏻 also, with gradient accumulation, the simulated batch size is 16 ✨
I made a notebook that includes all the goodies: QLoRA, gradient accumulation, gradient checkpointing with explanations on how they work 💝 https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
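The gradient-accumulation trick the post relies on (micro-batches of 4, simulated batch of 16) can be sketched in plain Python. This is a toy illustration with a made-up 1-D least-squares loss, not code from the linked notebook: it just shows that averaging the gradients of 4 micro-batches of size 4 reproduces the gradient of one full batch of 16, so you get large-batch updates without large-batch memory.

```python
# Toy illustration of gradient accumulation (hypothetical data and loss;
# the real notebook fine-tunes SmolVLM with QLoRA on an L4 GPU).

def grad(w, batch):
    # Gradient of the mean squared error 0.5 * (w*x - y)^2 w.r.t. w,
    # averaged over the batch.
    return sum((w * x - y) * x for x, y in batch) / len(batch)

w = 0.5
data = [(float(i), 2.0 * i) for i in range(16)]  # targets follow y = 2x

# One full-batch gradient step (batch size 16) -- the memory-hungry version.
full_grad = grad(w, data)

# Accumulated version: 4 micro-batches of 4, each gradient scaled by 1/steps
# so the sum equals the full-batch average.
steps = 4
accum_grad = 0.0
for k in range(steps):
    micro = data[k * 4:(k + 1) * 4]
    accum_grad += grad(w, micro) / steps

# Both paths yield the same update direction.
assert abs(full_grad - accum_grad) < 1e-9
```

In a real training loop the same scaling applies: divide each micro-batch loss by the accumulation steps, call `backward()` per micro-batch, and step the optimizer only after the last one.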