The Chinese community is shipping 🚢 DeepSeek V3 (685B MoE) has quietly been released on the Hub!
Base: deepseek-ai/DeepSeek-V3-Base
Instruct: deepseek-ai/DeepSeek-V3
Can't wait to see what's next!
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Paper • 2412.13663 • Published Dec 18, 2024
Welcome back, small language model enthusiasts and GPU-poor OSS enjoyers, let's connect. I just created an organization whose main goal is to have fun with smaller models that can be tuned on consumer-grade GPUs. Feel free to join and let's have some fun. Much love ;3
https://huggingface.co/SmolTuners