Need4Speed

company

Recent Activity

need-for-speed's activity

wenhuach 
posted an update 29 days ago
This week, OPEA Space released several new INT4 models, including:
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
allenai/OLMo-2-1124-13B-Instruct
THUDM/glm-4v-9b
AIDC-AI/Marco-o1
and several others.
Let us know which models you'd like prioritized for quantization, and we'll do our best to make it happen!

https://huggingface.co/OPEA
wenhuach 
posted an update about 1 month ago
OPEA Space just released nearly 20 INT4 models, including QwQ-32B-Preview,
Llama-3.2-11B-Vision-Instruct, Qwen2.5, and Llama 3.1. Check them out at https://huggingface.co/OPEA
loubnabnl 
posted an update about 1 month ago
Making SmolLM2 reproducible: open-sourcing our training & evaluation toolkit 🛠️ https://github.com/huggingface/smollm/

- Pre-training code with nanotron
- Evaluation suite with lighteval
- Synthetic data generation using distilabel (powers our new SFT dataset HuggingFaceTB/smoltalk)
- Post-training scripts with TRL & the alignment handbook
- On-device tools with llama.cpp for summarization, rewriting & agents

Apache 2.0 licensed. V2 pre-training data mix coming soon!

Which other tools should we add next?
wenhuach 
posted an update 5 months ago
Looking for a better INT4 algorithm for Llama 3.1? For the 8B model, AutoRound achieves a higher average across 10 zero-shot tasks, scoring 63.93 versus 63.15 for AWQ. On MMLU it reaches 66.72 versus 65.25, and on ARC-C it scores 52.13 versus 50.94. For further details and comparisons, see the leaderboard at Intel/low_bit_open_llm_leaderboard.
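
To put the numbers above side by side, here is a minimal sketch that computes the AutoRound minus AWQ gap for each quoted score. The scores are the ones stated in the post; the dictionary layout and the `score_gaps` helper are hypothetical names introduced only for illustration and are not part of AutoRound or the leaderboard.

```python
# Scores quoted in the post for the Llama 3.1 8B INT4 comparison.
# The structure and key names here are illustrative assumptions.
scores = {
    "avg_10_zero_shot": {"AutoRound": 63.93, "AWQ": 63.15},
    "MMLU": {"AutoRound": 66.72, "AWQ": 65.25},
    "ARC-C": {"AutoRound": 52.13, "AWQ": 50.94},
}


def score_gaps(table):
    """Return AutoRound minus AWQ per task, rounded to two decimals."""
    return {
        task: round(entry["AutoRound"] - entry["AWQ"], 2)
        for task, entry in table.items()
    }


gaps = score_gaps(scores)
for task, gap in gaps.items():
    # Signed formatting makes the direction of the difference explicit.
    print(f"{task}: {gap:+.2f} points")
```

By this arithmetic the gaps are 0.78 points on the 10-task average, 1.47 on MMLU, and 1.19 on ARC-C.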