Yi Cui

onekq

AI & ML interests

Benchmark, Code Generation Model

Organizations

MLX Community, ONEKQ AI

onekq's activity

posted an update 4 days ago
πŸ‹ DeepSeek πŸ‹v3 achieves a solid 7 point jump than v2.5, surpassing GPT-4o, but is still behind πŸ“ o1 πŸ“and Claude 3.5.

onekq-ai/WebApp1K-models-leaderboard
posted an update 2 months ago
The October version of Claude 3.5 lifts the SOTA (set by its June version) by 7 points.
onekq-ai/WebApp1K-models-leaderboard

Closed-source models are widening the gap again.

Note: Our frontier leaderboard now uses double test scenarios because the single-scenario test suite has been saturated.
posted an update 2 months ago
I'm now working on finetuning coding models. If you are GPU-hungry like me, you will find quantized models very helpful. But quantization for finetuning and quantization for inference are different and incompatible, so I made two collections here.

Inference (GGUF, via Ollama, CPU is enough)
onekq-ai/ollama-ready-coding-models-67118c3cfa1af2cf04a926d6

Finetuning (bitsandbytes, QLoRA, GPU is needed)
onekq-ai/qlora-ready-coding-models-67118771ce001b8f4cf946b2

Among quantized models, inference models are far more popular on HF than finetuning models. I use https://huggingface.co/QuantFactory to generate inference models (GGUF), and there are a few other choices.

But there hasn't been such a service for finetuning models. DIY isn't too hard, though. I made a few myself, and you can find the script in the model cards. If the original model is small enough, you can even do it on a free T4 (available via Google Colab).
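
As a sketch of what that DIY can look like (placeholder repo ids, not the actual script from the model cards; a recent transformers/bitsandbytes pair that supports serializing 4-bit weights is assumed):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

def quantize_and_push(src_id: str, dst_id: str) -> None:
    """Load a model with 4-bit bitsandbytes quantization and push the result to the Hub."""
    bnb = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        src_id, quantization_config=bnb, device_map="auto"  # quantized at load time
    )
    tokenizer = AutoTokenizer.from_pretrained(src_id)
    model.push_to_hub(dst_id)      # requires `huggingface-cli login` first
    tokenizer.push_to_hub(dst_id)

# Example with placeholder ids:
# quantize_and_push("some-org/small-coder", "your-username/small-coder-bnb-4bit")
```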

If you know a (small) coding model worthy of quantization, please let me know and I'd love to add it to the collections.