Article: 🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark. By wolfram • 1 day ago • 24
Post: Supercharge your LLM apps with smolagents 🔥 However cool your LLM is, without being agentic it can only go so far. Enter smolagents: a new agent library by Hugging Face to make the LLM write code, do analysis and automate boring stuff! Here's our blog for you to get started: https://huggingface.co/blog/smolagents
Post: The deepseek-ai/DeepSeek-V3 is very good! I have been playing with it and found it is really good at one-shotting a pretty good landing page. You can play with it here: https://deepseek-artifacts.vercel.app. All the responses get saved in the cfahlgren1/react-code-instructions dataset. Hopefully we can build one of the biggest, highest quality frontend datasets on the hub 💪
Post: Check out the early preview of the upcoming Tachibana-QVQ dataset: code-reasoning and code-instruct data generated with Qwen/QVQ-72B-Preview. Link here: sequelbox/Tachibana-QVQ-PREVIEW. More to come :)
Model: PowerInfer/SmallThinker-3B-Preview • Text Generation • Updated about 12 hours ago • 1.44k • 196
Paper: How Well Do LLMs Generate Code for Different Application Domains? Benchmark and Evaluation • 2412.18573 • Published 10 days ago • 1