Mark Collier

sparkycollier

AI & ML interests

Open Source AI for fun and profit. Open Source Infrastructure Software for AI & ML.

Recent Activity

liked a model about 16 hours ago
deepseek-ai/DeepSeek-R1
new activity 26 days ago
deepseek-ai/DeepSeek-V3-Base:License
liked a model 27 days ago
deepseek-ai/DeepSeek-V3-Base

Organizations

MLX Community

sparkycollier's activity

New activity in deepseek-ai/DeepSeek-V3-Base 26 days ago

License

4
#2 opened 27 days ago by
mrfakename
upvoted an article 6 months ago
Article

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

226
New activity in meta-llama/Meta-Llama-3-8B 9 months ago

License

9
#3 opened 9 months ago by
mrfakename
reacted to WizardLM's post with 🤗 9 months ago
Post
40019
🔥🔥🔥 Introducing WizardLM-2!

📙Release Blog: https://wizardlm.github.io/WizardLM2
✅Model Weights: microsoft/wizardlm-661d403f71e6c8257dbd598a
🐦Twitter: https://twitter.com/WizardLM_AI/status/1779899325868589372

We introduce and open-source WizardLM-2, our next-generation state-of-the-art large language models, which have improved performance on complex chat, multilingual, reasoning, and agent tasks. The new family includes three cutting-edge models: WizardLM-2 8x22B, WizardLM-2 70B, and WizardLM-2 7B.

WizardLM-2 8x22B is our most advanced model, and the best open-source LLM in our internal evaluation on highly complex tasks. WizardLM-2 70B reaches top-tier reasoning capabilities and is the first choice among models of the same size. WizardLM-2 7B is the fastest and achieves performance comparable to that of leading open-source models 10x larger.

🤗 WizardLM-2 Capabilities:

1. MT-Bench (Figure 1)
WizardLM-2 8x22B demonstrates highly competitive performance even compared to the most advanced proprietary models such as GPT-4-Turbo and Claude-3. Meanwhile, WizardLM-2 7B and WizardLM-2 70B are the top-performing models among the leading baselines at the 7B to 70B model scales.

2. Human Preferences Evaluation (Figure 2)
In this human preference evaluation, WizardLM-2's capabilities are very close to those of cutting-edge proprietary models such as GPT-4-1106-preview, and significantly ahead of all the other open-source models.

🔍Method Overview:
As the natural world's human-generated data becomes increasingly exhausted through LLM training, we believe that data carefully created by AI, and models supervised step by step by AI, will be the sole path towards more powerful AI.

Over the past year, we built a fully AI-powered synthetic training system (as shown in Figure 3).
reacted to clem's post with ❤️ 12 months ago