t.d.a.g. PRO

sequelbox

sequelbox.bsky.social

AI & ML interests

open source, infinite games. (they/them)

sequelbox's activity

liked a model about 4 hours ago

cognitivecomputations/Dolphin3.0-Llama3.1-8B

Updated about 12 hours ago • 85 • 38

reacted to DawnC's post with ❤️ 1 day ago

Post

2029

🌟 PawMatchAI: Making Breed Selection More Intuitive! 🐕
Excited to share the latest update to this AI-powered companion for finding your perfect furry friend! I've made significant architectural improvements to enhance breed recognition accuracy and feature detection.

✨ What's New?
Enhanced breed recognition through advanced morphological feature analysis:
- Implemented a sophisticated feature extraction system that analyzes specific characteristics like body proportions, head features, tail structure, fur texture, and color patterns
- Added an intelligent attention mechanism that dynamically focuses on the most relevant features for each image
- Improved multi-dog detection capabilities through enhanced spatial feature analysis
- Achieved better precision in distinguishing subtle breed characteristics

🎯 Key Features:
Smart breed recognition powered by advanced AI architecture
Visual matching scores with intuitive color indicators
Detailed breed comparisons with interactive tooltips
Lifestyle-based recommendations tailored to your needs

💭 Project Vision
Combining my passion for AI and pets, this project represents another step toward creating meaningful AI applications. Each update aims to make the breed selection process more accessible while improving the underlying technology.

👉 Try it now: https://huggingface.co/spaces/DawnC/PawMatchAI

Your likes ❤️ on this space fuel this project's growth!

#AI #MachineLearning #DeepLearning #Pytorch #ComputerVision #TechForLife

2 replies

upvoted a paper 2 days ago

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published 4 days ago • 40

liked a dataset 5 days ago

theprint/CleverBoi-Data-20k

Viewer • Updated Sep 16, 2024 • 20k • 41 • 4

liked a model 5 days ago

theprint/CleverBoi-Nemo-12B-v2

Text Generation • Updated Nov 3, 2024 • 373 • 4

New activity in sequelbox/Tachibana-QVQ-PREVIEW 6 days ago

[bot] Conversion to Parquet

#1 opened 6 days ago by parquet-converter

posted an update 6 days ago

Post

2067

Check out the early preview of the upcoming Tachibana-QVQ dataset: code-reasoning and code-instruct data generated with Qwen/QVQ-72B-Preview

Link here: sequelbox/Tachibana-QVQ-PREVIEW

more to come :)

1 reply

updated a dataset 6 days ago

sequelbox/Tachibana-QVQ-PREVIEW

Viewer • Updated 6 days ago • 9.31k • 15 • 4

liked a model 7 days ago

Qwen/Qwen2.5-7B-Instruct

Text Generation • Updated Sep 25, 2024 • 1.56M • 389

liked a model 10 days ago

Qwen/QVQ-72B-Preview

Image-Text-to-Text • Updated 12 days ago • 63.8k • 456

liked 4 datasets 13 days ago

liked a model 13 days ago

answerdotai/ModernBERT-large

Fill-Mask • Updated 11 days ago • 28.1k • 294

upvoted a collection 13 days ago

Llama-3.1-Nemotron-70B

Collection

SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated about 10 hours ago • 149

liked a model 13 days ago

nvidia/Llama-3_1-Nemotron-51B-Instruct

Text Generation • Updated Oct 13, 2024 • 101k • 202

reacted to m-ric's post with 👀 16 days ago

Post

2349

𝐇𝐮𝐠𝐠𝐢𝐧𝐠 𝐅𝐚𝐜𝐞 𝐫𝐞𝐥𝐞𝐚𝐬𝐞𝐬 𝐏𝐢𝐜𝐨𝐭𝐫𝐨𝐧, 𝐚 𝐦𝐢𝐜𝐫𝐨𝐬𝐜𝐨𝐩𝐢𝐜 𝐥𝐢𝐛 𝐭𝐡𝐚𝐭 𝐬𝐨𝐥𝐯𝐞𝐬 𝐋𝐋𝐌 𝐭𝐫𝐚𝐢𝐧𝐢𝐧𝐠 𝟒𝐃 𝐩𝐚𝐫𝐚𝐥𝐥𝐞𝐥𝐢𝐳𝐚𝐭𝐢𝐨𝐧 🥳

🕰️ Llama-3.1-405B took 39 million GPU-hours to train, i.e. about 4.5 thousand years if run on a single GPU.

👴🏻 If they had needed all this time, we would have GPU stories from the time of Pharaoh 𓂀: "Alas, Lord of Two Lands, the shipment of counting-stones arriving from Cathay was lost to pirates, this shall delay the building of your computing temple by many moons"

🛠️ But instead, they just parallelized the training on 24k H100s, which made it take just a few months.
This required parallelizing across 4 dimensions: data, tensor, context, pipeline.
And it is infamously hard to do, making for bloated code repos that hold together only by magic.
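The arithmetic behind those two figures can be checked with a rough sketch. It takes the 39 million GPU-hours and "24k H100s" from the post at face value; treating "24k" as exactly 24,000 GPUs and assuming perfectly linear scaling are simplifications made here, not claims from the post.

```python
# Back-of-the-envelope check of the figures quoted in the post.
GPU_HOURS = 39_000_000        # total training cost quoted in the post
NUM_GPUS = 24_000             # "24k H100s"; the exact count is an assumption
HOURS_PER_YEAR = 24 * 365.25  # average calendar year


def single_gpu_years(gpu_hours: float) -> float:
    """Years the job would take if it ran on one GPU only."""
    return gpu_hours / HOURS_PER_YEAR


def wall_clock_days(gpu_hours: float, num_gpus: int) -> float:
    """Days of wall-clock time, assuming perfectly linear scaling."""
    return gpu_hours / num_gpus / 24


years = single_gpu_years(GPU_HOURS)
days = wall_clock_days(GPU_HOURS, NUM_GPUS)
print(f"{years:,.0f} years on one GPU")
print(f"{days:.0f} days on {NUM_GPUS:,} GPUs")
```

Under these assumptions the single-GPU figure lands near 4.5 thousand years and the parallel run takes a little over two months, consistent with "just a few months" once real-world overheads are included.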

🤏 𝗕𝘂𝘁 𝗻𝗼𝘄 𝘄𝗲 𝗱𝗼𝗻'𝘁 𝗻𝗲𝗲𝗱 𝗵𝘂𝗴𝗲 𝗿𝗲𝗽𝗼𝘀 𝗮𝗻𝘆𝗺𝗼𝗿𝗲! Instead of building mega-training codes, Hugging Face colleagues cooked in the other direction, towards tiny 4D parallelism libs. A team has built Nanotron, already widely used in industry.
And now a team has released Picotron, a radical approach that codes 4D parallelism in just a few hundred lines, a real feat of engineering that makes it much easier to understand what's actually happening!

⚡ 𝗜𝘁'𝘀 𝘁𝗶𝗻𝘆, 𝘆𝗲𝘁 𝗽𝗼𝘄𝗲𝗿𝗳𝘂𝗹:
Measured in MFU (Model FLOPs Utilization, the fraction of the hardware's peak compute that the model actually uses), this lib reaches ~50% on the SmolLM-1.7B model with 8 H100 GPUs, which is really close to what huge libs would reach. (Caution: the team is running further benchmarks to verify this)
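To make the MFU figure concrete, here is a minimal sketch of how such a number is commonly estimated. It is not Picotron's own benchmark code. It relies on two assumptions that do not come from the post: the widely used approximation of about 6 FLOPs per parameter per training token (which ignores attention FLOPs), and a peak dense BF16 throughput of roughly 989 TFLOP/s per H100. All function names are hypothetical.

```python
# Rough MFU estimate; every constant and helper here is illustrative.
H100_PEAK_BF16_FLOPS = 989e12  # assumed dense BF16 peak per GPU, in FLOP/s


def train_flops_per_token(n_params: float) -> float:
    """Approximate forward + backward FLOPs per token (6 * N rule of thumb)."""
    return 6.0 * n_params


def mfu(n_params: float, tokens_per_second: float,
        num_gpus: int, peak_flops_per_gpu: float) -> float:
    """Achieved model FLOP/s divided by the aggregate hardware peak."""
    achieved = train_flops_per_token(n_params) * tokens_per_second
    peak = num_gpus * peak_flops_per_gpu
    return achieved / peak


def tokens_per_second_for_mfu(target_mfu: float, n_params: float,
                              num_gpus: int, peak_flops_per_gpu: float) -> float:
    """Invert the MFU formula: throughput needed to hit a target MFU."""
    total_peak = num_gpus * peak_flops_per_gpu
    return target_mfu * total_peak / train_flops_per_token(n_params)


# Throughput that ~50% MFU would imply for a 1.7B-parameter model on 8 GPUs.
tps = tokens_per_second_for_mfu(0.5, 1.7e9, 8, H100_PEAK_BF16_FLOPS)
print(f"{tps:,.0f} tokens/s")
```

Under these assumptions, 50% MFU on this setup corresponds to a few hundred thousand tokens per second across the 8 GPUs. A measurement that counts attention FLOPs or uses a different peak figure would give a somewhat different number.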

Go take a look 👉 https://github.com/huggingface/picotron/tree/main/picotron

1 reply

liked 2 datasets 24 days ago

jondurbin/airoboros-2.2

Viewer • Updated Oct 3, 2023 • 44.8k • 41 • 20

microsoft/orca-math-word-problems-200k

Viewer • Updated Mar 4, 2024 • 200k • 844 • 425