alkinun

AtAndDev

AI & ML interests

LLMs, Alignment, Merging, Unsloth, DPO, SFT, ORPO, SPIN..

Recent Activity

liked a model about 16 hours ago

deepseek-ai/DeepSeek-R1

posted an update about 16 hours ago

R1 is out! And with a lot of other R1 related models...

replied to JingzeShi's post about 17 hours ago

Only a single RTX 4090 running model pre-training is really slow, even for small language models!!! (https://huggingface.co/collections/JingzeShi/doge-slm-677fd879f8c4fd0f43e05458)


AtAndDev's activity

liked a model about 16 hours ago

deepseek-ai/DeepSeek-R1

Updated about 2 hours ago • 874

posted an update about 16 hours ago

Post

502

R1 is out! And with a lot of other R1 related models...

replied to JingzeShi's post about 17 hours ago

Duh, it's pretraining

upvoted a collection 2 days ago

Qwen2.5

Collection

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Nov 28, 2024 • 468

reacted to florentgbelidji's post with 🔥 2 days ago

Post

1201

𝗣𝗹𝗮𝗻𝗻𝗶𝗻𝗴 𝗬𝗼𝘂𝗿 𝗡𝗲𝘅𝘁 𝗦𝗸𝗶 𝗔𝗱𝘃𝗲𝗻𝘁𝘂𝗿𝗲 𝗝𝘂𝘀𝘁 𝗚𝗼𝘁 𝗦𝗺𝗮𝗿𝘁𝗲𝗿: 𝗜𝗻𝘁𝗿𝗼𝗱𝘂𝗰𝗶𝗻𝗴 𝗔𝗹𝗽𝗶𝗻𝗲 𝗔𝗴𝗲𝗻𝘁!🏔️⛷️

With the big hype around AI agents these days, I couldn’t stop thinking about how AI agents could truly enhance real-world activities.
What sort of applications could we build with those AI agents: agentic RAG? self-correcting text-to-sql? Nah, boring…

Passionate about outdoors, I’ve always dreamed of a tool that could simplify planning mountain trips while accounting for all potential risks. That’s why I built 𝗔𝗹𝗽𝗶𝗻𝗲 𝗔𝗴𝗲𝗻𝘁, a smart assistant designed to help you plan safe and enjoyable itineraries in the French Alps and Pyrenees.

Built using Hugging Face's 𝘀𝗺𝗼𝗹𝗮𝗴𝗲𝗻𝘁𝘀 library, Alpine Agent combines the power of AI with trusted resources like 𝘚𝘬𝘪𝘵𝘰𝘶𝘳.𝘧𝘳 (https://skitour.fr/) and METEO FRANCE. Whether it’s suggesting a route with moderate difficulty or analyzing avalanche risks and weather conditions, this agent dynamically integrates data to deliver personalized recommendations.

In my latest blog post, I share how I developed this project—from defining tools and integrating APIs to selecting the best LLMs like 𝘘𝘸𝘦𝘯2.5-𝘊𝘰𝘥𝘦𝘳-32𝘉-𝘐𝘯𝘴𝘵𝘳𝘶𝘤𝘵, 𝘓𝘭𝘢𝘮𝘢-3.3-70𝘉-𝘐𝘯𝘴𝘵𝘳𝘶𝘤𝘵, or 𝘎𝘗𝘛-4.

⛷️ Curious how AI can enhance adventure planning? Try the app and share your thoughts: florentgbelidji/alpine-agent

👉 Want to build your own agents? Whether for cooking, sports training, or other passions, the possibilities are endless. Check out the blog post to learn more: https://huggingface.co/blog/florentgbelidji/alpine-agent

Many thanks to @m-ric for helping build this tool with smolagents!

1 reply

reacted to aiqcamp's post with 🔥 2 days ago

Post

2824

# 🎨 FLUX Diagram Generator - Create Hand-Drawn Style Diagrams

aiqcamp/diagram

Generate beautiful mind maps and diagrams with AI! Using the FLUX.1-schnell model, create natural hand-drawn style diagrams that bring your ideas to life.

## ✨ Key Features

- 💡 Intuitive prompt-based input system
- 🎯 Rich examples including knowledge trees, digital transformation, creative process, and more
- 🛠 Customizable settings for image size, seed values, and more
- 🖼 Support for resolutions up to 2048x2048
- ⚡ Fast generation (4 steps default)

## 🎯 Use Cases

- Educational materials
- Project planning
- Idea structuring
- Presentation visuals
- Business process visualization

Built with Gradio for a user-friendly interface that anyone can use. Start creating your own diagrams now! 🚀

Try it out to transform your ideas into visually appealing diagrams with a unique hand-drawn aesthetic.

#AIart #Diagram #Mindmap #Visualization #HuggingFace

1 reply

reacted to cutechicken's post with 🔥 2 days ago

Post

2700

🔬 PaperImpact: Scientific Impact Predictor Powered by Deep Learning 🎯

VIDraft/PaperImpact

📚 Overview

A cutting-edge AI system that combines transformer architecture with citation pattern analysis to predict research impact. Our model, trained on 120,000+ CS papers, analyzes innovation potential, methodological robustness, and future impact, providing researchers with valuable insights before publication.

🧠 Scientific Foundation

- BERT-based semantic analysis
- Citation network pattern learning
- NDCG optimization & MSE loss
- Cross-validated prediction engine
- GPU-accelerated inference
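The post names NDCG and MSE but does not say how they are computed or combined. As a rough illustration only, here is a minimal pure-Python sketch of the two standard metrics; the function names and the toy inputs are hypothetical and are not taken from the PaperImpact code.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain for relevance scores given in ranked order."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(true_relevance, predicted_scores):
    """NDCG of the ranking induced by predicted_scores (1.0 means ideal order)."""
    # Rank items by predicted score, highest first.
    order = sorted(range(len(predicted_scores)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_relevance[i] for i in order]
    ideal_dcg = dcg(sorted(true_relevance, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


def mse(targets, predictions):
    """Mean squared error between target and predicted impact scores."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)
```

NDCG rewards putting the most-cited papers at the top of the ranking, while MSE penalizes errors in the individual scores, which may be why a system like this would track both.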

💫 Why Researchers Need This

- Pre-submission impact assessment
- Research direction optimization
- Time-saving paper evaluation
- Competitive edge in academia
- Trend identification advantage

🎯 Key Features

- One-click arXiv paper analysis
- Real-time impact scoring (0-1)
- 9-tier grading system (AAA-C)
- Smart input validation
- Instant visual feedback
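The post gives only the score range (0-1) and the endpoints of the grading scale (AAA to C, 9 tiers). A sketch of how such a mapping might look, assuming credit-rating-style tier names and equal-width score bins, neither of which is confirmed by the post:

```python
# Assumed tier names: only "AAA", "C", and the count of 9 come from the post.
GRADES = ["AAA", "AA", "A", "BBB", "BB", "B", "CCC", "CC", "C"]


def score_to_grade(score):
    """Map an impact score in [0, 1] to one of 9 grades (hypothetical equal-width bins)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    # Higher scores map to earlier (better) grades; clamp so 0.0 lands in the last tier.
    index = min(int((1.0 - score) * len(GRADES)), len(GRADES) - 1)
    return GRADES[index]
```

The real tool may well use uneven thresholds, for instance reserving AAA for a narrow top band.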

🌟 Unique Benefits

"Don't wait years to know your paper's impact. Get instant, AI-powered insights to strengthen your research strategy and maximize your academic influence."

Perfect for:

- Research authors
- PhD students
- Journal editors
- Research institutions
- Grant committees

#ResearchImpact #AcademicAI #ScienceMetrics #ResearchExcellence

1 reply

reacted to MonsterMMORPG's post with 🔥 2 days ago

Post

1165

Most Powerful Vision Model CogVLM 2 now works amazing on Windows with new Triton pre-compiled wheels - 19 Examples - Locally tested with 4-bit quantization - Second example is really wild - Can be used for image captioning or any image vision task

The APP and the installers : https://www.patreon.com/posts/120193330

Check below screenshots to see how to use it

Currently the APP runs very fast with 4-bit quantization

I am looking into lowering VRAM usage even further, for example by adding CPU offloading, if possible

Previously we were lacking Triton, but it now works perfectly

My installer installs into a Python 3.10 VENV completely isolated and clean

You can see entire APP and installer source code

If you get a Triton error, make sure to delete your Triton cache after installing the app, at a path like the one below

C:\Users\Furkan\.triton
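For anyone who prefers to script the cache cleanup, here is a minimal sketch. It assumes the cache is the `.triton` folder inside the current user's home directory, which is an interpretation of the path above; the helper name `clear_triton_cache` is hypothetical.

```python
from pathlib import Path
import shutil


def clear_triton_cache(cache_dir=None):
    """Delete the Triton cache folder; return True if something was removed.

    If cache_dir is None, fall back to "<home>/.triton" (an assumed location).
    """
    target = Path(cache_dir) if cache_dir is not None else Path.home() / ".triton"
    if not target.is_dir():
        return False  # Nothing to delete.
    shutil.rmtree(target)
    return True
```

Calling `clear_triton_cache()` with no argument targets the assumed default location; pass an explicit path if your cache lives elsewhere.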

Hugging Face repo with sample code : THUDM/cogvlm2-llama3-chat-19B

GitHub repo : https://github.com/THUDM/CogVLM2

Triton Windows : https://github.com/woct0rdho/triton-windows/releases

upvoted an article 2 days ago

Article

Welcome Gemma 2 - Google's new open LLM

Jun 27, 2024

• 125

upvoted a paper 3 days ago

OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking

Paper • 2501.09751 • Published 4 days ago • 39

liked a model 3 days ago

Qwen/Qwen2.5-Coder-14B-Instruct

Text Generation • Updated 9 days ago • 20.5k • 72

upvoted a collection 3 days ago

Qwen2.5-Coder

Collection

Code-specific model series based on Qwen2.5 • 40 items • Updated Nov 28, 2024 • 261

upvoted a paper 3 days ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published 8 days ago • 83

commented on Gradio spaces are the perfect agent tools! 3 days ago

wow, great idea, i remember a similar project called something something agi from back when connecting tools to llms was the trend.

upvoted an article 3 days ago

Article

Gradio spaces are the perfect agent tools!

4 days ago

• 12

reacted to merve's post with 🔥🤗❤️ 3 days ago

Post

2081

Everything that happened this week in open AI, a recap 🤠 merve/jan-17-releases-678a673a9de4a4675f215bf5

👀 Multimodal
- MiniCPM-o 2.6 is a new sota any-to-any model by OpenBMB (vision, speech and text!)
- VideoChat-Flash-Qwen2.5-2B is a new video multimodal model by OpenGVLab that comes in sizes 2B & 7B and resolutions 224 & 448
- ByteDance released larger SA2VA that comes in 26B parameters
- Dataset: VRC-Bench is a new diverse benchmark for multimodal LLM reasoning performance

💬 LLMs
- MiniMax-Text-01 is a new huge language model (456B total, 45.9B active params) by MiniMaxAI with context length of 4M tokens 🤯
- Dataset: Sky-T1-data-17k is a diverse dataset used to train Sky-T1-32B
- kyutai released Helium-1-Preview-2B, a new small multilingual LM
- Wayfarer-12B is a new LLM able to write D&D 🧙🏻‍♂️
- ReaderLM-v2 is a new HTML parsing model by Jina AI
- Dria released Dria-Agent-a-3B, a new agentic coding model (Pythonic function calling) based on Qwen2.5 Coder
- Unsloth released Phi-4, faster and memory efficient Llama 3.3

🖼️ Vision
- MatchAnything is a new foundation model for matching
- FitDit is a high-fidelity VTON model based on DiT architecture

🗣️ Audio
- OuteTTS-0.3-1B is a new multilingual text-to-speech model with voice cloning and emotion control capabilities

📖 Retrieval
- lightblue released LB-reranker-0.5B-v1.0, a new reranker based on Qwen2.5 that can handle 95+ languages
- cde-small-v2 is a new sota small retrieval model by @jxm

upvoted 2 papers 4 days ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 6 days ago • 263

DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Paper • 2401.14196 • Published Jan 25, 2024 • 49