Habibullah Akbar

ChavyvAkvar

https://chavyv.vercel.app

AI & ML interests

AGI, Ethical-Driven AI, Open-source AI

Recent Activity

upvoted a paper 2 days ago

1.58-bit FLUX

upvoted a paper 2 days ago

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

upvoted a paper 2 days ago

Xmodel-2 Technical Report

ChavyvAkvar's activity

upvoted 3 papers 2 days ago

upvoted a paper 3 days ago

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published 19 days ago • 44

upvoted a paper 5 days ago

Looking Inward: Language Models Can Learn About Themselves by Introspection

Paper • 2410.13787 • Published Oct 17, 2024 • 7

upvoted a paper 9 days ago

Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published 10 days ago • 41

upvoted 2 papers 11 days ago

Deliberation in Latent Space via Differentiable Cache Augmentation

Paper • 2412.17747 • Published 11 days ago • 28

RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

Paper • 2412.14922 • Published 15 days ago • 82

upvoted a paper 14 days ago

LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation

Paper • 2412.15188 • Published 15 days ago • 1

upvoted 4 papers 15 days ago

MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval

Paper • 2412.14475 • Published 16 days ago • 52

Progressive Multimodal Reasoning via Active Retrieval

Paper • 2412.14835 • Published 16 days ago • 69

Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN

Paper • 2412.13795 • Published 17 days ago • 18

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 15 days ago • 334

upvoted 2 papers 16 days ago

Compressed Chain of Thought: Efficient Reasoning Through Dense Representations

Paper • 2412.13171 • Published 17 days ago • 31

Emergence of Abstractions: Concept Encoding and Decoding Mechanism for In-Context Learning in Transformers

Paper • 2412.12276 • Published 18 days ago • 15

upvoted 3 papers 17 days ago

Llama 3 Meets MoE: Efficient Upcycling

Paper • 2412.09952 • Published 22 days ago • 1

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published 22 days ago • 80

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published 17 days ago • 91

upvoted a paper 18 days ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published 21 days ago • 136

upvoted a paper 19 days ago

Normalizing Flows are Capable Generative Models

Paper • 2412.06329 • Published 26 days ago • 8