140 700 576

Adam Molnar

lunarflu

AI & ML interests

join the Hugging Face discord! hf.co/discord/join

Recent Activity

liked a Space 1 day ago

khouraisan/fumo-classifier

upvoted an article 3 days ago

🦸🏻#2: Your Go-To Vocabulary to Navigate the World of AI Agents and Agentic Workflows

upvoted an article 3 days ago

Fine-tune ModernBERT for text classification using synthetic data

View all activity

Organizations

lunarflu's activity

replied to Xenova's post 7 days ago

waiting for moonshine-distilled next :)

reacted to Xenova's post with 🚀🔥❤️ 7 days ago

Post

2690

Introducing Moonshine Web: real-time speech recognition running 100% locally in your browser!
🚀 Faster and more accurate than Whisper
🔒 Privacy-focused (no data leaves your device)
⚡️ WebGPU accelerated (w/ WASM fallback)
🔥 Powered by ONNX Runtime Web and Transformers.js

Demo: webml-community/moonshine-web
Source code: https://github.com/huggingface/transformers.js-examples/tree/main/moonshine-web

5 replies

reacted to ginipick's post with 🔥 9 days ago

Post

4503

🎬 Revolutionize Your Video Creation
Dokdo Multimodal AI Transform a single image into a stunning video with perfect audio harmony! 🚀

Superior Technology 💫
Advanced Flow Matching: Smoother video transitions surpassing Kling and Sora
Intelligent Sound System: Automatically generates perfect audio by analyzing video mood
Multimodal Framework: Advanced AI integrating image, text, and audio analysis
Outstanding Performance 🎯
Ultra-High Resolution: 4K video quality with bfloat16 acceleration
Real-Time Optimization: 3x faster processing with PyTorch GPU acceleration
Smart Sound Matching: Real-time audio effects based on scene transitions and motion
Exceptional Features ✨
Custom Audio Creation: Natural soundtrack matching video tempo and rhythm
Intelligent Watermarking: Adaptive watermark adjusting to video characteristics
Multilingual Support: Precise translation engine powered by Helsinki-NLP
Versatile Applications 🌟
Social Media Marketing: Create engaging shorts for Instagram and YouTube
Product Promotion: Dynamic promotional videos highlighting product features
Educational Content: Interactive learning materials with enhanced engagement
Portfolio Enhancement: Professional-grade videos showcasing your work
Experience the video revolution with Dokdo Multimodal, where anyone can create professional-quality content from a single image. Elevate your content with perfectly synchronized video and audio that captivates your audience! 🎨

Start creating stunning videos that stand out from the crowd - whether you're a marketer, educator, content creator, or business owner. Join the future of AI-powered video creation today!

ginipick/Dokdo-multimodal

#VideoInnovation #AITechnology #PremiumContent #MarketingSolution

🔊 Please turn on your sound for the best viewing experience!

1 reply

reacted to vincentg64's post with 🔥 9 days ago

Post

2194

LLM 2.0, RAG & Non-Standard Gen AI on GitHub https://mltblog.com/3DsyZSq

In this article, I share my latest Gen AI and LLM advances, featuring innovative approaches radically different from both standard AI and classical ML/NLP. The focus is on doing better with less, using efficient architectures, new algorithms and evaluation metrics. It originates from research that I started long ago. It gained significant momentum in the last two years. See background and history at https://mltblog.com/4g2sKTv.

OpenAI, Perplexity, Anthropic, Llama and others typically follow the trend and implement solutions very similar to mines within 3 to 6 months after I publish new milestones. For instance, multi-tokens, knowledge graph tokens, multi-indexes, real-time fine-tuning, mixtures of experts, LLM routers, small enterprise sub-LLMs, prompt distillation, relevancy scoring engine, deep contextual retrieval, optimum agentic chunking, and modern UI instead of the basic prompt box. I keep adding new features all the time, staying ahead of competition.

➡️ Read full article with links to GitHub, at https://mltblog.com/3DsyZSq

1 reply

reacted to merve's post with 👀🚀❤️🔥 10 days ago

Post

2731

Aya by Cohere For AI can now see! 👀

C4AI community has built Maya 8B, a new open-source multilingual VLM built on SigLIP and Aya 8B 🌱 works on 8 languages! 🗣️

The authors extend Llava dataset using Aya's translation capabilities with 558k examples!
ry it here kkr5155/maya_demo

Dataset maya-multimodal/pretrain

Model maya-multimodal/maya 👏
kudos @nahidalam and team

1 reply

replied to nyuuzyou's post 16 days ago

Unrelated but wanted to check re: spam stuff, is this your account btw? (I assumed it was an impersonator/troll but feel free to correct me)

reacted to lewtun's post with 👍🚀👀❤️🔥 18 days ago

Post

6630

We outperform Llama 70B with Llama 3B on hard math by scaling test-time compute 🔥

How? By combining step-wise reward models with tree search algorithms :)

We show that smol models can match or exceed the performance of their much larger siblings when given enough "time to think"

We're open sourcing the full recipe and sharing a detailed blog post.

In our blog post we cover:

📈 Compute-optimal scaling: How we implemented DeepMind's recipe to boost the mathematical capabilities of open models at test-time.

🎄 Diverse Verifier Tree Search (DVTS): An unpublished extension we developed to the verifier-guided tree search technique. This simple yet effective method improves diversity and delivers better performance, particularly at large test-time compute budgets.

🧭 Search and Learn: A lightweight toolkit for implementing search strategies with LLMs and built for speed with vLLM

Here's the links:

- Blog post: HuggingFaceH4/blogpost-scaling-test-time-compute

- Code: https://github.com/huggingface/search-and-learn

Enjoy!

2 replies

reacted to lorraine2's post with 🚀 18 days ago

Post

1987

🦙New NVIDIA paper: LLaMA-Mesh 🦙

We enable large language models to generate and understand 3D meshes by representing them as text and fine-tuning. This unifies the 3D and text modalities in a single model and preserves language abilities, unlocking conversational 3D creation with mesh understanding.

🔎 Project Page: https://research.nvidia.com/labs/toronto-ai/LLaMA-Mesh/
🕹️ Interactive Demo: Zhengyi/LLaMA-Mesh (courtesy of HuggingFace and Gradio)
📖 Full Paper: https://arxiv.org/abs/2411.09595
👨‍💻Code: https://github.com/nv-tlabs/LLaMa-Mesh
💾 Model Checkpoint: Zhengyi/LLaMA-Mesh
🧩 Blender Addon: https://github.com/huggingface/meshgen (courtesy of Dylan Ebert)
🎥 5-min Overview Video: https://youtu.be/eZNazN-1lPo?si=-idQa5aaceVw0Bbj (courtesy of AI Papers Academy)

reacted to YerbaPage's post with 👀 18 days ago

Post

1421

Curated list of **Repository-level Code Generation** papers & benchmarks! 🔥

Stay ahead with the latest in:
✅ Repo-level Issue Resolution
✅ Repo-level Code Completion
✅ Datasets & Benchmarks

👉 Check it out: https://github.com/YerbaPage/Awesome-Repo-Level-Code-Generation 🔥

reacted to wenhuach's post with 🔥👀 18 days ago

Post

1801

AutoRound has demonstrated strong results even at 2-bit precision for VLM models like QWEN2-VL-72B. Check it out here: OPEA/Qwen2-VL-72B-Instruct-int2-sym-inc.

4 replies