Dhruv Diddi PRO

ddiddi

AI & ML interests

None yet

Recent Activity

new activity 13 days ago

Xenova/codegen-350M-mono:Transformers.js v3 ONNX weights

upvoted a paper 17 days ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

liked a Space 17 days ago

akhaliq/anychat

ddiddi's activity

upvoted a paper 17 days ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published 21 days ago • 136

upvoted a paper 19 days ago

Large Action Models: From Inception to Implementation

Paper • 2412.10047 • Published 22 days ago • 31

upvoted 2 papers 21 days ago

PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations

Paper • 2412.05994 • Published 26 days ago • 17

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published 22 days ago • 92

upvoted a paper 23 days ago

MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views

Paper • 2412.06767 • Published 25 days ago • 6

upvoted a paper 25 days ago

VisionZip: Longer is Better but Not Necessary in Vision Language Models

Paper • 2412.04467 • Published 29 days ago • 105

upvoted 2 papers 28 days ago

SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance

Paper • 2412.02687 • Published Dec 3, 2024 • 108

PaliGemma 2: A Family of Versatile VLMs for Transfer

Paper • 2412.03555 • Published about 1 month ago • 119

upvoted 4 papers about 1 month ago

UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing

Paper • 2411.16781 • Published Nov 25, 2024 • 10

AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset

Paper • 2411.15640 • Published Nov 23, 2024 • 4

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published Nov 26, 2024 • 47

Material Anything: Generating Materials for Any 3D Object via Diffusion

Paper • 2411.15138 • Published Nov 22, 2024 • 42

upvoted a paper about 2 months ago

HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems

Paper • 2411.02959 • Published Nov 5, 2024 • 64

upvoted 5 papers 2 months ago

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22, 2024 • 89

Can Knowledge Editing Really Correct Hallucinations?

Paper • 2410.16251 • Published Oct 21, 2024 • 54

ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting

Paper • 2410.17856 • Published Oct 23, 2024 • 49

LLM-based Optimization of Compound AI Systems: A Survey

Paper • 2410.16392 • Published Oct 21, 2024 • 14

Aligning Large Language Models via Self-Steering Optimization

Paper • 2410.17131 • Published Oct 22, 2024 • 21

upvoted 2 papers 3 months ago

Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts

Paper • 2410.10626 • Published Oct 14, 2024 • 38

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Paper • 2410.10819 • Published Oct 14, 2024 • 6