Sarah Thompson

crimsonFalcon91

AI & ML interests

None yet

Organizations

None yet

crimsonFalcon91's activity

liked a model 14 days ago

answerdotai/ModernBERT-base

Fill-Mask • Updated 14 days ago • 104k • 624

upvoted a paper 14 days ago

DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation

Paper • 2412.18597 • Published 15 days ago • 19

upvoted 12 papers 26 days ago

3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors

Paper • 2410.16266 • Published Oct 21, 2024 • 4

xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs

Paper • 2410.16267 • Published Oct 21, 2024 • 17

Mitigating Object Hallucination via Concentric Causal Attention

Paper • 2410.15926 • Published Oct 21, 2024 • 16

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

Paper • 2410.17247 • Published Oct 22, 2024 • 45

SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes

Paper • 2410.17249 • Published Oct 22, 2024 • 41

LLM-based Optimization of Compound AI Systems: A Survey

Paper • 2410.16392 • Published Oct 21, 2024 • 14

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Paper • 2410.17250 • Published Oct 22, 2024 • 14

Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes

Paper • 2410.16930 • Published Oct 22, 2024 • 6

Improve Vision Language Model Chain-of-thought Reasoning

Paper • 2410.16198 • Published Oct 21, 2024 • 22

liked a dataset 26 days ago

foursquare/fsq-os-places

Viewer • Updated Dec 3, 2024 • 105M • 2.67k • 64

liked a model 26 days ago

meta-llama/Llama-3.3-70B-Instruct

Text Generation • Updated 18 days ago • 409k • 1.53k

upvoted 2 papers 26 days ago

Phi-4 Technical Report

Paper • 2412.08905 • Published 27 days ago • 98

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published 27 days ago • 92