Ranjay Krishna

ranjaykrishna

http://ranjaykrishna.com

ranjaykrishna's activity

authored a paper 26 days ago

Perception Tokens Enhance Visual Reasoning in Multimodal Language Models

Paper • 2412.03548 • Published Dec 4, 2024 • 17

authored 3 papers about 1 month ago

Negative Token Merging: Image-based Adversarial Feature Guidance

Paper • 2412.01339 • Published Dec 2, 2024 • 22

Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment

Paper • 2411.17188 • Published Nov 26, 2024 • 21

One Diffusion to Generate Them All

Paper • 2411.16318 • Published Nov 25, 2024 • 26

authored 2 papers 3 months ago

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Paper • 2410.14669 • Published Oct 18, 2024 • 36

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 106

authored 2 papers 5 months ago

Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model

Paper • 2408.00754 • Published Aug 1, 2024 • 21

Efficient Inference of Vision Instruction-Following Models with Elastic Cache

Paper • 2407.18121 • Published Jul 25, 2024 • 17

authored 2 papers 6 months ago

Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions

Paper • 2407.06723 • Published Jul 9, 2024 • 11

Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps

Paper • 2407.07071 • Published Jul 9, 2024 • 12

authored a paper 7 months ago

Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models

Paper • 2406.09403 • Published Jun 13, 2024 • 19