UNC NLP

university

https://nlp.cs.unc.edu/

unc-nlp

unc-nlp's activity

j-min

authored 2 papers about 2 months ago

M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding

Paper • 2411.04952 • Published Nov 7, 2024 • 28

VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement

Paper • 2411.15115 • Published Nov 22, 2024 • 9

j-min

authored 2 papers 3 months ago

Diagnostic Benchmark and Iterative Inpainting for Layout-Guided Image Generation

Paper • 2304.06671 • Published Apr 13, 2023

DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback

Paper • 2410.06215 • Published Oct 8, 2024

j-min

authored 13 papers 9 months ago

TVLT: Textless Vision-Language Transformer

Paper • 2209.14156 • Published Sep 28, 2022

Unifying Vision-and-Language Tasks via Text Generation

Paper • 2102.02779 • Published Feb 4, 2021

Self-Chained Image-Language Model for Video Localization and Question Answering

Paper • 2305.06988 • Published May 11, 2023

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Models

Paper • 2202.04053 • Published Feb 8, 2022

Fine-grained Image Captioning with CLIP Reward

Paper • 2205.13115 • Published May 26, 2022 • 1

Visual Programming for Text-to-Image Generation and Evaluation

Paper • 2305.15328 • Published May 24, 2023

VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks

Paper • 2112.06825 • Published Dec 13, 2021

VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer

Paper • 2107.02681 • Published Jul 6, 2021

EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents

Paper • 2403.12014 • Published Mar 18, 2024

Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-to-Image Generation

Paper • 2310.18235 • Published Oct 27, 2023

SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data

Paper • 2403.06952 • Published Mar 11, 2024

DOCCI: Descriptions of Connected and Contrasting Images

Paper • 2404.19753 • Published Apr 30, 2024 • 13

Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

Paper • 2404.09967 • Published Apr 15, 2024 • 21

airsplay

authored a paper about 1 year ago

DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model

Paper • 2311.09217 • Published Nov 15, 2023 • 21

j-min

authored 2 papers over 1 year ago

DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning

Paper • 2310.12128 • Published Oct 18, 2023

VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning

Paper • 2309.15091 • Published Sep 26, 2023 • 32