Bridging the Data Provenance Gap Across Text, Speech and Video Paper • 2412.17847 • Published 16 days ago • 7 • 2
Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models Paper • 2412.02980 • Published Dec 4, 2024 • 12 • 3
$\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens Paper • 2402.13718 • Published Feb 21, 2024 • 1 • 2
Consent in Crisis: The Rapid Decline of the AI Data Commons Paper • 2407.14933 • Published Jul 20, 2024 • 12 • 3
The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI Paper • 2310.16787 • Published Oct 25, 2023 • 5 • 2
Data Authenticity, Consent, & Provenance for AI are all broken: what will it take to fix them? Paper • 2404.12691 • Published Apr 19, 2024 • 2 • 2
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published Apr 29, 2024 • 68 • 3
Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning Paper • 2404.12897 • Published Apr 19, 2024 • 1 • 2
Augmenting Language Models with Long-Term Memory Paper • 2306.07174 • Published Jun 12, 2023 • 18 • 5
Ring Attention with Blockwise Transformers for Near-Infinite Context Paper • 2310.01889 • Published Oct 3, 2023 • 10 • 3
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks Paper • 2403.00522 • Published Mar 1, 2024 • 44 • 4