SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models Paper • 2405.08317 • Published May 14, 2024
AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs Paper • 2404.16873 • Published Apr 21, 2024
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts Paper • 2402.16822 • Published Feb 26, 2024
Emerging Vulnerabilities in Frontier Models: Multi-Turn Jailbreak Attacks Paper • 2409.00137 • Published Aug 29, 2024
GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks Paper • 2409.19521 • Published Sep 29, 2024