vllm data PiTe: Pixel-Temporal Alignment for Large Video-Language Model Paper • 2409.07239 • Published Sep 11, 2024 • 12
PiTe: Pixel-Temporal Alignment for Large Video-Language Model Paper • 2409.07239 • Published Sep 11, 2024 • 12
speech LLaMA-Omni: Seamless Speech Interaction with Large Language Models Paper • 2409.06666 • Published Sep 10, 2024 • 56
LLaMA-Omni: Seamless Speech Interaction with Large Language Models Paper • 2409.06666 • Published Sep 10, 2024 • 56