MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding Paper • 2507.12463 • Published Jul 16 • 26
A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality Paper • 2507.07202 • Published Jul 9 • 24
SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems Paper • 2506.07564 • Published Jun 9 • 6
mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation Paper • 2505.24073 • Published May 29
GuideSR: Rethinking Guidance for One-Step High-Fidelity Diffusion-Based Super-Resolution Paper • 2505.00687 • Published May 1
Demystifying the Visual Quality Paradox in Multimodal Large Language Models Paper • 2506.15645 • Published Jun 18 • 4