MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models · 8 authors · submitted by akhaliq
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions · 30 authors · submitted by akhaliq
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness · 5 authors · submitted by akhaliq
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction · 9 authors · submitted by akhaliq
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction · 5 authors · submitted by akhaliq
The Imperative of Conversation Analysis in the Era of LLMs: A Survey of Tasks, Techniques, and Trends · 6 authors · submitted by AIRobotZ
Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling · 3 authors · submitted by davanstrien
Disco4D: Disentangled 4D Human Generation and Animation from a Single Image · 6 authors · submitted by akhaliq
Robot See Robot Do: Imitating Articulated Object Manipulation with Monocular 4D Reconstruction · 7 authors · submitted by akhaliq
Enhancing Structured-Data Retrieval with GraphRAG: Soccer Data Case Study · 5 authors · submitted by SushantGautam