CoPE-VideoLM: Codec Primitives For Efficient Video Language Models
Paper • 2602.13191 • Published • 29
3D Computer Vision, Semantic Understanding, SLAM, Multi-modal Interactions, Spatiotemporal Reasoning, VLMs/LLMs
GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer
ReSpace: Text-Driven 3D Scene Synthesis and Editing with Preference Alignment