Transition Matching Distillation for Fast Video Generation Paper • 2601.09881 • Published 3 days ago • 14
3AM: Segment Anything with Geometric Consistency in Videos Paper • 2601.08831 • Published 4 days ago • 31
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published 6 days ago • 202
OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent Paper • 2601.07779 • Published 5 days ago • 25
VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction Paper • 2601.05966 • Published 8 days ago • 21
NitroGen: An Open Foundation Model for Generalist Gaming Agents Paper • 2601.02427 • Published 13 days ago • 41
LTX-2: Efficient Joint Audio-Visual Foundation Model Paper • 2601.03233 • Published 11 days ago • 121
KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs Paper • 2601.01046 • Published 14 days ago • 12
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published 15 days ago • 52
FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation Paper • 2512.24724 • Published 17 days ago • 6
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion Paper • 2512.17504 • Published 29 days ago • 96
Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding Paper • 2512.17220 • Published 29 days ago • 111
Spatia: Video Generation with Updatable Spatial Memory Paper • 2512.15716 • Published about 1 month ago • 32
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations Paper • 2512.21004 • Published 24 days ago • 12
StoryMem: Multi-shot Long Video Storytelling with Memory Paper • 2512.19539 • Published 26 days ago • 17
Region-Constraint In-Context Generation for Instructional Video Editing Paper • 2512.17650 • Published 29 days ago • 50
IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning Paper • 2512.15635 • Published about 1 month ago • 19
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition Paper • 2512.15603 • Published about 1 month ago • 62
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation Paper • 2512.09363 • Published Dec 10, 2025 • 71