PUSA V1.0: Surpassing Wan-I2V with $500 Training Cost by Vectorized Timestep Adaptation Paper • 2507.16116 • Published Jul 22, 2025 • 13
GSFixer: Improving 3D Gaussian Splatting with Reference-Guided Video Diffusion Priors Paper • 2508.09667 • Published Aug 13, 2025 • 6
MV-Performer: Taming Video Diffusion Model for Faithful and Synchronized Multi-view Performer Synthesis Paper • 2510.07190 • Published Oct 8, 2025 • 1
SKEL-CF: Coarse-to-Fine Biomechanical Skeleton and Surface Mesh Recovery Paper • 2511.20157 • Published Nov 25, 2025 • 3
EmoCAST: Emotional Talking Portrait via Emotive Text Description Paper • 2508.20615 • Published Aug 28, 2025
MLLM-4D: Towards Visual-based Spatial-Temporal Intelligence Paper • 2603.00515 • Published Feb 28 • 2
CutClaw: Agentic Hours-Long Video Editing via Music Synchronization Paper • 2603.29664 • Published 4 days ago • 44
PersonaLive! Expressive Portrait Image Animation for Live Streaming Paper • 2512.11253 • Published Dec 12, 2025 • 40
GenCompositor: Generative Video Compositing with Diffusion Transformer Paper • 2509.02460 • Published Sep 2, 2025 • 26
LivelySpeaker: Towards Semantic-Aware Co-Speech Gesture Generation Paper • 2309.09294 • Published Sep 17, 2023
Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos Paper • 2304.01186 • Published Apr 3, 2023
ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models Paper • 2310.07702 • Published Oct 11, 2023
High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net Paper • 2308.14221 • Published Aug 27, 2023 • 1
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation Paper • 2211.12194 • Published Nov 22, 2022
VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild Paper • 2211.14758 • Published Nov 27, 2022 • 2
T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations Paper • 2301.06052 • Published Jan 15, 2023
DPE: Disentanglement of Pose and Expression for General Video Portrait Editing Paper • 2301.06281 • Published Jan 16, 2023
Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models Paper • 2407.10285 • Published Jul 14, 2024 • 5