# Motus: A Unified Latent Action World Model

Paper: [arXiv:2512.13030](https://arxiv.org/abs/2512.13030)
Stage 1 pretrained WAN 2.2 (5B) video generation model for Motus. This checkpoint provides the video generation backbone, trained on multi-robot task trajectories, synthetic robot data, and egocentric human videos.
Homepage | GitHub | arXiv | Feishu | WeChat
### Model Specifications

| Component | Specification |
|---|---|
| Base Model | WAN 2.2 |
| Parameters | 5B |
| Precision | bfloat16 |

### Hardware Requirements

| Mode | VRAM | Example GPU |
|---|---|---|
| Inference | ~16 GB | RTX 4090 |
| Fine-Tuning | ~40 GB | A100 (40 GB) |
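As a quick sanity check before running inference or fine-tuning, you can compare your GPU's total memory against the table above. A minimal PyTorch sketch (the ~16 GB / ~40 GB thresholds are taken from the table, not from the Motus codebase):

```python
import torch

# Query total VRAM of GPU 0 and compare against the Hardware Requirements table.
if torch.cuda.is_available():
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    print(f"GPU 0: {torch.cuda.get_device_name(0)}, {total_gb:.1f} GB VRAM")
    print("Inference (~16 GB):", "OK" if total_gb >= 16 else "insufficient")
    print("Fine-tuning (~40 GB):", "OK" if total_gb >= 40 else "insufficient")
else:
    print("No CUDA device detected.")
```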
Update your Motus config file (e.g., `configs/robotwin.yaml`):

```yaml
model:
  wan:
    checkpoint_path: "./pretrained_models/Motus_Wan2_2_5B_pretrain"  # This checkpoint
    config_path: "./pretrained_models/Motus_Wan2_2_5B_pretrain"
    vae_path: "./pretrained_models/Wan2.2-TI2V-5B/Wan2.2_VAE.pth"  # Local VAE (not included)
    precision: "bfloat16"
```
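To catch path mistakes early, you can load the config and verify that the checkpoint directory and the VAE file both exist. A minimal sketch (the keys match the YAML above, but this validation helper is not part of the Motus codebase):

```python
from pathlib import Path

import yaml

# Load the Motus config and sanity-check the paths from the snippet above.
with open("configs/robotwin.yaml") as f:
    cfg = yaml.safe_load(f)

wan = cfg["model"]["wan"]
assert Path(wan["checkpoint_path"]).is_dir(), "checkpoint directory missing"
# The WAN VAE ships separately (see the note below), so check it explicitly.
assert Path(wan["vae_path"]).is_file(), "Wan2.2_VAE.pth not found; download it separately"
print("Config paths look good; precision =", wan["precision"])
```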
```bash
# Using Hugging Face CLI
huggingface-cli download motus-robotics/Motus_Wan2_2_5B_pretrain --local-dir ./pretrained_models/Motus_Wan2_2_5B_pretrain

# Or using Git LFS
git lfs install
git clone https://huggingface.co/motus-robotics/Motus_Wan2_2_5B_pretrain
```
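The same download can also be scripted from Python with `huggingface_hub` (a small sketch; `snapshot_download` is the standard hub API, and the target directory mirrors the CLI command above):

```python
from huggingface_hub import snapshot_download

# Programmatic equivalent of the CLI download above.
snapshot_download(
    repo_id="motus-robotics/Motus_Wan2_2_5B_pretrain",
    local_dir="./pretrained_models/Motus_Wan2_2_5B_pretrain",
)
```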
The WAN VAE (`Wan2.2_VAE.pth`) is not included in this repository. You need to:

1. Download `Wan2.2_VAE.pth` separately (e.g., from the Wan2.2-TI2V-5B release, matching the path in the config above)
2. Set `vae_path` in your config to point to the local VAE file

### Citation

```bibtex
@misc{bi2025motusunifiedlatentaction,
      title={Motus: A Unified Latent Action World Model},
      author={Hongzhe Bi and Hengkai Tan and Shenghao Xie and Zeyuan Wang and Shuhe Huang and Haitian Liu and Ruowen Zhao and Yao Feng and Chendong Xiang and Yinze Rong and Hongyan Zhao and Hanyu Liu and Zhizhong Su and Lei Ma and Hang Su and Jun Zhu},
      year={2025},
      eprint={2512.13030},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2512.13030},
}
```