Hands-on testing of HY-World 2.0 shows a significant improvement in end-to-end engineering maturity compared to version 1.5
The model supports direct multimodal input from text, single-frame images, and video. Inference can be launched without camera intrinsic/extrinsic calibration or additional preprocessing
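To make the input contract concrete, here is a minimal, purely illustrative sketch of what such a request looks like; the field names are my assumptions, not the published HY-World 2.0 API. The point is what the request contains (one of text, image, or video) and what it deliberately omits (camera intrinsics/extrinsics, depth, or any preprocessing artifacts):

```python
# Illustrative sketch only: field names are assumptions, not the HY-World API.
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorldGenRequest:
    text: Optional[str] = None          # text prompt
    image_path: Optional[str] = None    # single reference frame
    video_path: Optional[str] = None    # short input clip
    # Deliberately absent: no intrinsics, extrinsics, depth maps,
    # or other calibration/preprocessing inputs are required.

    def validate(self) -> None:
        if not (self.text or self.image_path or self.video_path):
            raise ValueError("provide at least one of text, image, or video")


req = WorldGenRequest(image_path="ref_frame.png")
req.validate()
```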
After panorama generation, the built-in Spatial Agent automatically performs semantic navigation path planning. Combined with spatial consistency constraints from HY-WorldStereo, it ensures artifact-free multi-view generation and stable geometric alignment
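The staged flow (panorama, then agent path planning, then constrained multi-view generation) can be pictured with the skeleton below. This is a hand-written stand-in for how the stages chain together, not HY-World internals; every name in it is hypothetical:

```python
# Hypothetical skeleton of the staged flow; all names are assumptions.
from dataclasses import dataclass, field
from typing import List, Tuple

Pose = Tuple[float, float, float]  # x, y, yaw along the planned path


@dataclass
class WorldBuildState:
    panorama: str = ""                             # generated panorama
    nav_path: List[Pose] = field(default_factory=list)
    views: List[str] = field(default_factory=list)


def plan_navigation(state: WorldBuildState) -> WorldBuildState:
    # Stand-in for the Spatial Agent: choose poses along a simple path.
    state.nav_path = [(0.5 * i, 0.0, 0.0) for i in range(8)]
    return state


def render_views(state: WorldBuildState) -> WorldBuildState:
    # Stand-in for multi-view generation; per-pose consistency constraints
    # (the role HY-WorldStereo plays) would be applied at this step.
    state.views = [f"view_{i:03d}.png" for i, _ in enumerate(state.nav_path)]
    return state


state = render_views(plan_navigation(WorldBuildState(panorama="pano.png")))
print(len(state.views), "views along", len(state.nav_path), "poses")
```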
Outputs include standard 3D asset formats such as Mesh, 3DGS, and point clouds, which can be directly imported into Unity/UE
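For the mesh output in particular, a quick sanity check with a generic 3D library such as trimesh (not part of HY-World tooling; the file name below is an assumption) can catch empty or badly scaled geometry before it reaches Unity/UE:

```python
# Hedged example: inspect an exported mesh before engine import.
# "hyworld_scene.glb" is a placeholder file name, not a documented output path.
import trimesh

mesh = trimesh.load("hyworld_scene.glb", force="mesh")
print(f"vertices: {len(mesh.vertices)}, faces: {len(mesh.faces)}")
print(f"bounds: {mesh.bounds}")           # world-space extents, useful for engine scale
print(f"watertight: {mesh.is_watertight}")

# Re-export to OBJ if the engine import pipeline prefers it.
mesh.export("hyworld_scene.obj")
```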
It is suitable for engineering scenarios including game level prototyping, digital twins, and embodied simulation