Hands-on testing of HY-World 2.0 shows a significant improvement in end-to-end engineering maturity compared to version 1.5
The model supports direct multimodal input from text, single-frame images, and video. Inference can be launched without camera intrinsic/extrinsic calibration or additional preprocessing
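To make the input contract concrete, here is a minimal, purely illustrative sketch of what such a request looks like; the field names are my assumptions, not the published HY-World 2.0 API. The point is what the request contains (one of text, image, or video) and what it deliberately omits (camera intrinsics/extrinsics, depth, or any preprocessing artifacts):

```python
# Illustrative sketch only: field names are assumptions, not the HY-World API.
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorldGenRequest:
    text: Optional[str] = None          # text prompt
    image_path: Optional[str] = None    # single reference frame
    video_path: Optional[str] = None    # short input clip
    # Deliberately absent: no intrinsics, extrinsics, depth maps,
    # or other calibration/preprocessing inputs are required.

    def validate(self) -> None:
        if not (self.text or self.image_path or self.video_path):
            raise ValueError("provide at least one of text, image, or video")


req = WorldGenRequest(image_path="ref_frame.png")
req.validate()
```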
After panorama generation, the built-in Spatial Agent automatically performs semantic navigation path planning. Combined with spatial consistency constraints from HY-WorldStereo, it ensures artifact-free multi-view generation and stable geometric alignment
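The staged flow (panorama, then agent path planning, then constrained multi-view generation) can be pictured with the skeleton below. This is a hand-written stand-in for how the stages chain together, not HY-World internals; every name in it is hypothetical:

```python
# Hypothetical skeleton of the staged flow; all names are assumptions.
from dataclasses import dataclass, field
from typing import List, Tuple

Pose = Tuple[float, float, float]  # x, y, yaw along the planned path


@dataclass
class WorldBuildState:
    panorama: str = ""                             # generated panorama
    nav_path: List[Pose] = field(default_factory=list)
    views: List[str] = field(default_factory=list)


def plan_navigation(state: WorldBuildState) -> WorldBuildState:
    # Stand-in for the Spatial Agent: choose poses along a simple path.
    state.nav_path = [(0.5 * i, 0.0, 0.0) for i in range(8)]
    return state


def render_views(state: WorldBuildState) -> WorldBuildState:
    # Stand-in for multi-view generation; per-pose consistency constraints
    # (the role HY-WorldStereo plays) would be applied at this step.
    state.views = [f"view_{i:03d}.png" for i, _ in enumerate(state.nav_path)]
    return state


state = render_views(plan_navigation(WorldBuildState(panorama="pano.png")))
print(len(state.views), "views along", len(state.nav_path), "poses")
```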
Outputs include standard 3D asset formats such as Mesh, 3DGS, and point clouds, which can be directly imported into Unity/UE
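For the mesh output in particular, a quick sanity check with a generic 3D library such as trimesh (not part of HY-World tooling; the file name below is an assumption) can catch empty or badly scaled geometry before it reaches Unity/UE:

```python
# Hedged example: inspect an exported mesh before engine import.
# "hyworld_scene.glb" is a placeholder file name, not a documented output path.
import trimesh

mesh = trimesh.load("hyworld_scene.glb", force="mesh")
print(f"vertices: {len(mesh.vertices)}, faces: {len(mesh.faces)}")
print(f"bounds: {mesh.bounds}")           # world-space extents, useful for engine scale
print(f"watertight: {mesh.is_watertight}")

# Re-export to OBJ if the engine import pipeline prefers it.
mesh.export("hyworld_scene.obj")
```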
It is suitable for engineering scenarios including game level prototyping, digital twins, and embodied simulation