Model Overview
This repository hosts the pretrained parameters for the SuperMat project, as described in "SuperMat: Physically Consistent PBR Material Estimation at Interactive Rates" (ICCV 2025).
| Model | Description |
|---|---|
| supermat.pth | Base SuperMat model for material decomposition |
| supermat_mv.pth | Multi-view version of SuperMat processing six orthogonal views |
| uv_refine_bc.pth | UV refinement network for albedo materials |
| uv_refine_rm.pth | UV refinement network for roughness & metallic materials |
All models are built upon the base model stabilityai/stable-diffusion-2-1.
Note: The official stabilityai/stable-diffusion-2-1 model has been removed. You may need to obtain the base model parameters through alternative sources, such as sd2-community/stable-diffusion-2-1.
Model Details
SuperMat (supermat.pth)
The core model for material decomposition. It takes an RGBA image as input and estimates the PBR material maps of the target object.
SuperMat Multi-View (supermat_mv.pth)
An extended version that processes six orthogonal views simultaneously. This model leverages multi-view consistency for improved material estimation. For each view, the camera-to-world (c2w) matrix is provided as camera embeddings.
UV Refinement Networks
Two specialized networks for refining UV maps:
- uv_refine_bc.pth: Refines the UV map for albedo materials
- uv_refine_rm.pth: Refines the UV map for roughness & metallic materials
Download & Usage
Download the desired model(s) from this repository and place them in the checkpoints folder:
checkpoints/
├── supermat.pth
├── supermat_mv.pth
├── uv_refine_bc.pth
└── uv_refine_rm.pth
The models are independent of each other, so you only need to download those required for your specific inference task.
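Because the models are independent, the download can be scripted so that only the missing files are fetched. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the repository ID is `oyiya/SuperMat` (an assumption based on this page's repository name; adjust it if the files are hosted elsewhere). The helper names `missing_checkpoints` and `fetch` are hypothetical and not part of the SuperMat code.

```python
from pathlib import Path

# File names as listed in the model table above.
CHECKPOINTS = (
    "supermat.pth",
    "supermat_mv.pth",
    "uv_refine_bc.pth",
    "uv_refine_rm.pth",
)


def missing_checkpoints(wanted, folder="checkpoints"):
    """Return the names in `wanted` that are not yet present in `folder`."""
    root = Path(folder)
    return [name for name in wanted if not (root / name).is_file()]


def fetch(wanted, folder="checkpoints", repo_id="oyiya/SuperMat"):
    """Download only the missing files. `repo_id` is an assumption."""
    # Imported lazily so the presence check works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    for name in missing_checkpoints(wanted, folder):
        hf_hub_download(repo_id=repo_id, filename=name, local_dir=folder)
```

For single-image inference, for example, `fetch(["supermat.pth"])` would download only the base checkpoint into checkpoints/.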
Input Requirements
Image Format
- SuperMat models expect RGBA images in which only the target object appears as foreground, with alpha values set to 0 for all other regions
- During inference, the input image is alpha-composited with a gray background (0.5, 0.5, 0.5)
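To preview what the model sees after compositing, the gray-background step can be reproduced offline. This is a sketch only, assuming NumPy, a float RGBA array with values in [0, 1], and straight (non-premultiplied) alpha. The inference scripts perform their own compositing, and their exact preprocessing may differ. `composite_on_gray` is a hypothetical helper name.

```python
import numpy as np


def composite_on_gray(rgba, gray=0.5):
    """Alpha-composite a float RGBA image of shape (H, W, 4) over flat gray.

    Assumes values in [0, 1] and straight (non-premultiplied) alpha.
    Returns an RGB array of shape (H, W, 3).
    """
    rgba = np.asarray(rgba, dtype=np.float32)
    rgb, alpha = rgba[..., :3], rgba[..., 3:4]
    # Standard "over" operator: foreground weighted by alpha,
    # background weighted by the remaining coverage.
    return rgb * alpha + gray * (1.0 - alpha)
```

Pixels with alpha 0 come out as (0.5, 0.5, 0.5) regardless of their stored color, which is why the background of the input RGBA image should be fully transparent.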
Resolution Preferences
- SuperMat models: 512×512 resolution (recommended)
- UV refinement networks: 1024×1024 resolution (recommended)
Multi-View Specific Requirements
For the multi-view model:
- All inputs for a single case should be organized in one folder
- Input images must follow the naming convention shown in examples/bag_rendered_6views
- Camera information is stored in meta.json (refer to the example for the required format with c2w matrices)
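When preparing custom camera data, it can help to sanity-check each c2w matrix before running inference. The layout of meta.json is defined by the example folder and is not reproduced here, so the sketch below does not read the file. It only checks a single matrix, assuming it is a 4×4 homogeneous rigid transform given as nested lists. If the example data uses a different convention, this check does not apply. `is_rigid_c2w` is a hypothetical helper name.

```python
def is_rigid_c2w(m, tol=1e-4):
    """Check that m is a 4x4 matrix [R | t; 0 0 0 1] with orthonormal R.

    Note: handedness is not checked, so a reflection also passes.
    """
    if len(m) != 4 or any(len(row) != 4 for row in m):
        return False
    # The bottom row of a homogeneous rigid transform is (0, 0, 0, 1).
    if any(abs(a - b) > tol for a, b in zip(m[3], (0.0, 0.0, 0.0, 1.0))):
        return False
    # R @ R^T should be the identity: rows are unit length
    # and mutually orthogonal.
    rows = [row[:3] for row in m[:3]]
    for i in range(3):
        for j in range(3):
            dot = sum(a * b for a, b in zip(rows[i], rows[j]))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True
```

A matrix that fails this check often indicates an accidental scale factor or a transposed layout in the exported camera data.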
Quick Inference Examples
SuperMat Single-Image
python inference_supermat.py \
--input examples/ring_rendered_2views \
--output-dir outputs \
--checkpoint checkpoints/supermat.pth \
--base-model sd2-community/stable-diffusion-2-1 \
--device cuda:0 \
--image-size 512
SuperMat Multi-View
python inference_supermat_mv.py \
--input examples/bag_rendered_6views \
--output-dir outputs_mv \
--checkpoint checkpoints/supermat_mv.pth \
--base-model sd2-community/stable-diffusion-2-1 \
--device cuda:0 \
--image-size 512 \
--num_views 6 \
--use-camera-embeds
UV Refinement (Albedo)
python inference_uv_refine.py \
--input-uv examples/axe_uv/uv_bc.png \
--input-uv-position examples/axe_uv/uv_position.png \
--input-uv-mask examples/axe_uv/uv_mask.png \
--output-dir outputs_uv_bc \
--checkpoint checkpoints/uv_refine_bc.pth \
--base-model sd2-community/stable-diffusion-2-1 \
--device cuda:0 \
--image-size 1024
For complete usage instructions, please refer to the main repository.
Citation
If you find these models useful in your research, please cite:
@inproceedings{hong2025supermat,
title={Supermat: Physically consistent pbr material estimation at interactive rates},
author={Hong, Yijia and Guo, Yuan-Chen and Yi, Ran and Chen, Yulong and Cao, Yan-Pei and Ma, Lizhuang},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={25083--25093},
year={2025}
}