Papers
arxiv:2603.25823

ViGoR-Bench: How Far Are Visual Generative Models From Zero-Shot Visual Reasoners?

Published on Mar 26
· Submitted by
Vincent Han
on Apr 2
#3 Paper of the day
Authors:
,
,
,
,
,
,
,
,

Abstract

ViGoR benchmark addresses limitations in current AIGC evaluation by introducing a comprehensive framework for assessing visual generative reasoning across multiple modalities and cognitive dimensions.

AI-generated summary

Beneath the stunning visual fidelity of modern AIGC models lies a "logical desert", where systems fail tasks that require physical, causal, or complex spatial reasoning. Current evaluations largely rely on superficial metrics or fragmented benchmarks, creating a ``performance mirage'' that overlooks the generative process. To address this, we introduce ViGoR Vision-G}nerative Reasoning-centric Benchmark), a unified framework designed to dismantle this mirage. ViGoR distinguishes itself through four key innovations: 1) holistic cross-modal coverage bridging Image-to-Image and Video tasks; 2) a dual-track mechanism evaluating both intermediate processes and final results; 3) an evidence-grounded automated judge ensuring high human alignment; and 4) granular diagnostic analysis that decomposes performance into fine-grained cognitive dimensions. Experiments on over 20 leading models reveal that even state-of-the-art systems harbor significant reasoning deficits, establishing ViGoR as a critical ``stress test'' for the next generation of intelligent vision models. The demo have been available at https://vincenthancoder.github.io/ViGoR-Bench/

Community

Paper submitter

1C3F672B-4879-4D84-923D-85F4DC5F27BA

Interesting breakdown of this paper on arXivLens: https://arxivlens.com/PaperView/Details/vigor-bench-how-far-are-visual-generative-models-from-zero-shot-visual-reasoners-6197-7f3bb8bb
Covers the executive summary, detailed methodology, and practical applications.

This is an automated message from the Librarian Bot. I found the following papers similar to this paper.

The following papers were recommended by the Semantic Scholar API

Please give a thumbs up to this comment if you found it helpful!

If you want recommendations for any Paper on Hugging Face checkout this Space

You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2603.25823
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2603.25823 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2603.25823 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2603.25823 in a Space README.md to link it from this page.

Collections including this paper 1