Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models Paper • 2602.02185 • Published 8 days ago • 124
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published 12 days ago • 149
GEBench: Benchmarking Image Generation Models as GUI Environments Paper • 2602.09007 • Published 1 day ago • 31
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models Paper • 2602.02185 • Published 8 days ago • 124
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published 12 days ago • 149
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation Paper • 2601.09688 • Published 27 days ago • 126
Stochastic CHAOS: Why Deterministic Inference Kills, and Distributional Variability Is the Heartbeat of Artifical Cognition Paper • 2601.07239 • Published 30 days ago • 3
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published about 1 month ago • 211
UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision Paper • 2601.03193 • Published Jan 6 • 47
UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision Paper • 2601.03193 • Published Jan 6 • 47
Active Intelligence in Video Avatars via Closed-loop World Modeling Paper • 2512.20615 • Published Dec 23, 2025 • 9
EgoX: Egocentric Video Generation from a Single Exocentric Video Paper • 2512.08269 • Published Dec 9, 2025 • 119
Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection Paper • 2512.16905 • Published Dec 18, 2025 • 32