ClawGym: A Scalable Framework for Building Effective Claw Agents Paper • 2604.26904 • Published 3 days ago • 43
AJ-Bench: Benchmarking Agent-as-a-Judge for Environment-Aware Evaluation Paper • 2604.18240 • Published 12 days ago • 15
ClawEnvKit: Automatic Environment Generation for Claw-Like Agents Paper • 2604.18543 • Published 12 days ago • 27
ClawEnvKit Collection • Scalable Environment Generation for Claw-Like Agents • 3 items • Updated 10 days ago
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence Paper • 2604.18292 • Published 12 days ago • 81
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 17 days ago • 116
CocoaBench: Evaluating Unified Digital Agents in the Wild Paper • 2604.11201 • Published 19 days ago • 36
FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios Paper • 2604.07413 • Published 24 days ago • 95
ClawBench: Can AI Agents Complete Everyday Online Tasks? Paper • 2604.08523 • Published 23 days ago • 261
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 24 days ago • 323
Imagination Helps Visual Reasoning, But Not Yet in Latent Space Paper • 2602.22766 • Published Feb 26 • 44