CollabVR: Collaborative Video Reasoning with Vision-Language and Video Generation Models • Paper • 2605.08735 • Published 4 days ago • 57
Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs • Paper • 2605.09063 • Published 4 days ago • 68
LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling • Paper • 2605.08083 • Published 5 days ago • 59
AcademiClaw: When Students Set Challenges for AI Agents • Paper • 2605.02661 • Published 9 days ago • 16
From Context to Skills: Can Language Models Learn from Context Skillfully? • Paper • 2604.27660 • Published 10 days ago • 152
AI Co-Mathematician: Accelerating Mathematicians with Agentic AI • Paper • 2605.06651 • Published 6 days ago • 13
Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key • Paper • 2605.06638 • Published 6 days ago • 13
SkillOS: Learning Skill Curation for Self-Evolving Agents • Paper • 2605.06614 • Published 6 days ago • 38