Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published 8 days ago • 176
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning Paper • 2601.09667 • Published Jan 14 • 91
DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL Paper • 2509.10446 • Published Sep 12, 2025 • 2
Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards Paper • 2601.06021 • Published Jan 9 • 47
DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents Paper • 2602.07035 • Published 16 days ago • 30
TodoEvolve: Learning to Architect Agent Planning Systems Paper • 2602.07839 • Published 11 days ago • 6
VideoWorld 2: Learning Transferable Knowledge from Real-world Videos Paper • 2602.10102 • Published 9 days ago • 14
VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model Paper • 2602.10098 • Published 9 days ago • 17
Dr. MAS: Stable Reinforcement Learning for Multi-Agent LLM Systems Paper • 2602.08847 • Published 10 days ago • 24
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning Paper • 2602.08234 • Published 10 days ago • 65
GISA: A Benchmark for General Information-Seeking Assistant Paper • 2602.08543 • Published 10 days ago • 26
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery Paper • 2602.08990 • Published 10 days ago • 69
QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining Paper • 2602.07085 • Published 13 days ago • 181
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published 10 days ago • 256
Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning Paper • 2602.07845 • Published 11 days ago • 68
AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents Paper • 2602.06855 • Published 13 days ago • 70
LLaDA2.1: Speeding Up Text Diffusion via Token Editing Paper • 2602.08676 • Published 10 days ago • 66
F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare Paper • 2602.06717 • Published 13 days ago • 71