SPEAR Collection Checkpoints "Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning" arxiv [2509.22601] • 14 items • Updated Dec 4, 2025 • 2
SmartSnap Collection Data and Checkpoints of "SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents" [arxiv.org/abs/2512.22322] • 7 items • Updated 25 days ago • 3
Youtu-Agent RL Collection The checkpoints of the models trained with Youtu-Agent RL for Code/Math and Search tasks. • 3 items • Updated 13 days ago • 3
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization Paper • 2512.24615 • Published 24 days ago • 117 • 5
SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents Paper • 2512.22322 • Published 28 days ago • 39 • 5