ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published Nov 26, 2025 • 123
Self-Hinting Language Models Enhance Reinforcement Learning Paper • 2602.03143 • Published 4 days ago • 23
Rethinking the Trust Region in LLM Reinforcement Learning Paper • 2602.04879 • Published 2 days ago • 28
Music Flamingo 🎵 • Running on A100 • 125 • Upload music or YouTube videos and ask detailed questions about them
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published 8 days ago • 82
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models Paper • 2601.23143 • Published 8 days ago • 38
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published Jan 2 • 56