RubiCap: Rubric-Guided Reinforcement Learning for Dense Image Captioning Paper • 2603.09160 • Published 8 days ago • 13
SkillOrchestra: Learning to Route Agents via Skill Transfer Paper • 2602.19672 • Published 23 days ago • 56
Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning Paper • 2506.04723 • Published Jun 5, 2025 • 1
R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training Paper • 2505.00358 • Published May 1, 2025 • 26
Evaluating Language Model Context Windows: A "Working Memory" Test and Inference-time Correction Paper • 2407.03651 • Published Jul 4, 2024 • 17