Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning
Paper • 2601.09667
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs