Article Where should test-time compute go? Surprisal-guided selection in verifiable environments about 6 hours ago • 1
Paying Less Generalization Tax: A Cross-Domain Generalization Study of RL Training for LLM Agents Paper • 2601.18217 • Published 13 days ago • 11
OpenSec: Measuring Incident Response Agent Calibration Under Adversarial Evidence Paper • 2601.21083 • Published 10 days ago • 1