Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization • Paper • 2602.23008 • Published 5 days ago • 33
Decision ConvFormer: Local Filtering in MetaFormer is Sufficient for Decision Making • Paper • 2310.03022 • Published Oct 4, 2023 • 1
Penalizing Infeasible Actions and Reward Scaling in Reinforcement Learning with Offline Data • Paper • 2507.08761 • Published Jul 11, 2025 • 1
Agent Lightning: Train ANY AI Agents with Reinforcement Learning • Paper • 2508.03680 • Published Aug 5, 2025 • 136
ReflAct: World-Grounded Decision Making in LLM Agents via Goal-State Reflection • Paper • 2505.15182 • Published May 21, 2025 • 6