arxiv:2504.16828
Muhammad Khalifa
mkhalifa
AI & ML interests
natural language generation, reinforcement learning
Recent Activity
upvoted a paper about 2 hours ago
Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges
liked a dataset 2 days ago
nvidia/Nemotron-Personas-Korea
updated a dataset 7 days ago
launch/thinkprm-1K-verification-cots