2 92 28

xiang huang

xianghuang

AI & ML interests

None yet

Recent Activity

liked a dataset 15 days ago

MichiganNLP/Chumor

upvoted a paper 23 days ago

RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents

published a dataset about 2 months ago

xianghuang/Sysbank

View all activity

Organizations

None yet

liked a dataset 15 days ago

MichiganNLP/Chumor

Viewer • Updated Dec 24, 2024 • 3.34k • 40 • 7

upvoted a paper 23 days ago

RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents

Paper • 2507.03112 • Published Jul 3, 2025 • 32

published a dataset about 2 months ago

xianghuang/Sysbank

Updated Nov 17, 2025 • 2

upvoted a paper 3 months ago

ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents

Paper • 2505.23923 • Published May 29, 2025 • 8

upvoted 16 papers 4 months ago

Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning

Paper • 2505.17813 • Published May 23, 2025 • 58

AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning

Paper • 2505.11896 • Published May 17, 2025 • 58

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Paper • 2504.21776 • Published Apr 30, 2025 • 59

Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

Paper • 2505.14810 • Published May 20, 2025 • 62

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Paper • 2505.04588 • Published May 7, 2025 • 65

SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Paper • 2505.19641 • Published May 26, 2025 • 68

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5, 2025 • 79

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

Paper • 2505.07608 • Published May 12, 2025 • 82

SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26, 2025 • 92

Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

Paper • 2505.15277 • Published May 21, 2025 • 104

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28, 2025 • 131

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 188

xiang huang

AI & ML interests

Recent Activity

Organizations

xianghuang's activity