Beichen Zhang

ToheartZhang

https://scholar.google.com/citations?user=GWyWUt4AAAAJ&hl=en#/

AI & ML interests

LLM for Reasoning

Organizations

None yet

Recent Activity

upvoted a paper 1 day ago

SWE-Universe: Scale Real-World Verifiable Environments to Millions

Paper • 2602.02361 • Published 2 days ago • 46

upvoted 2 papers 6 months ago

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26, 2025 • 158

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 316

New activity in a-m-team/AM-DeepSeek-R1-0528-Distilled 8 months ago

Discovered a very strange problem

🔥 1

#5 opened 8 months ago by Night-Quiet

commented a paper 10 months ago

Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models

Paper • 2503.21380 • Published Mar 27, 2025 • 38

upvoted a paper 10 months ago

Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models

Paper • 2503.21380 • Published Mar 27, 2025 • 38

authored a paper 11 months ago

An Empirical Study on Eliciting and Improving R1-like Reasoning Models

Paper • 2503.04548 • Published Mar 6, 2025 • 9

updated a dataset about 1 year ago

RUC-AIBOX/STILL-3-Preview-RL-Data

Viewer • Updated Jan 26, 2025 • 29.9k • 55 • 14

New activity in ToheartZhang/JiuZhang3.0-Synthesis-7B over 1 year ago

The prompts is 404

#1 opened over 1 year ago by MengboZhou

updated 4 models over 1 year ago

updated 2 collections over 1 year ago

JiuZhang3.0

Collection

A series of models for math reasoning. • 4 items • Updated May 26, 2024

JiuZhang3.0-Corpus