Longxu Dou

dreamerdeo

https://longxudou.github.io/

AI & ML interests

Natural Language Processing

Recent Activity

upvoted a paper 3 days ago

On Data Engineering for Scaling LLM Terminal Capabilities

Paper • 2602.21193 • Published 3 days ago • 87

liked a dataset 5 days ago

zai-org/terminal-bench-2-verified

Updated about 11 hours ago • 5.85k • 58

upvoted a paper 5 days ago

CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models

Paper • 2602.17684 • Published 23 days ago • 21

upvoted a paper 23 days ago

Rethinking the Trust Region in LLM Reinforcement Learning

Paper • 2602.04879 • Published 23 days ago • 35

liked a dataset 3 months ago

Danau5tin/terminal-tasks

Viewer • Updated Sep 12, 2025 • 331 • 15 • 7

upvoted a paper 3 months ago

LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

Paper • 2307.13269 • Published Jul 25, 2023 • 34

authored 2 papers 3 months ago

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5, 2025 • 129

Training Optimal Large Diffusion Language Models

Paper • 2510.03280 • Published Sep 28, 2025

upvoted 3 papers 5 months ago

upvoted a collection 5 months ago

cwm

Collection

Collection for Code World Model, an agentic coding model from FAIR. • 3 items • Updated Sep 24, 2025 • 18

upvoted a paper 8 months ago

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published Jun 25, 2025 • 47

updated a Space 8 months ago

README

💻

upvoted 5 papers 9 months ago

Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27, 2025 • 26

Fostering Video Reasoning via Next-Event Prediction

Paper • 2505.22457 • Published May 28, 2025 • 29

Lifelong Safety Alignment for Language Models

Paper • 2505.20259 • Published May 26, 2025 • 24

Optimizing Anytime Reasoning via Budget Relative Policy Optimization

Paper • 2505.13438 • Published May 19, 2025 • 36

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19, 2025 • 45

upvoted an article 10 months ago

Article

Accelerating LLM Inference: Fast Sampling with Gumbel-Max Trick

Oct 24, 2024
