Harold Chen

Harold328

https://haroldchen19.github.io/

HaroldChen19

AI & ML interests

Computer Vision

Organizations

None yet

upvoted 3 papers 1 day ago

Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks

Paper • 2602.01630 • Published 4 days ago • 46

Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation

Paper • 2602.02214 • Published 3 days ago • 23

RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System

Paper • 2602.02488 • Published 3 days ago • 29

upvoted 5 papers 3 days ago

Visual Personalization Turing Test

Paper • 2601.22680 • Published 7 days ago • 2

Causal World Modeling for Robot Control

Paper • 2601.21998 • Published 7 days ago • 27

Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Paper • 2601.22975 • Published 6 days ago • 79

Show, Don't Tell: Morphing Latent Reasoning into Image Generation

Paper • 2602.02227 • Published 3 days ago • 10

LoopViT: Scaling Visual ARC with Looped Transformers

Paper • 2602.02156 • Published 3 days ago • 10

upvoted 2 papers 14 days ago

Rethinking Video Generation Model for the Embodied World

Paper • 2601.15282 • Published 15 days ago • 42

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published 18 days ago • 190

upvoted a paper 17 days ago

Future Optical Flow Prediction Improves Robot Control & Video Generation

Paper • 2601.10781 • Published 21 days ago • 19

upvoted 4 papers 19 days ago

Inference-time Physics Alignment of Video Generative Models with Latent World Models

Paper • 2601.10553 • Published 21 days ago • 12

Action100M: A Large-scale Video Action Dataset

Paper • 2601.10592 • Published 21 days ago • 28

Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

Paper • 2601.10611 • Published 21 days ago • 28

CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation

Paper • 2601.10061 • Published 22 days ago • 30

upvoted a paper 22 days ago

ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands

Paper • 2512.24965 • Published Dec 31, 2025 • 42

upvoted 4 papers 24 days ago

VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction

Paper • 2601.05966 • Published 27 days ago • 23

AgentOCR: Reimagining Agent History via Optical Self-Compression

Paper • 2601.04786 • Published 28 days ago • 29

The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Paper • 2601.06002 • Published 27 days ago • 51

Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

Paper • 2601.05432 • Published 28 days ago • 166