Hou Pong (Ken) Chan

kenchan0226

https://kenchan0226.github.io

AI & ML interests

None yet

Recent Activity

upvoted a paper 14 days ago

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

Paper • 2511.20785 • Published 20 days ago • 150

upvoted 2 papers 22 days ago

RynnVLA-002: A Unified Vision-Language-Action and World Model

Paper • 2511.17502 • Published 24 days ago • 25

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published 26 days ago • 91

upvoted a paper 2 months ago

Large Language Models Do NOT Really Know What They Don't Know

Paper • 2510.09033 • Published Oct 10 • 16

upvoted a collection 2 months ago

LCO-Embedding Collections

Collection

2 items • Updated Oct 15 • 1

upvoted 2 papers 2 months ago

Scaling Language-Centric Omnimodal Representation Learning

Paper • 2510.11693 • Published Oct 13 • 100

GUI-KV: Efficient GUI Agents via KV Cache with Spatio-Temporal Awareness

Paper • 2510.00536 • Published Oct 1 • 6

upvoted 3 papers 3 months ago

Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training

Paper • 2509.26625 • Published Sep 30 • 43

MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources

Paper • 2509.21268 • Published Sep 25 • 103

GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric Reasoning

Paper • 2509.17437 • Published Sep 22 • 17

upvoted an article 4 months ago

RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation

Article • Aug 11

upvoted 7 papers 5 months ago

Multimedia Generative Script Learning for Task Planning

Paper • 2208.12306 • Published Aug 25, 2022 • 2

Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning

Paper • 2312.10160 • Published Dec 15, 2023 • 2

PLANET: Dynamic Content Planning in Autoregressive Transformers for Long-form Text Generation

Paper • 2203.09100 • Published Mar 17, 2022 • 1

TempoSum: Evaluating the Temporal Generalization of Abstractive Summarization

Paper • 2305.01951 • Published May 3, 2023 • 2

VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning

Paper • 2507.22607 • Published Jul 30 • 46

MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134

Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers

Paper • 2506.23918 • Published Jun 30 • 89

upvoted a paper 6 months ago

ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning

Paper • 2506.09513 • Published Jun 11 • 101

upvoted a collection 6 months ago

Lingshu MLX