Unlocking On-Policy Distillation for Any Model Family (Space): improve model performance by transferring knowledge between different model families.
makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch (Article, May 7, 2024).
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B (Text Generation model, 33B parameters, updated Feb 24, 2025).