Max Belitsky

mbelitsky

AI & ML interests

None yet

Recent Activity

upvoted a paper 8 days ago

Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning

commented on a paper 4 months ago

What Layers When: Learning to Skip Compute in LLMs with Residual Gates

upvoted a paper 4 months ago

What Layers When: Learning to Skip Compute in LLMs with Residual Gates

Organizations

None yet

upvoted a paper 8 days ago

Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning

Paper • 2602.11149 • Published 8 days ago • 12

commented on a paper 4 months ago

What Layers When: Learning to Skip Compute in LLMs with Residual Gates

Paper • 2510.13876 • Published Oct 13, 2025 • 11

upvoted a paper 4 months ago

What Layers When: Learning to Skip Compute in LLMs with Residual Gates

Paper • 2510.13876 • Published Oct 13, 2025 • 11

authored a paper 7 months ago

KV Cache Steering for Inducing Reasoning in Small Language Models

Paper • 2507.08799 • Published Jul 11, 2025 • 40

commented on a paper 7 months ago

KV Cache Steering for Inducing Reasoning in Small Language Models

Paper • 2507.08799 • Published Jul 11, 2025 • 40

upvoted a paper 7 months ago

KV Cache Steering for Inducing Reasoning in Small Language Models

Paper • 2507.08799 • Published Jul 11, 2025 • 40

liked 4 datasets 9 months ago

upvoted a paper 9 months ago

Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation

Paper • 2505.06027 • Published May 9, 2025 • 18

liked a Space 12 months ago

The Ultra-Scale Playbook

🌌

3.7k

The ultimate guide to training LLM on large GPU Clusters

liked 3 datasets about 1 year ago

truthfulqa/truthful_qa

Viewer • Updated Jan 4, 2024 • 1.63k • 61.1k • 273

abacusai/MetaMathFewshot

Viewer • Updated Jan 17, 2024 • 395k • 140 • 27

AwesomeEmerald/OpenSpatialLogic

Viewer • Updated Apr 4, 2024 • 36 • 9 • 8

updated a dataset almost 2 years ago

mbelitsky/wikipedia_subset

Viewer • Updated May 14, 2024 • 1.04M • 13
