RSD models - a jaeh8nkim Collection

RSD models

updated Sep 30, 2025

jaeh8nkim/s1K1p1-Distillepoch15-Qwen3-0.6B

0.6B • Updated Jul 22, 2025 • 1

Note Qwen3-0.6B + s1K-1.1

jaeh8nkim/s1Kteacher8KUP-Distill-Qwen3-0.6B

0.6B • Updated Aug 29, 2025 • 2

Note Qwen3-0.6B + Teacher-generated

jaeh8nkim/s1Kstudent8KUP-Distill-Qwen3-0.6B

0.6B • Updated Aug 28, 2025 • 3

Note Qwen3-0.6B + Self-distill

jaeh8nkim/s1Kskd8KUP-Distill-Qwen3-0.6B

0.6B • Updated Sep 10, 2025 • 2

Note Qwen3-0.6B + SKD-inspired

jaeh8nkim/s1K4Q3p6BUPFTstep1prob10-Distillepoch15-Qwen3-0.6B

0.6B • Updated Aug 1, 2025 • 2

Note Qwen3-0.6B + RSD-generated (p_th=10%)

jaeh8nkim/s4Qp6UPst1pr3ep15-Distill-Qwen3-0.6B

0.6B • Updated Aug 17, 2025 • 1

Note Qwen3-0.6B + RSD-generated (p_th=3%)

jaeh8nkim/s1K4Q3p6Bs1p17BtUPFTstep1epoch15-Distill-Qwen3-0.6B

0.6B • Updated Jul 19, 2025 • 1

Note Qwen3-0.6B + RSD-generated (p_th=1%)

jaeh8nkim/s4Qp6UPst1pr03ep15-Distill-Qwen3-0.6B

0.6B • Updated Aug 20, 2025 • 1

Note Qwen3-0.6B + RSD-generated (p_th=0.3%)

jaeh8nkim/s1K4Q317UP-Distill-Qwen3-1.7B

2B • Updated Sep 12, 2025 • 2

Note Qwen3-1.7B + RSD-generated (p_th=1%) tailored for Qwen3-1.7B

jaeh8nkim/s1K4L321UP-Distill-Llama-3.2-1B-Instruct

1B • Updated Sep 15, 2025 • 2

Note Llama-3.2-1B-Instruct + RSD-generated (p_th=1%) tailored for Llama-3.2-1B-Instruct

jaeh8nkim/s1Kstudent203UP-Distill-Qwen3-0.6B

0.6B • Updated Sep 22, 2025 • 6

Note Qwen3-0.6B + Self-distill (203 rejection sampling attempts)
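
The Qwen3-0.6B entries above include a sweep over the RSD threshold p_th (10%, 3%, 1%, 0.3%). As a convenience for comparing those checkpoints, the sketch below collects the four repository IDs into a lookup keyed by threshold. The repository IDs and threshold values are copied from the notes above; the names RSD_QWEN3_0P6B_BY_PTH, repo_for_threshold, and load_checkpoint are hypothetical helpers introduced here. The loader assumes the checkpoints can be loaded as ordinary causal language models with the transformers library, which this page does not confirm.

```python
# Hypothetical helper for the p_th sweep listed in this collection.
# Keys are p_th in percent, as written in the collection notes.
RSD_QWEN3_0P6B_BY_PTH = {
    10.0: "jaeh8nkim/s1K4Q3p6BUPFTstep1prob10-Distillepoch15-Qwen3-0.6B",
    3.0: "jaeh8nkim/s4Qp6UPst1pr3ep15-Distill-Qwen3-0.6B",
    1.0: "jaeh8nkim/s1K4Q3p6Bs1p17BtUPFTstep1epoch15-Distill-Qwen3-0.6B",
    0.3: "jaeh8nkim/s4Qp6UPst1pr03ep15-Distill-Qwen3-0.6B",
}


def repo_for_threshold(p_th_percent: float) -> str:
    """Return the repo ID listed for a given p_th (in percent)."""
    try:
        return RSD_QWEN3_0P6B_BY_PTH[p_th_percent]
    except KeyError:
        known = sorted(RSD_QWEN3_0P6B_BY_PTH)
        raise ValueError(
            f"No checkpoint listed for p_th={p_th_percent}%; known values: {known}"
        ) from None


def load_checkpoint(repo_id: str):
    """Download and load one checkpoint from the Hub.

    Assumption: the repo holds a standard causal-LM checkpoint that
    transformers can load. Requires `pip install transformers torch`
    and network access.
    """
    # Imported lazily so the lookup above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model
```

For example, load_checkpoint(repo_for_threshold(1.0)) would fetch the p_th=1% model, provided the loading assumption holds.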