Embarrassingly Simple Self-Distillation Improves Code Generation Paper • 2604.01193 • Published Apr 1 • 50
Austin362667/Qwen3-0.6B-MLX-bf16-python-5k-alpaca-resampled-Qwen-4B Text Generation • 0.6B • Updated Mar 16 • 55
Austin362667/python_code_instructions_5_alpaca_qwen3_4B_resampled Viewer • Updated Mar 15 • 5.01k • 37
Article Assisted Generation: a new direction toward low-latency text generation joaogante • May 11, 2023 • 78
Article A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes ybelkada, timdettmers • Aug 17, 2022 • 132
Article OpenEvolve: An Open Source Implementation of Google DeepMind's AlphaEvolve codelion • May 20, 2025 • 66
Article KV Cache from scratch in nanoVLM ariG23498, kashif, lusxvr, andito, pcuenq • Jun 4, 2025 • 119