Article LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family 6 days ago • 59
Router-Suggest: Dynamic Routing for Multimodal Auto-Completion in Visually-Grounded Dialogs Paper • 2601.05851 • Published 16 days ago • 2
Bolmo: Byteifying the Next Generation of Language Models Paper • 2512.15586 • Published Dec 17, 2025 • 17
Article Why You Should Care About Partial Differential Equations (PDEs) Dec 12, 2025 • 39
Post 4007 you gotta go fast and go read the latest blog by @ror et al. explaining Continuous Batching in depth: https://huggingface.co/blog/continuous_batching
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published Nov 9, 2025 • 133
Article LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR Oct 23, 2025 • 70
Article Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training Aug 8, 2025 • 91
∇NABLA: Neighborhood Adaptive Block-Level Attention Paper • 2507.13546 • Published Jul 17, 2025 • 125
Article Understanding Gemma 3n: How MatFormer Gives You Many Models in One Jun 26, 2025 • 49
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding Paper • 2506.16035 • Published Jun 19, 2025 • 88
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning Paper • 2505.16933 • Published May 22, 2025 • 34