Ankit

Ajax0564

AI & ML interests

NLP

Recent Activity

upvoted an article 4 days ago

LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family

upvoted a paper 13 days ago

Router-Suggest: Dynamic Routing for Multimodal Auto-Completion in Visually-Grounded Dialogs

upvoted a paper about 1 month ago

Bolmo: Byteifying the Next Generation of Language Models

Organizations

None yet

upvoted an article 4 days ago

Article

LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family

6 days ago

upvoted a paper 13 days ago

Router-Suggest: Dynamic Routing for Multimodal Auto-Completion in Visually-Grounded Dialogs

Paper • 2601.05851 • Published 16 days ago • 2

upvoted a paper about 1 month ago

Bolmo: Byteifying the Next Generation of Language Models

Paper • 2512.15586 • Published Dec 17, 2025 • 17

upvoted an article about 1 month ago

Article

Why You Should Care About Partial Differential Equations (PDEs)

Dec 12, 2025

reacted to sergiopaniego's post with 👍 2 months ago

Post

4007

you gotta go fast and go read the latest blog by @ror et al. explaining Continuous Batching in depth

https://huggingface.co/blog/continuous_batching

upvoted an article 2 months ago

Article

Continuous batching from first principles

Nov 25, 2025 • 311

upvoted a paper 2 months ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133

upvoted an article 3 months ago

Article

LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR

Oct 23, 2025

upvoted 2 papers 4 months ago

AToken: A Unified Tokenizer for Vision

Paper • 2509.14476 • Published Sep 17, 2025 • 36

SAIL-VL2 Technical Report

Paper • 2509.14033 • Published Sep 17, 2025 • 44

upvoted an article 6 months ago

Article

Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training

Aug 8, 2025

upvoted 2 papers 6 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 316

∇NABLA: Neighborhood Adaptive Block-Level Attention

Paper • 2507.13546 • Published Jul 17, 2025 • 125

upvoted an article 6 months ago

Article

Understanding Gemma 3n: How MatFormer Gives You Many Models in One

Jun 26, 2025

upvoted 3 papers 7 months ago

Kwai Keye-VL Technical Report

Paper • 2507.01949 • Published Jul 2, 2025 • 130

Ovis-U1 Technical Report

Paper • 2506.23044 • Published Jun 29, 2025 • 61

Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding

Paper • 2506.16035 • Published Jun 19, 2025 • 88

upvoted an article 7 months ago

Article

Learn the Hugging Face Kernel Hub in 5 Minutes

Jun 12, 2025 • 151

upvoted 2 papers 8 months ago

LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning

Paper • 2505.16933 • Published May 22, 2025 • 34

MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21, 2025 • 97
