NanoVLM Speedrun

The most striking thing about the modded-nanogpt experiments is that they expose how much of deep learning is just bloat. To apply this to Vision-Language Models (VLMs), you have to stop acting like a researcher and start acting like a hacker. You aren't trying to follow academic standards; you are trying to maximize the movement of bits through silicon. We introduce NanoVLM Speedrun: a minimalist VLM recipe designed to strip away the bloat. We provide the bare-minimum components required to bridge the training and evaluation pipeline, enabling lightning-fast iteration and reproduction.

The Recipe (2026H1)

  • LLM: Qwen/Qwen3-0.6B
  • Vision Encoder: google/siglip2-so400m-patch16-naflex
  • Projector: Classic LLaVA-style 2-layer MLP
  • Training Paradigm: A streamlined two-stage approach:
    • Stage 1: Projector-only alignment (tuning the projector between vision and language).
    • Stage 2: End-to-end instruction tuning (tuning both the projector and the LLM).
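The recipe above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the repo's actual code: the hidden sizes (1152 for the SigLIP2 so400m encoder, 1024 for Qwen3-0.6B) and the `set_stage` helper are assumptions for the sketch.

```python
import torch
import torch.nn as nn

# Assumed hidden sizes for illustration:
# SigLIP2 so400m encoder -> 1152, Qwen3-0.6B -> 1024.
VISION_DIM, LLM_DIM = 1152, 1024

class Projector(nn.Module):
    """Classic LLaVA-style 2-layer MLP: Linear -> GELU -> Linear."""
    def __init__(self, vision_dim: int, llm_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, vision_features: torch.Tensor) -> torch.Tensor:
        # (batch, num_patches, vision_dim) -> (batch, num_patches, llm_dim)
        return self.net(vision_features)

def set_stage(projector: nn.Module, llm: nn.Module, stage: int) -> None:
    """Hypothetical helper mirroring the two-stage paradigm:
    stage 1 trains the projector only; stage 2 also unfreezes the LLM.
    (The vision encoder stays frozen in both stages.)"""
    for p in projector.parameters():
        p.requires_grad = True
    for p in llm.parameters():
        p.requires_grad = (stage == 2)
```

Keeping the vision encoder frozen throughout and gating only the LLM on the stage is what makes Stage 1 a cheap alignment pass: only the small MLP receives gradients.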

Data Preparation

We utilize the curated LMMs-Lab-Speedrun/Data_NanoVLM collection.
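To fetch the data locally before training, one option is the Hugging Face CLI; the repo id is taken from the collection above, while the local directory name is illustrative.

```shell
# Download the speedrun dataset to a local folder
# (requires huggingface_hub to be installed).
huggingface-cli download LMMs-Lab-Speedrun/Data_NanoVLM \
    --repo-type dataset \
    --local-dir data/nanovlm
```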

For more information about training, please refer to NanoVLM Speedrun.