Let ViT Speak: Generative Language-Image Pre-training Paper • 2605.00809 • Published 14 days ago • 32
NEO-unify: Building Native Multimodal Unified Models End to End Article • sensenova • Mar 5 • 161
The Smol Training Playbook 📚: The secrets to building world-class LLMs • Featured • 3.17k
Visual Representation Alignment for Multimodal Large Language Models Paper • 2509.07979 • Published Sep 9, 2025 • 84
BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions Paper • 2411.07461 • Published Nov 12, 2024 • 23