UniWeTok: An Unified Binary Tokenizer with Codebook Size 2^{128} for Unified Multimodal Large Language Model Paper • 2602.14178 • Published 13 days ago • 12
BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published 13 days ago • 50
BitDance Collection BitDance: Open-source autoregressive model with binary visual tokens. A research project for building powerful multimodal autoregressive models. • 11 items • Updated 6 days ago • 9
The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation Paper • 2511.20256 • Published Nov 25, 2025 • 28
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning Paper • 2507.13348 • Published Jul 17, 2025 • 79
DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling Paper • 2505.11196 • Published May 16, 2025 • 14
DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation Paper • 2410.18666 • Published Oct 24, 2024 • 19