LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding Paper • 2512.16229 • Published 12 days ago • 15
Fast and Accurate Causal Parallel Decoding using Jacobi Forcing Paper • 2512.14681 • Published 13 days ago • 39
DEER: Draft with Diffusion, Verify with Autoregressive Models Paper • 2512.15176 • Published 13 days ago • 41
Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight Paper • 2511.16175 • Published Nov 20 • 12
Efficient Long-context Language Model Training by Core Attention Disaggregation Paper • 2510.18121 • Published Oct 20 • 122
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published Aug 14 • 145
D2F LLaDA Instruct 8B Space • Running on Zero • 4
Diffusion LLMs Can Do Faster-Than-AR Inference via Discrete Diffusion Forcing Paper • 2508.09192 • Published Aug 8 • 30
LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks Paper • 2506.00411 • Published May 31 • 31
Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions Paper • 2505.19949 • Published May 26 • 16
Done Is Better than Perfect: Unlocking Efficient Reasoning by Structured Multi-Turn Decomposition Paper • 2505.19788 • Published May 26 • 13
MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection Paper • 2410.14731 • Published Oct 16, 2024
Improved Visual-Spatial Reasoning via R1-Zero-Like Training Paper • 2504.00883 • Published Apr 1 • 67