Efficiently Serving LLM Reasoning Programs with Certaindex • Paper 2412.20993 • Published Dec 30, 2024
Efficient Inference for Large Reasoning Models: A Survey • Paper 2503.23077 • Published Mar 29, 2025
Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence • Paper 2503.20533 • Published Mar 26, 2025