Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text • Paper • 2601.22975 • Published 4 days ago • 59
Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report • Paper • 2601.21051 • Published 5 days ago • 12
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models • Paper • 2512.07783 • Published Dec 8, 2025 • 38
Quark Quantized PTPC FP8 Models • Collection • PTPC models quantized by Quark • 9 items • Updated 18 days ago
Instella ✨ • Collection • Announcing Instella, a series of 3 billion parameter language models developed by AMD, trained from scratch on 128 Instinct MI300X GPUs. • 13 items • Updated Dec 5, 2025 • 10
Instella: Fully Open Language Models with Stellar Performance • Paper • 2511.10628 • Published Nov 13, 2025 • 5 • 2