SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14, 2025 • 158
HelpSteer2-Preference: Complementing Ratings with Preferences Paper • 2410.01257 • Published Oct 2, 2024 • 25
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 9 days ago • 156
NaturalFunctions Collection LLMs fine tuned for function calling 🤖 • 2 items • Updated Jan 28, 2024 • 3
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5, 2025 • 249
Article SmolLM - blazingly fast and remarkably powerful +1 loubnabnl, anton-l, eliebak • Jul 16, 2024 • 455
Article 🧑‍⚖️ "Replacing Judges with Juries" using distilabel alvarobartt • May 3, 2024 • 17
Article CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models +14 r34p3r1321, csahana95, liyueam10, cynikolai, dwjsong, simonwan, fa7pdn, is-eqv, yaohway, dhavalkapil, dmolnar, spencerwmeta, jdsaxe, vontimitta, carljparker, clefourrier • May 24, 2024 • 22
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts Paper • 2401.04081 • Published Jan 8, 2024 • 74
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs Paper • 2307.16789 • Published Jul 31, 2023 • 102