ServiceNow-AI/OpenResearcher-Dataset-SDFT-tools-system-prompt Viewer • Updated 12 days ago • 30k • 57 • 2
ServiceNow-AI/OpenResearcher-Dataset-rlvr_format-long-tools-system-prompt Viewer • Updated 12 days ago • 15k • 34 • 1
Article AprielGuard: A Guardrail for Safety and Adversarial Robustness in Modern LLM Systems Dec 23, 2025 • 48
GRAFT: GRaPH and Table Reasoning for Textual Alignment -- A Benchmark for Structured Instruction Following and Visual Reasoning Paper • 2508.15690 • Published Aug 21, 2025 • 8
Article SyGra: The One-Stop Framework for Building Data for LLMs and SLMs Sep 22, 2025 • 13
AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs Paper • 2509.08031 • Published Sep 9, 2025 • 21
How to Train Your LLM Web Agent: A Statistical Diagnosis Paper • 2507.04103 • Published Jul 5, 2025 • 52
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge Paper • 2411.19799 • Published Nov 29, 2024 • 16
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation Paper • 2504.07072 • Published Apr 9, 2025 • 9