Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment Paper • 2505.10597 • Published May 15
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values Paper • 2504.05535 • Published Apr 7 • 44
nvidia/Nemotron-RL-instruction_following-structured_outputs Viewer • Updated 26 days ago • 9.44k • 718 • 17
google/code_x_glue_cc_clone_detection_big_clone_bench Viewer • Updated Jan 24, 2024 • 1.73M • 4.38k • 20
TGPR: Tree-Guided Policy Refinement for Robust Self-Debugging of LLMs Paper • 2510.06878 • Published Oct 8
FML-bench: A Benchmark for Automatic ML Research Agents Highlighting the Importance of Exploration Breadth Paper • 2510.10472 • Published Oct 12 • 8
Scientific Algorithm Discovery by Augmenting AlphaEvolve with Deep Research Paper • 2510.06056 • Published Oct 7 • 5
RECODE-H: A Benchmark for Research Code Development with Interactive Human Feedback Paper • 2510.06186 • Published Oct 7
AlphaResearch: Accelerating New Algorithm Discovery with Language Models Paper • 2511.08522 • Published 30 days ago • 15
Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents Paper • 2310.19923 • Published Oct 30, 2023 • 14
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8 • 192
arcee-ai/cleaned-mlabonne-distilabel-truthy-dpo-v0.1-filtered Viewer • Updated Jun 18, 2024 • 663 • 24
Skywork-R1V4: Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch Paper • 2512.02395 • Published 9 days ago • 45
Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs Paper • 2511.19773 • Published 16 days ago • 9
ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use Paper • 2510.27363 • Published Oct 31 • 22
Ariadne: A Controllable Framework for Probing and Extending VLM Reasoning Boundaries Paper • 2511.00710 • Published Nov 1 • 4
VLA-R1: Enhancing Reasoning in Vision-Language-Action Models Paper • 2510.01623 • Published Oct 2 • 10
DeepMMSearch-R1: Empowering Multimodal LLMs in Multimodal Web Search Paper • 2510.12801 • Published Oct 14 • 13
DeepAgent: A General Reasoning Agent with Scalable Toolsets Paper • 2510.21618 • Published Oct 24 • 99
Open Multimodal Retrieval-Augmented Factual Image Generation Paper • 2510.22521 • Published Oct 26 • 30
TeichAI/gemini-3-pro-preview-high-reasoning-1000x Viewer • Updated about 24 hours ago • 1.02k • 487 • 11