arxiv:2604.14518

Mind DeepResearch Technical Report

Published on Apr 17

· Submitted by

Biao Wang on Apr 20

LiAuto

Authors:

Li Auto Inc

Abstract

MindDR is an efficient multi-agent deep research framework that achieves high performance through a collaborative three-agent architecture and a specialized four-stage training pipeline, demonstrating strong results on multiple benchmarks.

AI-generated summary

We present Mind DeepResearch (MindDR), an efficient multi-agent deep research framework that achieves leading performance with only ~30B-parameter models through a meticulously designed data synthesis and multi-stage training pipeline. The core innovation of MindDR lies in a collaborative three-agent architecture (Planning Agent, DeepSearch Agent, and Report Agent) and a four-stage agent-specialized training pipeline comprising SFT cold-start, Search-RL, Report-RL and preference alignment. With this regime, MindDR demonstrates competitive performance even with ~30B-scale models. Specifically, MindDR achieves 45.7% on BrowseComp-ZH, 42.8% on BrowseComp, 46.5% on WideSearch, 75.0% on xbench-DS, and 52.5 on DeepResearch Bench, outperforming comparable-scale open-source agent systems and rivaling larger-scale models. MindDR has been deployed as an online product in Li Auto. Furthermore, we introduce MindDR Bench, a curated benchmark of 500 real-world Chinese queries from our internal product user interactions, evaluated through a comprehensive multi-dimensional rubric system rather than relying on a single RACE metric. On MindDR Bench, MindDR achieves a state-of-the-art score of 51.8.
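To make the three-agent division of labor concrete, here is a minimal sketch of a plan, search, report flow. It is not MindDR's implementation: the function names, the `Evidence` type, the naive "split on `and`" planner, and the `fake_search` stub are all hypothetical stand-ins for model-backed agents and a real search tool, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Evidence:
    """One sub-question together with whatever the search step found for it."""
    sub_question: str
    findings: str


def planning_agent(query: str) -> List[str]:
    """Hypothetical planner: split the query into sub-questions.

    A real planner would be a trained model; this stub just splits on ' and '.
    """
    parts = [p.strip() for p in query.split(" and ") if p.strip()]
    return parts or [query]


def deepsearch_agent(sub_question: str, search: Callable[[str], str]) -> Evidence:
    """Hypothetical searcher: answer one sub-question with the supplied tool."""
    return Evidence(sub_question, search(sub_question))


def report_agent(query: str, evidence: List[Evidence]) -> str:
    """Hypothetical reporter: assemble the collected evidence into a report."""
    lines = [f"Report: {query}"]
    lines += [f"- {e.sub_question}: {e.findings}" for e in evidence]
    return "\n".join(lines)


def run_pipeline(query: str, search: Callable[[str], str]) -> str:
    """Chain the three roles: plan, search each sub-question, then report."""
    plan = planning_agent(query)
    evidence = [deepsearch_agent(q, search) for q in plan]
    return report_agent(query, evidence)


def fake_search(q: str) -> str:
    """Stub search tool (assumption); returns a placeholder string."""
    return f"stub result for '{q}'"


if __name__ == "__main__":
    print(run_pipeline("battery cost and charging speed", fake_search))
```

Keeping each role behind its own function boundary is what would let each agent be trained or swapped separately, which is the property the four-stage, agent-specialized pipeline relies on.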

Community

JustinWang824

Paper author Paper submitter 5 days ago

Mind Deep Research (MindDR) is an efficient multi-agent framework that achieves high performance on deep search and deep research tasks at relatively low cost. It breaks end-to-end RL training down into a multi-stage pipeline of Search-RL, Report-RL, and preference alignment for better efficiency and training stability. Check it out for details!

avahal

3 days ago

minddr's triad of Planning, DeepSearch, and Report agents is what lets a 30B model sustain a long-horizon research flow instead of acting like a brittle prompt chain. the four-stage curriculum (sft cold-start, search-rl, report-rl, and preference alignment) feels like a pragmatic antidote to end-to-end rl headaches, but i'm curious how dependent the gains are on the knowledge graph quality and the synthetic data mix. would love to see an ablation where you remove the planning agent or clamp its influence, to test whether most of the lift comes from task decomposition or from the rl signals themselves. btw, the arxivlens breakdown helped me parse the method details; it's a solid walkthrough of the multi-agent coordination and the extended chain-of-thought side: https://arxivlens.com/PaperView/Details/mind-deepresearch-technical-report-6253-93b815b6 overall, the cost-aware modular design that still competes with larger models is compelling. curious how this approach transfers to non-chinese domains or more dynamic research tasks.