arxiv:2604.14518

Mind DeepResearch Technical Report

Published on Apr 17 · Submitted by Biao Wang on Apr 20

Abstract

MindDR is an efficient multi-agent deep research framework that achieves high performance through a collaborative three-agent architecture and specialized four-stage training pipeline, demonstrating strong results on multiple benchmarks.

AI-generated summary

We present Mind DeepResearch (MindDR), an efficient multi-agent deep research framework that achieves leading performance with only ~30B-parameter models through a meticulously designed data synthesis and multi-stage training pipeline. The core innovation of MindDR lies in a collaborative three-agent architecture (Planning Agent, DeepSearch Agent, and Report Agent) and a four-stage agent-specialized training pipeline comprising SFT cold-start, Search-RL, Report-RL and preference alignment. With this regime, MindDR demonstrates competitive performance even with ~30B-scale models. Specifically, MindDR achieves 45.7% on BrowseComp-ZH, 42.8% on BrowseComp, 46.5% on WideSearch, 75.0% on xbench-DS, and 52.5 on DeepResearch Bench, outperforming comparable-scale open-source agent systems and rivaling larger-scale models. MindDR has been deployed as an online product at Li Auto. Furthermore, we introduce MindDR Bench, a curated benchmark of 500 real-world Chinese queries from our internal product user interactions, evaluated through a comprehensive multi-dimensional rubric system rather than relying on a single RACE metric. On MindDR Bench, MindDR achieves a state-of-the-art score of 51.8.
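
To make the division of labor among the three agents concrete, here is a minimal Python sketch of one way a Planning, DeepSearch, and Report flow could be wired together. It is an illustration under stated assumptions, not the authors' implementation: the abstract names the three agents and the four training stages but does not specify how they exchange data. The names `ResearchState`, `planning_agent`, `deepsearch_agent`, `report_agent`, `run_pipeline`, and the `search` callable are all hypothetical, and each agent is a trivial stand-in for what would be a trained ~30B model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical search tool signature: query in, list of text snippets out.
SearchFn = Callable[[str], List[str]]

# Stage names as listed in the abstract; the ordering is the one given there.
TRAINING_STAGES = ("SFT cold-start", "Search-RL", "Report-RL", "preference alignment")


@dataclass
class ResearchState:
    """Assumed shared state passed between the three agents."""
    query: str
    sub_questions: List[str] = field(default_factory=list)
    evidence: Dict[str, List[str]] = field(default_factory=dict)
    report: str = ""


def planning_agent(query: str) -> List[str]:
    """Toy planner: split the query on ' and ' into sub-questions.

    A real Planning Agent would be a trained model; this is a placeholder.
    """
    parts = [p.strip() for p in query.split(" and ") if p.strip()]
    return parts or [query]


def deepsearch_agent(sub_question: str, search: SearchFn) -> List[str]:
    """Toy searcher: a single tool call per sub-question."""
    return search(sub_question)


def report_agent(state: ResearchState) -> str:
    """Toy reporter: one bullet per sub-question with its evidence."""
    lines = [f"Report: {state.query}"]
    for q in state.sub_questions:
        lines.append(f"- {q}: " + "; ".join(state.evidence.get(q, [])))
    return "\n".join(lines)


def run_pipeline(query: str, search: SearchFn) -> ResearchState:
    """Plan, then search each sub-question, then write the report."""
    state = ResearchState(query=query)
    state.sub_questions = planning_agent(query)
    for q in state.sub_questions:
        state.evidence[q] = deepsearch_agent(q, search)
    state.report = report_agent(state)
    return state
```

Calling `run_pipeline("battery range and charging speed", lambda q: [f"doc about {q}"])` with this stub search function yields two sub-questions and a three-line report. Keeping the agents as separate functions over a shared state mirrors the modular design the abstract describes, where each role can be trained in its own stage.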

Community

Paper author Paper submitter

Mind DeepResearch (MindDR) is an efficient multi-agent framework that achieves high performance on deep search and deep research tasks at relatively low cost. It breaks end-to-end RL training down into a multi-stage pipeline of Search-RL, Report-RL, and preference alignment for better efficiency and training stability. Check it out for details!

minddr's triad of Planning, DeepSearch, and Report agents is the part that makes a 30B model behave with long-horizon research flow instead of a brittle prompt chain. the four-stage curriculum—sft cold-start, search-rl, report-rl, and preference alignment—feels like a pragmatic antidote to end-to-end rl headaches, but i’m curious how dependent the gains are on the knowledge graph quality and the synthetic data mix. would love to see an ablation where you remove the planning agent or clamp its influence to test whether most of the lift comes from task decomposition versus the rl signals themselves. btw, the arxivlens breakdown helped me parse the method details, a solid walkthrough that covers the multi-agent coordination and the extended chain-of-thought vibe, https://arxivlens.com/PaperView/Details/mind-deepresearch-technical-report-6253-93b815b6. overall, the cost-aware modular design that still competes with larger models is compelling, curious how this approach transfers to non-chinese domains or more dynamic research tasks.

Get this paper in your agent:

hf papers read 2604.14518
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash
