Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO Paper • 2511.13288 • Published 29 days ago • 17
GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning Paper • 2511.11653 • Published Nov 10 • 55
MedResearcher-R1: Expert-Level Medical Deep Researcher via A Knowledge-Informed Trajectory Synthesis Framework Paper • 2508.14880 • Published Aug 20 • 15