
MRRC: Multi-Agent Reinforcement Learning with Rectification Capability in Cooperative Tasks

EasyChair Preprint no. 11117

15 pages · Date: October 23, 2023

Abstract

Building on the centralised training with decentralised execution (CTDE) paradigm, multi-agent reinforcement learning (MARL) algorithms have made significant strides in addressing cooperative tasks. However, the challenges of sparse environmental rewards and limited scalability have impeded further advancements in MARL. In response, MRRC, a novel actor-critic-based approach, is proposed. MRRC tackles the sparse reward problem by equipping each agent with both an individual policy and a cooperative policy, harnessing the benefits of the individual policy's rapid convergence and the cooperative policy's global optimality. To enhance scalability, MRRC employs a monotonic mix network to rectify the value function Q of each agent, yielding the joint value function Q_tot that facilitates global updates of the entire critic network. Additionally, the Gumbel-Softmax technique is introduced to rectify discrete actions, enabling MRRC to handle discrete tasks effectively. MRRC is compared with advanced baseline algorithms in the "Predator-Prey" and challenging "SMAC" environments, and ablation experiments are conducted. The experimental results demonstrate the superior performance of MRRC, its efficacy in reward-sparse environments, and its ability to scale well with increasing numbers of agents.
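The two rectification mechanisms named in the abstract are common MARL building blocks, so a brief illustration may help. The sketch below is not the authors' code: the class names, network sizes, and the QMIX-style hypernetwork layout are assumptions. It shows how a monotonic mixing network can combine per-agent values Q_i into a joint Q_tot using non-negative, state-conditioned weights (so that dQ_tot/dQ_i >= 0), and how Gumbel-Softmax turns policy logits into differentiable discrete actions.

# Minimal PyTorch sketch of a monotonic mixer and Gumbel-Softmax action
# rectification; sizes and names are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotonicMixer(nn.Module):
    """QMIX-style mixer: mixing weights are generated from the global state
    and made non-negative, so Q_tot is monotonic in every agent's Q_i."""

    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 32):
        super().__init__()
        self.n_agents = n_agents
        self.embed_dim = embed_dim
        # Hypernetworks: produce mixing weights/biases conditioned on the state.
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Sequential(nn.Linear(state_dim, embed_dim),
                                      nn.ReLU(),
                                      nn.Linear(embed_dim, 1))

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_qs: (batch, n_agents), state: (batch, state_dim)
        bs = agent_qs.size(0)
        # abs() enforces non-negative weights, which gives the monotonicity.
        w1 = torch.abs(self.hyper_w1(state)).view(bs, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(bs, 1, self.embed_dim)
        hidden = F.elu(torch.bmm(agent_qs.unsqueeze(1), w1) + b1)   # (bs, 1, embed)
        w2 = torch.abs(self.hyper_w2(state)).view(bs, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(bs, 1, 1)
        q_tot = torch.bmm(hidden, w2) + b2                          # (bs, 1, 1)
        return q_tot.view(bs, 1)

def gumbel_softmax_action(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Straight-through Gumbel-Softmax: returns a one-hot discrete action
    while keeping gradients flowing to the policy logits."""
    return F.gumbel_softmax(logits, tau=tau, hard=True)

For example, with 3 agents and a 10-dimensional global state, MonotonicMixer(3, 10) maps a (batch, 3) tensor of per-agent values and a (batch, 10) state tensor to a (batch, 1) Q_tot, which can then drive a single global critic update.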

Keyphrases: cooperative task, individual reward rectification, monotonic mix function, multi-agent reinforcement learning

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@Booklet{EasyChair:11117,
  author = {Sheng Yu and Wei Zhu and Shuhong Liu and Zhengwen Gong and Haoran Chen},
  title = {MRRC: Multi-Agent Reinforcement Learning with Rectification Capability in Cooperative Tasks},
  howpublished = {EasyChair Preprint no. 11117},
  year = {EasyChair, 2023}}