Jorge Cortés
Professor
Cymer Corporation Endowed Chair
Exploiting bias for cooperative planning in multi-agent tree search
A. Ma, M. Ouimet, J. Cortés
IEEE Robotics and Automation Letters 5 (2) (2020), 1819-1826
Abstract
Graph search over states and actions is a valuable tool
for robotic planning and navigation. However, the
required computation is sensitive to the size of the
state and action spaces, a problem that is further
exacerbated in multi-agent planning by the number of
agents and the presence of sparse reward signals
that depend on the cooperation of agents. To tackle these
problems, we introduce an algorithm that is pre-trained
in a centralized fashion but implemented on robots in a
distributed way at runtime. The centralized portion uses
imitation learning to iteratively construct policies
that help guide an individual agent's own runtime search
as well as predict other agents' future actions by
exploiting previously discovered joint actions. Our
algorithm includes a novel method of tree search based
on a mixture of the individual and joint action space,
which can be interpreted as a cascading effect where
agents are biased by exploration of new actions,
exploitation of previously profitable ones, and
recommendations provided by deep neural networks. Simulations
show the efficacy of the proposed method in cooperative
scenarios with sparse rewards.
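
The three sources of bias named in the abstract (exploration, exploitation, and neural-network recommendation) resemble the terms of a PUCT-style selection rule commonly used in neural-guided tree search. The sketch below is a generic single-agent illustration of such a rule, not the algorithm from the paper; the function names, the `c_explore` constant, and the example numbers are all assumptions made for illustration.

```python
import math


def puct_scores(q_values, visit_counts, priors, c_explore=1.0):
    """Score each action at one tree node (hypothetical helper).

    q_values:     mean return observed so far per action (exploitation)
    visit_counts: number of times each action was tried (drives exploration)
    priors:       policy-network probabilities per action (recommendation)
    c_explore:    assumed constant weighting the bonus term
    """
    total_visits = sum(visit_counts)
    scores = []
    for q, n, p in zip(q_values, visit_counts, priors):
        # Bonus is large for actions the prior recommends and that were rarely visited.
        bonus = c_explore * p * math.sqrt(total_visits) / (1 + n)
        scores.append(q + bonus)
    return scores


def select_action(q_values, visit_counts, priors, c_explore=1.0):
    """Return the index of the highest-scoring action."""
    scores = puct_scores(q_values, visit_counts, priors, c_explore)
    return max(range(len(scores)), key=scores.__getitem__)


# Example with made-up statistics for three actions.
q = [0.5, 0.2, 0.0]
n = [10, 1, 0]
prior = [0.2, 0.3, 0.5]
chosen = select_action(q, n, prior)
greedy = select_action(q, n, prior, c_explore=0.0)
```

With these made-up numbers, the unvisited action that the prior favors receives the largest bonus, whereas setting `c_explore=0.0` reduces the rule to pure exploitation of the best mean return.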
Mechanical and Aerospace Engineering,
University of California, San Diego
9500 Gilman Dr,
La Jolla, California, 92093-0411
Ph: 1-858-822-7930
Fax: 1-858-822-3107
cortes at ucsd.edu
Skype id: jorgilliyo