Leveraging Statistical Multi-Agent Online Planning with Emergent Value Function Approximation
Making decisions is a great challenge in distributed autonomous environments
due to enormous state spaces and uncertainty. Many online planning algorithms
rely on statistical sampling to avoid searching the whole state space, while
still being able to make acceptable decisions. However, planning often has to
be performed under strict computational constraints, which severely limits
online planning in multi-agent systems and can lead to poor system
performance, especially in stochastic domains. In this paper, we propose
Emergent Value function Approximation for Distributed Environments (EVADE), an
approach to integrate global experience into multi-agent online planning in
stochastic domains to consider global effects during local planning. For this
purpose, a value function is approximated online based on the emergent system
behaviour by using methods of reinforcement learning. We empirically evaluated
EVADE with two statistical multi-agent online planning algorithms in a highly
complex and stochastic smart factory environment, where multiple agents need to
process various items at a shared set of machines. Our experiments show that
EVADE can effectively improve the performance of multi-agent online planning
while offering efficiency w.r.t. the breadth and depth of the planning process.
Comment: Accepted at AAMAS 201
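The abstract describes the general idea of combining depth-limited statistical planning with a value function learned online. The following is a minimal single-agent sketch of that general idea only, not the EVADE algorithm itself, whose details are not given here. It assumes a linear value function trained with TD(0) and random rollouts whose leaf states are scored by that value function; all names (`LinearValue`, `rollout_return`, `plan`, `simulate`, `featurize`) are hypothetical.

```python
import random


class LinearValue:
    """Linear value estimate V(s) = w . phi(s), trained online with TD(0)."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.lr = lr

    def predict(self, features):
        return sum(w * f for w, f in zip(self.w, features))

    def td_update(self, features, reward, next_features, gamma):
        # One TD(0) step: move V(s) towards reward + gamma * V(s').
        target = reward + gamma * self.predict(next_features)
        error = target - self.predict(features)
        self.w = [w + self.lr * error * f for w, f in zip(self.w, features)]
        return error


def rollout_return(state, first_action, simulate, actions, featurize,
                   value, depth, gamma, rng):
    """Discounted return of one random rollout of bounded depth,
    plus the discounted value estimate of the leaf state."""
    total, discount, action = 0.0, 1.0, first_action
    for _ in range(depth):
        state, reward = simulate(state, action, rng)
        total += discount * reward
        discount *= gamma
        action = rng.choice(actions)
    # Bootstrapping at the leaf stands in for the unexplored remainder.
    return total + discount * value.predict(featurize(state))


def plan(state, simulate, actions, featurize, value,
         depth=2, n_rollouts=20, gamma=0.95, rng=None):
    """Pick the first action with the best average rollout return."""
    if rng is None:
        rng = random.Random(0)

    def score(action):
        returns = [rollout_return(state, action, simulate, actions,
                                  featurize, value, depth, gamma, rng)
                   for _ in range(n_rollouts)]
        return sum(returns) / n_rollouts

    return max(actions, key=score)
```

In this sketch the learned value estimate replaces deeper search, which is one way a planner could keep both the depth and the number of rollouts small under a fixed computation budget.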