    Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems.

    In the framework of fully cooperative multi-agent systems, independent (non-communicative) agents that learn by reinforcement must overcome several difficulties in order to coordinate. This paper identifies several challenges responsible for the non-coordination of independent agents: Pareto-selection, non-stationarity, stochasticity, alter-exploration and shadowed equilibria. A selection of multi-agent domains is classified according to these challenges: matrix games, Boutilier's coordination game, predator pursuit domains and a special multi-state game. Moreover, the performance of a range of algorithms for independent reinforcement learners is evaluated empirically. These algorithms are Q-learning variants: decentralized Q-learning, distributed Q-learning, hysteretic Q-learning, recursive FMQ and WoLF-PHC. An overview of the learning algorithms' strengths and weaknesses against each challenge concludes the paper and can serve as a basis for choosing the appropriate algorithm for a new domain. Furthermore, the distilled challenges may assist in the design of new learning algorithms that overcome these problems and achieve higher performance in multi-agent applications.
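    Of the Q-learning variants surveyed above, hysteretic Q-learning has the simplest core idea: two learning rates, with negative temporal-difference errors applied more cautiously than positive ones, so an independent learner is not misled when a teammate's exploratory action spoils an otherwise good joint action. A minimal sketch (function and parameter names are my own, not the survey's):

    ```python
    from collections import defaultdict

    def hysteretic_q_update(Q, s, a, r, s_next, actions,
                            alpha=0.1, beta=0.01, gamma=0.95):
        """One hysteretic Q-learning update with two learning rates.

        Positive TD errors are applied with the larger rate alpha,
        negative ones with the smaller rate beta (beta < alpha), which
        makes the learner optimistic about its teammates' behaviour.
        """
        target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
        delta = target - Q[(s, a)]
        rate = alpha if delta >= 0 else beta
        Q[(s, a)] += rate * delta
        return Q[(s, a)]
    ```

    With beta = alpha this reduces to ordinary decentralized Q-learning; with beta = 0 it approaches distributed Q-learning, which never decreases a value estimate.
    
    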

    STRATEGY MANAGEMENT IN A MULTI-AGENT SYSTEM USING NEURAL NETWORKS FOR INDUCTIVE AND EXPERIENCE-BASED LEARNING

    Intelligent agents and multi-agent systems are a promising paradigm for solving problems in a distributed, cooperative way. Neural networks are a classical solution for ensuring the learning ability of agents. In this paper, we analyse a multi-agent system where agents use different training algorithms and different topologies for their neural networks, which they use to solve classification and regression problems provided by a user. Of the three training algorithms under investigation, Backpropagation, Quickprop and Rprop, the first demonstrates inferior performance to the other two when considered in isolation. However, by optimizing the strategy of accepting or rejecting tasks, Backpropagation agents succeed in outperforming the other types of agents in terms of the total utility gained. This strategy is also learned with a neural network, by processing the results of past experiences. We thereby show a way in which agents can use neural network models for both external purposes and internal ones.
    Keywords: agents, learning, neural networks, strategy management, multi-agent systems.
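    The accept-or-reject strategy described above can be illustrated with a toy model. The sketch below is a hypothetical reconstruction, not the paper's design: a linear predictor stands in for the agents' neural network, is fitted online on past (task features, realized utility) pairs, and a task is accepted only if its predicted utility is positive.

    ```python
    class TaskAcceptanceStrategy:
        """Experience-based task acceptance (illustrative sketch only;
        the feature set and model are assumptions, not the paper's)."""

        def __init__(self, n_features, lr=0.05):
            self.w = [0.0] * n_features  # linear stand-in for the NN
            self.b = 0.0
            self.lr = lr

        def predict_utility(self, features):
            return sum(w * x for w, x in zip(self.w, features)) + self.b

        def record_experience(self, features, realized_utility):
            # One SGD step on the squared error between the predicted
            # and the realized utility of a completed task.
            err = self.predict_utility(features) - realized_utility
            self.w = [w - self.lr * err * x
                      for w, x in zip(self.w, features)]
            self.b -= self.lr * err

        def accept(self, features):
            # Take the task only if it is expected to pay off.
            return self.predict_utility(features) > 0.0
    ```
    
    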

    Multi-task Deep Reinforcement Learning with PopArt

    The reinforcement learning community has made great strides in designing algorithms capable of exceeding human performance on specific tasks. These algorithms are mostly trained on one task at a time, with each new task requiring the training of a brand-new agent instance. The learning algorithm is therefore general, but each solution is not; each agent can only solve the one task it was trained on. In this work, we study the problem of learning to master not one but multiple sequential-decision tasks at once. A general issue in multi-task learning is that a balance must be found between the needs of multiple tasks competing for the limited resources of a single learning system. Many learning algorithms can get distracted by certain tasks in the set of tasks to solve. Such tasks appear more salient to the learning process, for instance because of the density or magnitude of the in-task rewards. This causes the algorithm to focus on those salient tasks at the expense of generality. We propose to automatically adapt the contribution of each task to the agent's updates, so that all tasks have a similar impact on the learning dynamics. This resulted in state-of-the-art performance on learning to play all games in a set of 57 diverse Atari games. Excitingly, our method learned a single trained policy - with a single set of weights - that exceeds median human performance. To our knowledge, this was the first time a single agent surpassed human-level performance on this multi-task domain. The same approach also demonstrated state-of-the-art performance on a set of 30 tasks in the 3D reinforcement learning platform DeepMind Lab.
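    The adaptive rescaling the abstract refers to is PopArt normalization: value targets are normalized by running statistics, and the value head is rescaled whenever those statistics change so that its unnormalized outputs are preserved. A minimal scalar sketch under simplifying assumptions (a single linear value head and exponential moving-average statistics), not the authors' implementation:

    ```python
    import math

    class PopArt:
        """Sketch of PopArt-style adaptive target normalization.

        A linear head v(x) = w*x + b predicts normalized returns; when
        the running mean/scale (mu, sigma) of the targets change, w and
        b are rescaled so the unnormalized output sigma*v(x) + mu is
        exactly preserved ("Preserving Outputs Precisely").
        """

        def __init__(self, step=0.01):
            self.mu, self.nu = 0.0, 1.0  # running 1st/2nd moments
            self.step = step
            self.w, self.b = 1.0, 0.0    # scalar linear value head

        @property
        def sigma(self):
            return math.sqrt(max(self.nu - self.mu ** 2, 1e-8))

        def update_stats(self, target):
            old_mu, old_sigma = self.mu, self.sigma
            self.mu += self.step * (target - self.mu)
            self.nu += self.step * (target ** 2 - self.nu)
            # Rescale the head so unnormalized outputs are unchanged.
            self.w *= old_sigma / self.sigma
            self.b = (old_sigma * self.b + old_mu - self.mu) / self.sigma

        def unnormalized(self, x):
            return self.sigma * (self.w * x + self.b) + self.mu
    ```

    Because each task's targets are normalized to a similar scale before computing gradients, no single game's large or dense rewards can dominate the shared agent's updates.
    
    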