Coordinated Multi-Agent Imitation Learning
We study the problem of imitation learning from demonstrations of multiple
coordinating agents. One key challenge in this setting is that learning a good
model of coordination can be difficult, since coordination is often implicit in
the demonstrations and must be inferred as a latent variable. We propose a
joint approach that simultaneously learns a latent coordination model along
with the individual policies. In particular, our method integrates unsupervised
structure learning with conventional imitation learning. We illustrate the
power of our approach on a difficult problem of learning multiple policies for
fine-grained behavior modeling in team sports, where different players occupy
different roles in the coordinated team strategy. We show that having a
coordination model to infer the roles of players yields substantially improved
imitation loss compared to conventional baselines.
Comment: International Conference on Machine Learning 201
Master-slave Deep Architecture for Top-K Multi-armed Bandits with Non-linear Bandit Feedback and Diversity Constraints
We propose a novel master-slave architecture to solve the top-K
combinatorial multi-armed bandits problem with non-linear bandit feedback and
diversity constraints, which, to the best of our knowledge, is the first
combinatorial bandits setting considering diversity constraints under bandit
feedback. Specifically, to efficiently explore the combinatorial and
constrained action space, we introduce six slave models with distinct
strengths that generate diversified samples balancing rewards, constraints,
and efficiency. Moreover, we propose teacher-learning-based optimization
and a policy co-training technique to boost the performance of the multiple
slave models. The master model then collects the elite samples provided by the
slave models and selects the best sample estimated by a neural contextual
UCB-based network to make a decision with a trade-off between exploration and
exploitation. Thanks to the elaborate design of slave models, the co-training
mechanism among slave models, and the novel interactions between the master and
slave models, our approach significantly surpasses existing state-of-the-art
algorithms on both synthetic and real datasets for recommendation tasks. The
code is available at:
https://github.com/huanghanchi/Master-slave-Algorithm-for-Top-K-Bandits
Comment: IEEE Transactions on Neural Networks and Learning System
Networking - A Statistical Physics Perspective
Efficient networking has a substantial economic and societal impact in a
broad range of areas including transportation systems, wired and wireless
communications and a range of Internet applications. As transportation and
communication networks become increasingly complex, the ever-increasing
demand for congestion control, higher traffic capacity, quality of service,
robustness and reduced energy consumption requires new tools and methods to meet
these conflicting requirements. The new methodology should help to gain a
better understanding of the properties of networking systems at the macroscopic
level, as well as for the development of new principled optimization and
management algorithms at the microscopic level. Methods of statistical physics
seem best placed to provide new approaches as they have been developed
specifically to deal with non-linear large scale systems. This paper aims at
presenting an overview of tools and methods that have been developed within the
statistical physics community and that can be readily applied to address the
emerging problems in networking. These include diffusion processes, methods
from disordered systems and polymer physics, and probabilistic inference, which
have direct relevance to network routing, file and frequency distribution, the
exploration of network structures and vulnerability, and various other
practical networking applications.
Comment: (Review article) 71 pages, 14 figure