Learning to Communicate: A Machine Learning Framework for Heterogeneous Multi-Agent Robotic Systems
We present a machine learning framework for multi-agent systems to learn both
the optimal policy for maximizing reward and the encoding of the
high-dimensional visual observation. The encoding is useful for sharing local
visual observations with other agents under communication resource
constraints. The actor-encoder encodes the raw images and chooses an action
based on local observations and messages sent by the other agents. The machine
learning agent generates not only an actuator command to the physical device,
but also a communication message to the other agents. We formulate a
reinforcement learning problem that extends the action space to include the
communication action. The feasibility of the reinforcement learning framework
is demonstrated using a 3D simulation environment with two collaborating
agents. The environment provides realistic visual observations to be used and
shared between the two agents.
Comment: AIAA SciTech 201
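As a rough sketch of the extended action space described above (the layer sizes, message dimension, and class names are illustrative assumptions, not the paper's architecture), an actor-encoder can map a raw image and the partner's message to both actuator-action logits and an outgoing message code:

import torch
import torch.nn as nn

class ActorEncoder(nn.Module):
    """Toy actor-encoder: one module outputs both an action and a message (assumed sizes)."""

    def __init__(self, msg_dim=8, n_actions=4):
        super().__init__()
        # Encoder: compresses the raw image into a low-dimensional code that is
        # cheap enough to share under a communication budget.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, msg_dim),
        )
        # Policy head: conditions on the agent's own code and the partner's message.
        self.policy = nn.Sequential(
            nn.Linear(2 * msg_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, image, received_msg):
        own_code = self.encoder(image)                       # encoding of the local observation
        logits = self.policy(torch.cat([own_code, received_msg], dim=-1))
        return logits, own_code                              # (actuator-action logits, message to broadcast)

# Each agent runs its own copy, exchanging only low-dimensional codes rather than raw images.
agent = ActorEncoder()
logits, msg_out = agent(torch.randn(1, 3, 64, 64), torch.zeros(1, 8))
action = torch.distributions.Categorical(logits=logits).sample()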
Game Plan: What AI can do for Football, and What Football can do for AI
The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented
analytics possibilities in various team and individual sports, including baseball, basketball, and
tennis. More recently, AI techniques have been applied to football, due to a huge increase in
data collection by professional teams, increased computational power, and advances in machine
learning, with the goal of better addressing new scientific challenges involved in the analysis of
both individual players’ and coordinated teams’ behaviors. The research challenges associated
with predictive and prescriptive football analytics require new developments and progress at the
intersection of statistical learning, game theory, and computer vision. In this paper, we provide
an overarching perspective highlighting how the combination of these fields, in particular, forms a
unique microcosm for AI research, while offering mutual benefits for professional teams, spectators,
and broadcasters in the years to come. We illustrate that this duality makes football analytics
a game changer of tremendous value, not only in terms of changing the game of football itself,
but also in terms of what this domain can mean for the field of AI. We review the state-of-the-art and exemplify the types of analysis enabled by combining the aforementioned fields, including
illustrative examples of counterfactual analysis using predictive models, and the combination of
game-theoretic analysis of penalty kicks with statistical learning of player attributes. We conclude
by highlighting envisioned downstream impacts, including possibilities for extensions to other sports
(real and virtual).
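As a small, self-contained illustration of the penalty-kick example (the scoring probabilities below are hypothetical stand-ins for quantities one would estimate from player data, not figures from the paper), the game-theoretic component reduces in its simplest form to solving a 2x2 zero-sum game for its mixed-strategy equilibrium:

import numpy as np

# Rows: kicker aims Left / Right. Columns: keeper dives Left / Right.
# Entries: probability the kick is scored (hypothetical values, not data from the paper).
P = np.array([[0.60, 0.95],
              [0.90, 0.55]])

# Indifference conditions of a 2x2 zero-sum game give the mixed equilibrium in closed form.
den = P[0, 0] - P[0, 1] - P[1, 0] + P[1, 1]
p_kicker_left = (P[1, 1] - P[1, 0]) / den   # kicker's probability of aiming left
q_keeper_left = (P[1, 1] - P[0, 1]) / den   # keeper's probability of diving left
game_value = p_kicker_left * P[0, 0] + (1 - p_kicker_left) * P[1, 0]  # same for either dive at equilibrium

print(f"kicker aims left with prob {p_kicker_left:.2f}")   # 0.50 with the numbers above
print(f"keeper dives left with prob {q_keeper_left:.2f}")  # 0.57
print(f"equilibrium scoring probability {game_value:.2f}") # 0.75

Statistical learning enters by replacing the hypothetical payoff matrix with per-player or per-situation estimates of these scoring probabilities.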
Meta-learning of Sequential Strategies
In this report we review memory-based meta-learning as a tool for building
sample-efficient strategies that learn from past experience to adapt to any
task within a target class. Our goal is to equip the reader with the conceptual
foundations of this tool for building new, scalable agents that operate on
broad domains. To do so, we present basic algorithmic templates for building
near-optimal predictors and reinforcement learners which behave as if they had
a probabilistic model that allowed them to efficiently exploit task structure.
Furthermore, we recast memory-based meta-learning within a Bayesian framework,
showing that the meta-learned strategies are near-optimal because they amortize
Bayes-filtered data, where the adaptation is implemented in the memory dynamics
as a state-machine of sufficient statistics. Essentially, memory-based
meta-learning translates the hard problem of probabilistic sequential inference
into a regression problem.
Comment: DeepMind Technical Report (15 pages, 6 figures)
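The closing point, that memory-based meta-learning turns sequential inference into regression, can be made concrete with a toy example (illustrative only, with assumed sizes, not the report's setup): train a recurrent predictor by log-loss on sequences drawn from a task distribution, and its memory state learns to track the sufficient statistics used by the Bayes-optimal predictor.

import torch
import torch.nn as nn

class MetaPredictor(nn.Module):
    """Recurrent next-observation predictor trained across tasks (coin biases)."""

    def __init__(self, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, flips):                       # flips: (batch, time, 1)
        h, _ = self.rnn(flips)
        return torch.sigmoid(self.head(h))          # predicted P(next flip = 1) at each step

model = MetaPredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):
    theta = torch.rand(64, 1, 1)                    # one coin bias per task, uniform prior
    flips = (torch.rand(64, 20, 1) < theta).float() # a sequence of flips per task
    pred = model(flips[:, :-1])                     # predict flip t+1 from flips up to t
    loss = nn.functional.binary_cross_entropy(pred, flips[:, 1:])
    opt.zero_grad(); loss.backward(); opt.step()
# After training, the predictions approach the Bayes posterior mean (heads + 1) / (t + 2):
# the memory dynamics amortize Bayesian filtering without representing the posterior explicitly.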
Bayes-Adaptive Simulation-based Search with Value Function Approximation
Bayes-adaptive planning offers a principled solution to the exploration-exploitation trade-off under model uncertainty. It finds the optimal policy in belief space, which explicitly accounts for the expected effect on future rewards of reductions in uncertainty. However, the Bayes-adaptive solution is typically intractable in domains with large or continuous state spaces. We present a tractable method for approximating the Bayes-adaptive solution by combining simulation-based search with a novel value function approximation technique that generalises over belief space. Our method outperforms prior approaches in both discrete bandit tasks and simple continuous navigation and control tasks.
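As a toy illustration of planning in belief space (assumed horizon and simulation counts; the paper's value function approximation over beliefs is omitted here), a Bayes-adaptive two-armed Bernoulli bandit can be planned by root-sampled simulations, where the state is the pair of Beta posterior counts and the belief update is part of the simulated dynamics:

import random

def simulate(counts, thetas, horizon, rng):
    """Roll out the remaining horizon in belief space under the root-sampled model."""
    counts = [list(c) for c in counts]
    total = 0.0
    for _ in range(horizon):
        # Thompson-style rollout policy over the current belief.
        arm = max(range(len(counts)),
                  key=lambda i: rng.betavariate(counts[i][0] + 1, counts[i][1] + 1))
        r = 1.0 if rng.random() < thetas[arm] else 0.0
        counts[arm][0 if r > 0 else 1] += 1          # the belief update is part of the state transition
        total += r
    return total

def bayes_adaptive_plan(counts, horizon=20, n_sims=500, seed=0):
    """Estimate each first action's belief-space value by root-sampled simulations."""
    rng = random.Random(seed)
    values = []
    for first in range(len(counts)):
        total = 0.0
        for _ in range(n_sims):
            # Root sampling: draw the unknown arm probabilities once per simulation.
            thetas = [rng.betavariate(a + 1, b + 1) for a, b in counts]
            r = 1.0 if rng.random() < thetas[first] else 0.0
            nxt = [list(c) for c in counts]
            nxt[first][0 if r > 0 else 1] += 1
            total += r + simulate(nxt, thetas, horizon - 1, rng)
        values.append(total / n_sims)
    return max(range(len(counts)), key=values.__getitem__)

# Belief state: (successes, failures) per arm under a uniform Beta(1, 1) prior.
print(bayes_adaptive_plan([(6, 2), (1, 0)]))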
Filtering variational objectives
When used as a surrogate objective for maximum likelihood estimation in latent variable models, the evidence lower bound (ELBO) produces state-of-the-art results. Inspired by this, we consider the extension of the ELBO to a family of lower bounds defined by a particle filter’s estimator of the marginal likelihood, the filtering variational objectives (FIVOs). FIVOs take the same arguments as the ELBO, but can exploit a model’s sequential structure to form tighter bounds. We present results that relate the tightness of FIVO’s bound to the variance of the particle filter’s estimator by considering the generic case of bounds defined as log-transformed likelihood estimators. Experimentally, we show that training with FIVO results in substantial improvements over training with the ELBO on sequential data.
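The construction can be sketched directly: a FIVO-style bound is the log of a particle filter's unbiased marginal-likelihood estimate, which lower-bounds the log marginal likelihood by Jensen's inequality and tightens as the estimator's variance shrinks. The toy below (a Gaussian random-walk model with a bootstrap proposal and assumed noise scales; it only evaluates the bound, with no learned proposal or gradients) illustrates the estimator:

import numpy as np

def fivo_bound(obs, n_particles=64, trans_std=1.0, obs_std=0.5, seed=0):
    """Log of a bootstrap particle filter's likelihood estimate for a toy Gaussian random-walk model."""
    rng = np.random.default_rng(seed)
    N = n_particles
    particles = np.zeros(N)                       # latent states, initialised at zero
    log_z_hat = 0.0                               # accumulates the log likelihood estimate
    for y in obs:
        # Bootstrap proposal: propagate particles through the prior transition.
        particles = particles + trans_std * rng.standard_normal(N)
        log_w = -0.5 * ((y - particles) / obs_std) ** 2 - np.log(obs_std * np.sqrt(2 * np.pi))
        log_z_hat += np.logaddexp.reduce(log_w) - np.log(N)          # log of the mean incremental weight
        w = np.exp(log_w - log_w.max())
        particles = particles[rng.choice(N, size=N, p=w / w.sum())]  # multinomial resampling
    # exp(log_z_hat) is unbiased for p(y_{1:T}), so by Jensen its log lower-bounds log p(y_{1:T}) in expectation.
    return log_z_hat

y = np.cumsum(np.random.default_rng(1).standard_normal(50))          # a toy observation sequence
print(fivo_bound(y))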