43,151 research outputs found
Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation
The control of nonlinear dynamical systems remains a major challenge for
autonomous agents. Current trends in reinforcement learning (RL) focus on
complex representations of dynamics and policies, which have yielded impressive
results in solving a variety of hard control tasks. However, this new
sophistication and extremely over-parameterized models have come with the cost
of an overall reduction in our ability to interpret the resulting policies. In
this paper, we take inspiration from the control community and apply the
principles of hybrid switching systems in order to break down complex dynamics
into simpler components. We exploit the rich representational power of
probabilistic graphical models and derive an expectation-maximization (EM)
algorithm for learning a sequence model to capture the temporal structure of
the data and automatically decompose nonlinear dynamics into stochastic
switching linear dynamical systems. Moreover, we show how this framework of
switching models enables extracting hierarchies of Markovian and
auto-regressive locally linear controllers from nonlinear experts in an
imitation learning scenario.Comment: 2nd Annual Conference on Learning for Dynamics and Contro
Advances in Learning Bayesian Networks of Bounded Treewidth
This work presents novel algorithms for learning Bayesian network structures
with bounded treewidth. Both exact and approximate methods are developed. The
exact method combines mixed-integer linear programming formulations for
structure learning and treewidth computation. The approximate method consists
in uniformly sampling -trees (maximal graphs of treewidth ), and
subsequently selecting, exactly or approximately, the best structure whose
moral graph is a subgraph of that -tree. Some properties of these methods
are discussed and proven. The approaches are empirically compared to each other
and to a state-of-the-art method for learning bounded treewidth structures on a
collection of public data sets with up to 100 variables. The experiments show
that our exact algorithm outperforms the state of the art, and that the
approximate approach is fairly accurate.Comment: 23 pages, 2 figures, 3 table
Optimal treatment allocations in space and time for on-line control of an emerging infectious disease
A key component in controlling the spread of an epidemic is deciding where, whenand to whom to apply an intervention.We develop a framework for using data to informthese decisionsin realtime.We formalize a treatment allocation strategy as a sequence of functions, oneper treatment period, that map up-to-date information on the spread of an infectious diseaseto a subset of locations where treatment should be allocated. An optimal allocation strategyoptimizes some cumulative outcome, e.g. the number of uninfected locations, the geographicfootprint of the disease or the cost of the epidemic. Estimation of an optimal allocation strategyfor an emerging infectious disease is challenging because spatial proximity induces interferencebetween locations, the number of possible allocations is exponential in the number oflocations, and because disease dynamics and intervention effectiveness are unknown at outbreak.We derive a Bayesian on-line estimator of the optimal allocation strategy that combinessimulation–optimization with Thompson sampling.The estimator proposed performs favourablyin simulation experiments. This work is motivated by and illustrated using data on the spread ofwhite nose syndrome, which is a highly fatal infectious disease devastating bat populations inNorth America
- …