21,193 research outputs found
Sequential Gaussian Processes for Online Learning of Nonstationary Functions
Many machine learning problems can be framed in the context of estimating
functions, and often these are time-dependent functions that are estimated in
real-time as observations arrive. Gaussian processes (GPs) are an attractive
choice for modeling real-valued nonlinear functions due to their flexibility
and uncertainty quantification. However, the typical GP regression model
suffers from several drawbacks: i) Conventional GP inference scales
with respect to the number of observations; ii) updating a GP model
sequentially is not trivial; and iii) covariance kernels often enforce
stationarity constraints on the function, while GPs with non-stationary
covariance kernels are often intractable to use in practice. To overcome these
issues, we propose an online sequential Monte Carlo algorithm to fit mixtures
of GPs that capture non-stationary behavior while allowing for fast,
distributed inference. By formulating hyperparameter optimization as a
multi-armed bandit problem, we accelerate mixing for real time inference. Our
approach empirically improves performance over state-of-the-art methods for
online GP estimation in the context of prediction for simulated non-stationary
data and hospital time series data
Lifelong Generative Modeling
Lifelong learning is the problem of learning multiple consecutive tasks in a
sequential manner, where knowledge gained from previous tasks is retained and
used to aid future learning over the lifetime of the learner. It is essential
towards the development of intelligent machines that can adapt to their
surroundings. In this work we focus on a lifelong learning approach to
unsupervised generative modeling, where we continuously incorporate newly
observed distributions into a learned model. We do so through a student-teacher
Variational Autoencoder architecture which allows us to learn and preserve all
the distributions seen so far, without the need to retain the past data nor the
past models. Through the introduction of a novel cross-model regularizer,
inspired by a Bayesian update rule, the student model leverages the information
learned by the teacher, which acts as a probabilistic knowledge store. The
regularizer reduces the effect of catastrophic interference that appears when
we learn over sequences of distributions. We validate our model's performance
on sequential variants of MNIST, FashionMNIST, PermutedMNIST, SVHN and Celeb-A
and demonstrate that our model mitigates the effects of catastrophic
interference faced by neural networks in sequential learning scenarios.Comment: 32 page
Basic Enhancement Strategies When Using Bayesian Optimization for Hyperparameter Tuning of Deep Neural Networks
Compared to the traditional machine learning models, deep neural networks (DNN) are known to be highly sensitive to the choice of hyperparameters. While the required time and effort for manual tuning has been rapidly decreasing for the well developed and commonly used DNN architectures, undoubtedly DNN hyperparameter optimization will continue to be a major burden whenever a new DNN architecture needs to be designed, a new task needs to be solved, a new dataset needs to be addressed, or an existing DNN needs to be improved further. For hyperparameter optimization of general machine learning problems, numerous automated solutions have been developed where some of the most popular solutions are based on Bayesian Optimization (BO). In this work, we analyze four fundamental strategies for enhancing BO when it is used for DNN hyperparameter optimization. Specifically, diversification, early termination, parallelization, and cost function transformation are investigated. Based on the analysis, we provide a simple yet robust algorithm for DNN hyperparameter optimization - DEEP-BO (Diversified, Early-termination-Enabled, and Parallel Bayesian Optimization). When evaluated over six DNN benchmarks, DEEP-BO mostly outperformed well-known solutions including GP-Hedge, BOHB, and the speed-up variants that use Median Stopping Rule or Learning Curve Extrapolation. In fact, DEEP-BO consistently provided the top, or at least close to the top, performance over all the benchmark types that we have tested. This indicates that DEEP-BO is a robust solution compared to the existing solutions. The DEEP-BO code is publicly available at <uri>https://github.com/snu-adsl/DEEP-BO</uri>
The role of learning on industrial simulation design and analysis
The capability of modeling real-world system operations has turned simulation into an indispensable problemsolving methodology for business system design and analysis. Today, simulation supports decisions ranging
from sourcing to operations to finance, starting at the strategic level and proceeding towards tactical and
operational levels of decision-making. In such a dynamic setting, the practice of simulation goes beyond
being a static problem-solving exercise and requires integration with learning. This article discusses the role
of learning in simulation design and analysis motivated by the needs of industrial problems and describes
how selected tools of statistical learning can be utilized for this purpose
Online Domain Adaptation for Multi-Object Tracking
Automatically detecting, labeling, and tracking objects in videos depends
first and foremost on accurate category-level object detectors. These might,
however, not always be available in practice, as acquiring high-quality large
scale labeled training datasets is either too costly or impractical for all
possible real-world application scenarios. A scalable solution consists in
re-using object detectors pre-trained on generic datasets. This work is the
first to investigate the problem of on-line domain adaptation of object
detectors for causal multi-object tracking (MOT). We propose to alleviate the
dataset bias by adapting detectors from category to instances, and back: (i) we
jointly learn all target models by adapting them from the pre-trained one, and
(ii) we also adapt the pre-trained model on-line. We introduce an on-line
multi-task learning algorithm to efficiently share parameters and reduce drift,
while gradually improving recall. Our approach is applicable to any linear
object detector, and we evaluate both cheap "mini-Fisher Vectors" and expensive
"off-the-shelf" ConvNet features. We quantitatively measure the benefit of our
domain adaptation strategy on the KITTI tracking benchmark and on a new dataset
(PASCAL-to-KITTI) we introduce to study the domain mismatch problem in MOT.Comment: To appear at BMVC 201
- …