7,208 research outputs found
Well-being Forecasting using a Parametric Transfer-Learning method based on the Fisher Divergence and Hamiltonian Monte Carlo
INTRODUCTION: Traditional personalised modelling typically requires sufficient personal data for training. This is a challenge in healthcare contexts, e.g. when using smartphones to predict well-being.
OBJECTIVE: A method to produce incremental patient-specific models and forecasts even in the early stages of data collection when the data are sporadic and limited.
METHODS: We propose a parametric transfer-learning method based on the Fisher divergence, where information from other patients is injected as a prior term into a Hamiltonian Monte Carlo framework. We test our method on the NEVERMIND dataset of self-reported well-being scores.
RESULTS: Out of 54 scenarios representing varying training/forecasting lengths and competing methods, our method achieved overall best performance in 50 (92.6%) and demonstrated a significant median difference in 45 (83.3%).
CONCLUSION: The method performs favourably overall, particularly when long-term forecasts are required given short-term data.
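For readers unfamiliar with the term, the Fisher divergence between two densities is commonly written as below. This is the standard textbook definition, given here as a reminder only; the abstract does not state the exact form or normalisation the authors use (some texts include a factor of 1/2), nor how it enters their prior term.

```latex
% Standard definition of the Fisher divergence between densities p and q.
% Assumption: p and q are differentiable densities on R^d; the paper's exact
% convention (e.g. a leading factor of 1/2) is not given in the abstract.
D_F(p \,\|\, q) \;=\; \int p(x)\, \bigl\lVert \nabla_x \log p(x) - \nabla_x \log q(x) \bigr\rVert_2^2 \, dx
```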
Lifelong Generative Modeling
Lifelong learning is the problem of learning multiple consecutive tasks in a
sequential manner, where knowledge gained from previous tasks is retained and
used to aid future learning over the lifetime of the learner. It is essential
to the development of intelligent machines that can adapt to their
surroundings. In this work we focus on a lifelong learning approach to
unsupervised generative modeling, where we continuously incorporate newly
observed distributions into a learned model. We do so through a student-teacher
Variational Autoencoder architecture which allows us to learn and preserve all
the distributions seen so far, without the need to retain the past data or the
past models. Through the introduction of a novel cross-model regularizer,
inspired by a Bayesian update rule, the student model leverages the information
learned by the teacher, which acts as a probabilistic knowledge store. The
regularizer reduces the effect of catastrophic interference that appears when
we learn over sequences of distributions. We validate our model's performance
on sequential variants of MNIST, FashionMNIST, PermutedMNIST, SVHN and Celeb-A
and demonstrate that our model mitigates the effects of catastrophic
interference faced by neural networks in sequential learning scenarios.
Comment: 32 page
Analysis of error propagation in particle filters with approximation
This paper examines the impact of approximation steps that become necessary
when particle filters are implemented on resource-constrained platforms. We
consider particle filters that perform intermittent approximation, either by
subsampling the particles or by generating a parametric approximation. For such
algorithms, we derive time-uniform bounds on the weak-sense error and
present associated exponential inequalities. We motivate the theoretical
analysis by considering the leader node particle filter and present numerical
experiments exploring its performance and the relationship to the error bounds.
Comment: Published at http://dx.doi.org/10.1214/11-AAP760 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org
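As an illustration of what "intermittent approximation by subsampling" can look like, here is a minimal sketch of a bootstrap particle filter that periodically thins its particle set. Everything specific in it is an assumption made for the example: the scalar linear-Gaussian model, the function name `bootstrap_pf_with_subsampling`, and the parameters `n_sub` and `approx_every`. It is not the leader node particle filter analysed in the paper.

```python
import numpy as np

def bootstrap_pf_with_subsampling(obs, n_particles=500, n_sub=50,
                                  approx_every=5, seed=0):
    """Bootstrap particle filter with an intermittent subsampling step.

    Assumed toy model (not from the paper):
        x_t = 0.9 * x_{t-1} + N(0, 0.5^2),   y_t = x_t + N(0, 1).
    Returns the filtered posterior-mean estimate at each time step.
    """
    rng = np.random.default_rng(seed)
    particles = rng.normal(size=n_particles)  # draw from an N(0, 1) prior
    estimates = []
    for t, y in enumerate(obs, start=1):
        # Propagate each particle through the state transition.
        particles = 0.9 * particles + 0.5 * rng.normal(size=particles.size)
        # Weight by the Gaussian observation likelihood (log-domain for stability).
        logw = -0.5 * (y - particles) ** 2
        w = np.exp(logw - logw.max())
        w /= w.sum()
        estimates.append(float(np.sum(w * particles)))
        # Multinomial resampling back to equally weighted particles.
        particles = particles[rng.choice(particles.size, size=n_particles, p=w)]
        if t % approx_every == 0:
            # Intermittent approximation: keep only n_sub particles (as if
            # sending a compressed summary), then regrow the set from them.
            kept = rng.choice(particles, size=n_sub, replace=False)
            particles = rng.choice(kept, size=n_particles, replace=True)
    return np.array(estimates)

def simulate(n_steps, seed=1):
    """Simulate states and observations from the same assumed toy model."""
    rng = np.random.default_rng(seed)
    x, xs, ys = 0.0, [], []
    for _ in range(n_steps):
        x = 0.9 * x + 0.5 * rng.normal()
        xs.append(x)
        ys.append(x + rng.normal())
    return np.array(xs), np.array(ys)
```

Comparing the output for a small `n_sub` against a run with `n_sub` equal to `n_particles` gives a rough empirical feel for the extra error the approximation step introduces, which is the quantity the paper bounds theoretically.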
Structure Learning in Coupled Dynamical Systems and Dynamic Causal Modelling
Identifying a coupled dynamical system out of many plausible candidates, each
of which could serve as the underlying generator of some observed measurements,
is a profoundly ill-posed problem that commonly arises when modelling real-world
phenomena. In this review, we detail a set of statistical procedures for
inferring the structure of nonlinear coupled dynamical systems (structure
learning), which has proved useful in neuroscience research. A key focus here
is the comparison of competing models of (i.e., hypotheses about) network
architectures and implicit coupling functions in terms of their Bayesian model
evidence. These methods are collectively referred to as dynamic causal
modelling (DCM). We focus on a relatively new approach that is proving
remarkably useful; namely, Bayesian model reduction (BMR), which enables rapid
evaluation and comparison of models that differ in their network architecture.
We illustrate the usefulness of these techniques through modelling
neurovascular coupling (cellular pathways linking neuronal and vascular
systems), whose function is an active focus of research in neurobiology and the
imaging of coupled neuronal systems.
Efficient Parametric Approximations of Neural Network Function Space Distance
It is often useful to compactly summarize important properties of model
parameters and training data so that they can be used later without storing
and/or iterating over the entire dataset. As a specific case, we consider
estimating the Function Space Distance (FSD) over a training set, i.e. the
average discrepancy between the outputs of two neural networks. We propose a
Linearized Activation Function TRick (LAFTR) and derive an efficient
approximation to FSD for ReLU neural networks. The key idea is to approximate
the architecture as a linear network with stochastic gating. Despite requiring
only one parameter per unit of the network, our approach outcompetes other
parametric approximations with larger memory requirements. Applied to continual
learning, our parametric approximation is competitive with state-of-the-art
nonparametric approximations, which require storing many training examples.
Furthermore, we show its efficacy in estimating influence functions accurately
and detecting mislabeled examples without expensive iterations over the entire
dataset.
Comment: 18 pages, 5 figures, ICML 202
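To make the quantity concrete, the sketch below computes an empirical Function Space Distance by brute force, iterating over every stored input. This is the expensive baseline that a compact parametric approximation is meant to avoid, not the LAFTR method itself. The choice of half the squared Euclidean distance as the "discrepancy", and the helper names `init_mlp`, `mlp_forward` and `empirical_fsd`, are assumptions made for illustration.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random (W, b) pairs for a fully connected network with the given layer sizes."""
    return [(rng.normal(scale=1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """Forward pass of a ReLU MLP; the final layer is linear."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)
    W, b = params[-1]
    return h @ W + b

def empirical_fsd(params_a, params_b, inputs):
    """Average output discrepancy between two networks over a set of inputs.

    Assumption: the discrepancy is half the squared Euclidean distance
    between the two output vectors, averaged over all inputs.
    """
    diff = mlp_forward(params_a, inputs) - mlp_forward(params_b, inputs)
    return 0.5 * float(np.mean(np.sum(diff ** 2, axis=1)))

rng = np.random.default_rng(0)
net_a = init_mlp([4, 8, 3], rng)
net_b = init_mlp([4, 8, 3], rng)
X = rng.normal(size=(32, 4))  # stand-in for a stored training set
print(empirical_fsd(net_a, net_b, X))
```

Note that this computation needs the full input set `X` in memory, which is exactly the storage cost the abstract says a parametric approximation removes.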
IST Austria Thesis
The human ability to recognize objects in complex scenes has driven research in the computer vision field over the past couple of decades. This thesis focuses on the object recognition task in images. That is, given an image, we want the computer system to be able to predict the class of the object that appears in it. A recent successful attempt to bridge the semantic understanding of an image as perceived by humans and by computers uses attribute-based models. Attributes are semantic properties of objects shared across different categories, which both humans and computers can decide on. To explore attribute-based models we take a statistical machine learning approach, and address two key learning challenges in view of the object recognition task: learning augmented attributes as a mid-level discriminative feature representation, and learning with attributes as privileged information. Our main contributions are parametric and non-parametric models and algorithms to address these challenges. In the parametric approach, we explore an autoencoder model combined with the large margin nearest neighbor principle for mid-level feature learning, and linear support vector machines for learning with privileged information. In the non-parametric approach, we propose a supervised Indian Buffet Process for automatic augmentation of semantic attributes, and explore the Gaussian process classification framework for learning with privileged information. A thorough experimental analysis shows the effectiveness of the proposed models in both parametric and non-parametric views.