Dynamic Machine Learning with Least Square Objectives
As of the writing of this thesis, machine learning has become one of the most active research fields. The interest comes from a variety of disciplines, including computer science, statistics, engineering, and medicine. The main idea behind learning from data is that, when an analytical model explaining the observations is hard to find (often in contrast to models in physics such as Newton's laws), a statistical approach can be taken where one or more candidate models are tuned using data.
Since the early 2000s this challenge has grown in two ways: (i) the amount of collected data has grown massively due to the proliferation of digital media, and (ii) the data has become more complex. One example of the latter is high-dimensional datasets, which can, for example, correspond to dyadic interactions between two large groups (such as the customer and product information a retailer collects), or to high-resolution image/video recordings.
Another important issue is the study of dynamic data, which exhibits dependence on time. Virtually all datasets fall into this category, as all data collection is performed over time; however, I use the term dynamic to refer to a system with an explicit temporal dependence. A traditional example is target tracking from the signal processing literature. Here the position of a target is modeled using Newton's laws of motion, which relate it to time via the target's velocity and acceleration.
Dynamic data, as I defined above, poses two important challenges. Firstly, the learning setup is different from the standard theoretical learning setup, also known as Probably Approximately Correct (PAC) learning. To derive PAC learning bounds one assumes a collection of data points sampled independently and identically from a distribution which generates the data. On the other hand, dynamic systems produce correlated outputs. The learning systems we use should accordingly take this difference into consideration. Secondly, as the system is dynamic, it might be necessary to perform the learning online. In this case the learning has to be done in a single pass. Typical applications include target tracking and electricity usage forecasting.
In this thesis I investigate several important dynamic and online learning problems, and develop novel tools to address the shortcomings of previous solutions in the literature. The work is divided into three parts for convenience. The first part is about matrix factorization for time series analysis, and is further divided into two chapters. In the first chapter, matrix factorization is used within a Bayesian framework to model time-varying dyadic interactions, with examples in predicting user-movie ratings and stock prices. In the next chapter, a matrix factorization that uses autoregressive models to forecast future values of multivariate time series is proposed, with applications in predicting electricity usage and traffic conditions. Inspired by the machinery we use in the first part, the second part is about nonlinear Kalman filtering, where a hidden state is estimated over time given observations. The nonlinearity of the system generating the observations is the main challenge here; a divergence minimization approach is used to unify the seemingly unrelated methods in the literature and to propose new ones. This has applications in target tracking and options pricing. The third and last part is about cost-sensitive learning, where a novel method for maximizing the area under the receiver operating characteristic (ROC) curve is proposed. Our method has theoretical guarantees and favorable sample complexity. The method is tested on a variety of benchmark datasets, and also has applications in online advertising.
Uncertainty-aware variational inference for target tracking
In low Earth orbit, target tracking with ground-based assets in the context of situational awareness is particularly difficult. Because of the nonlinear state propagation between measurement arrivals, the inevitably accumulated errors make the target state prediction and the measurement likelihood inaccurate and uncertain. In this paper, optimizable models with learned parameters are constructed to model the state and measurement prediction uncertainties. A closed-loop variational iterative framework is proposed to jointly achieve parameter inference and state estimation, which comprises an uncertainty-aware variational filter (UnAVF). The theoretical expression of the evidence lower bound and the maximization of the variational lower bound are derived without the need for the true states, which reflect the awareness and reduction of uncertainties. The evidence lower bound can also evaluate the estimation performance of other Gaussian density filters, not only the UnAVF. Moreover, two rules, estimation consistency and lower bound consistency, are proposed to guide the initialization of hyperparameters. Finally, the superior performance of UnAVF is demonstrated on an orbit state estimation problem.
Structure Learning in Coupled Dynamical Systems and Dynamic Causal Modelling
Identifying a coupled dynamical system out of many plausible candidates, each
of which could serve as the underlying generator of some observed measurements,
is a profoundly ill posed problem that commonly arises when modelling real
world phenomena. In this review, we detail a set of statistical procedures for
inferring the structure of nonlinear coupled dynamical systems (structure
learning), which has proved useful in neuroscience research. A key focus here
is the comparison of competing models of (i.e., hypotheses about) network
architectures and implicit coupling functions in terms of their Bayesian model
evidence. These methods are collectively referred to as dynamic causal
modelling (DCM). We focus on a relatively new approach that is proving
remarkably useful; namely, Bayesian model reduction (BMR), which enables rapid
evaluation and comparison of models that differ in their network architecture.
We illustrate the usefulness of these techniques through modelling
neurovascular coupling (cellular pathways linking neuronal and vascular
systems), whose function is an active focus of research in neurobiology and the
imaging of coupled neuronal systems.
Linear Time GPs for Inferring Latent Trajectories from Neural Spike Trains
Latent Gaussian process (GP) models are widely used in neuroscience to
uncover hidden state evolutions from sequential observations, mainly in neural
activity recordings. While latent GP models provide a principled and powerful
solution in theory, the intractable posterior in non-conjugate settings
necessitates approximate inference schemes, which may lack scalability. In this
work, we propose cvHM, a general inference framework for latent GP models
leveraging Hida-Matérn kernels and conjugate computation variational
inference (CVI). With cvHM, we are able to perform variational inference of
latent neural trajectories with linear time complexity for arbitrary
likelihoods. The reparameterization of stationary kernels using Hida-Matérn
GPs helps us connect the latent variable models that encode prior assumptions
through dynamical systems to those that encode trajectory assumptions through
GPs. In contrast to previous work, we use bidirectional information filtering,
leading to a more concise implementation. Furthermore, we employ the Whittle
approximate likelihood to achieve highly efficient hyperparameter learning.
Comment: Published at ICML 202