750 research outputs found

    Particle Learning for General Mixtures

    Get PDF
    This paper develops particle learning (PL) methods for the estimation of general mixture models. The approach is distinguished from alternative particle filtering methods in two major ways. First, each iteration begins by resampling particles according to posterior predictive probability, leading to a more efficient set for propagation. Second, each particle tracks only the "essential state vector" thus leading to reduced dimensional inference. In addition, we describe how the approach will apply to more general mixture models of current interest in the literature; it is hoped that this will inspire a greater number of researchers to adopt sequential Monte Carlo methods for fitting their sophisticated mixture based models. Finally, we show that PL leads to straight forward tools for marginal likelihood calculation and posterior cluster allocation.Business Administratio

    Dynamic Machine Learning with Least Square Objectives

    Get PDF
    As of the writing of this thesis, machine learning has become one of the most active research fields. The interest comes from a variety of disciplines which include computer science, statistics, engineering, and medicine. The main idea behind learning from data is that, when an analytical model explaining the observations is hard to find ---often in contrast to the models in physics such as Newton's laws--- a statistical approach can be taken where one or more candidate models are tuned using data. Since the early 2000's this challenge has grown in two ways: (i) The amount of collected data has seen a massive growth due to the proliferation of digital media, and (ii) the data has become more complex. One example for the latter is the high dimensional datasets, which can for example correspond to dyadic interactions between two large groups (such as customer and product information a retailer collects), or to high resolution image/video recordings. Another important issue is the study of dynamic data, which exhibits dependence on time. Virtually all datasets fall into this category as all data collection is performed over time, however I use the term dynamic to hint at a system with an explicit temporal dependence. A traditional example is target tracking from signal processing literature. Here the position of a target is modeled using Newton's laws of motion, which relates it to time via the target's velocity and acceleration. Dynamic data, as I defined above, poses two important challenges. Firstly, the learning setup is different from the standard theoretical learning setup, also known as Probably Approximately Correct (PAC) learning. To derive PAC learning bounds one assumes a collection of data points sampled independently and identically from a distribution which generates the data. On the other hand, dynamic systems produce correlated outputs. The learning systems we use should accordingly take this difference into consideration. Secondly, as the system is dynamic, it might be necessary to perform the learning online. In this case the learning has to be done in a single pass. Typical applications include target tracking and electricity usage forecasting. In this thesis I investigate several important dynamic and online learning problems, where I develop novel tools to address the shortcomings of the previous solutions in the literature. The work is divided into three parts for convenience. The first part is about matrix factorization for time series analysis which is further divided into two chapters. In the first chapter, matrix factorization is used within a Bayesian framework to model time-varying dyadic interactions, with examples in predicting user-movie ratings and stock prices. In the next chapter, a matrix factorization which uses autoregressive models to forecast future values of multivariate time series is proposed, with applications in predicting electricity usage and traffic conditions. Inspired by the machinery we use in the first part, the second part is about nonlinear Kalman filtering, where a hidden state is estimated over time given observations. The nonlinearity of the system generating the observations is the main challenge here, where a divergence minimization approach is used to unify the seemingly unrelated methods in the literature, and propose new ones. This has applications in target tracking and options pricing. The third and last part is about cost sensitive learning, where a novel method for maximizing area under receiver operating characteristics curve is proposed. Our method has theoretical guarantees and favorable sample complexity. The method is tested on a variety of benchmark datasets, and also has applications in online advertising

    A New Perspective and Extension of the Gaussian Filter

    Full text link
    The Gaussian Filter (GF) is one of the most widely used filtering algorithms; instances are the Extended Kalman Filter, the Unscented Kalman Filter and the Divided Difference Filter. GFs represent the belief of the current state by a Gaussian with the mean being an affine function of the measurement. We show that this representation can be too restrictive to accurately capture the dependences in systems with nonlinear observation models, and we investigate how the GF can be generalized to alleviate this problem. To this end, we view the GF from a variational-inference perspective. We analyse how restrictions on the form of the belief can be relaxed while maintaining simplicity and efficiency. This analysis provides a basis for generalizations of the GF. We propose one such generalization which coincides with a GF using a virtual measurement, obtained by applying a nonlinear function to the actual measurement. Numerical experiments show that the proposed Feature Gaussian Filter (FGF) can have a substantial performance advantage over the standard GF for systems with nonlinear observation models.Comment: Will appear in Robotics: Science and Systems (R:SS) 201

    The Infinite Latent Events Model

    Get PDF
    We present the Infinite Latent Events Model, a nonparametric hierarchical Bayesian distribution over infinite dimensional Dynamic Bayesian Networks with binary state representations and noisy-OR-like transitions. The distribution can be used to learn structure in discrete timeseries data by simultaneously inferring a set of latent events, which events fired at each timestep, and how those events are causally linked. We illustrate the model on a sound factorization task, a network topology identification task, and a video game task.NTT Communication Science LaboratoriesUnited States. Air Force Office of Scientific Research (AFOSR FA9550-07-1-0075)United States. Office of Naval Research (ONR N00014-07-1-0937)National Science Foundation (U.S.) (Graduate Research Fellowship)United States. Army Research Office (ARO W911NF-08-1-0242)James S. McDonnell Foundation (Causal Learning Collaborative Initiative
    corecore