Efficient inference and sampling methods for probabilistic models

Abstract

Probabilistic modeling is the principal practical approach to designing systems that learn from data, with numerous applications across science and engineering. Given a probabilistic model of a system and its parameters, queries of interest about the system can be answered by performing inference on the corresponding distribution. Given the importance of inference methods in machine learning, the goal of this dissertation is to develop efficient approximate inference methods for two practical problems.

First, we consider non-intrusive home-appliance load monitoring, which can be modeled with a factorial hidden Markov model (FHMM). FHMMs have applications in several fields, including speech recognition, home-appliance monitoring, computer vision, and fault detection. Under an FHMM, the load-disaggregation problem becomes an FHMM inference problem expressed as a non-convex quadratic integer optimization program. This thesis presents a computationally efficient method that finds an approximate solution by combining convex semidefinite relaxations, randomized rounding, and a scalable ADMM method. Simulation results on both synthetic and real-world datasets demonstrate that our method outperforms previous state-of-the-art solutions.

Second, we turn to Markov chain Monte Carlo (MCMC) methods, which are widely used in machine learning. A major difficulty with MCMC is designing chains that mix quickly over the whole state space, and in particular selecting the parameters of an MCMC algorithm. In this thesis we take a different approach: similarly to parallel MCMC methods, instead of searching for a single chain that samples from the whole distribution, we combine samples from several chains run in parallel, each exploring only part of the state space. The chains are prioritized based on the kernel Stein discrepancy.
The samples from the independent chains are combined using a novel technique for estimating the probabilities of different regions of the sample space. Experimental results demonstrate that the proposed algorithm provides significant speed-ups on a variety of sampling problems.
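As a rough illustration of the relaxation-and-rounding approach described above for the load-disaggregation problem (not the thesis's actual algorithm), the sketch below applies Goemans-Williamson-style randomized rounding to a feasible point of the semidefinite relaxation of a toy binary quadratic program. The toy instance, the choice of feasible point, and all variable names are assumptions for illustration only; a real pipeline would obtain `X` from an SDP solver.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance: maximize x^T Q x over x in {-1, +1}^n.
n = 6
A = rng.standard_normal((n, n))
Q = A + A.T  # symmetric objective matrix

# Stand-in for the SDP solution: in the relaxation, X is a PSD matrix
# with unit diagonal approximating x x^T. Here we skip the solver and
# use a trivially feasible X purely for illustration.
X = np.eye(n)

# Randomized rounding: factor X = L L^T, draw a Gaussian direction g,
# and round each coordinate of L g to a sign.
L = np.linalg.cholesky(X + 1e-9 * np.eye(n))

def round_once(rng):
    g = rng.standard_normal(n)
    return np.sign(L @ g)

# Keep the best of several independent roundings.
best = max((round_once(rng) for _ in range(50)),
           key=lambda x: x @ Q @ x)
print(best, best @ Q @ best)
```

Repeating the rounding step and keeping the best candidate is cheap, since each trial costs only a matrix-vector product and an objective evaluation.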
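The kernel Stein discrepancy used to prioritize chains can be estimated from samples alone, given the score function (gradient of the log density) of the target. Below is a minimal sketch using an inverse multiquadric (IMQ) base kernel and a standard-normal target; the kernel parameters, the V-statistic estimator, and the example setup are assumptions for illustration, not the thesis's exact choices.

```python
import numpy as np

def ksd_imq(x, score, c=1.0, beta=-0.5):
    """V-statistic estimate of the squared kernel Stein discrepancy
    with an IMQ base kernel k(x, y) = (c^2 + ||x - y||^2)^beta.
    x: (n, d) samples; score(x): grad log p at each sample, shape (n, d)."""
    n, d = x.shape
    s = score(x)
    r = x[:, None, :] - x[None, :, :]      # pairwise differences x_i - x_j
    r2 = np.sum(r * r, axis=-1)            # squared pairwise distances
    u = c ** 2 + r2
    k = u ** beta                          # IMQ kernel values
    gc = 2.0 * beta * u ** (beta - 1.0)    # coefficient of grad_x k = gc * r
    # Stein kernel: s_i.s_j k + s_i.grad_y k + s_j.grad_x k + tr(grad_x grad_y k)
    term1 = (s @ s.T) * k
    term2 = gc * (np.einsum('ijk,jk->ij', r, s) - np.einsum('ijk,ik->ij', r, s))
    term3 = -2.0 * beta * (d * u ** (beta - 1.0)
                           + 2.0 * (beta - 1.0) * u ** (beta - 2.0) * r2)
    return np.mean(term1 + term2 + term3)

# Example: chains targeting a standard normal. A chain stuck far from the
# mode yields a much larger KSD and would receive lower priority.
rng = np.random.default_rng(1)
good = rng.standard_normal((200, 1))       # on-target samples
bad = good + 3.0                           # samples from a badly shifted chain
score = lambda x: -x                       # score of N(0, 1)
ksd_good, ksd_bad = ksd_imq(good, score), ksd_imq(bad, score)
print(ksd_good, ksd_bad)
```

Because the estimator needs only the samples and the target's score, it can rank chains without knowing the target's normalizing constant, which is what makes it suitable as a prioritization signal.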
