Bayesian non-parametric models for time segmentation and regression
University of Technology Sydney, Faculty of Engineering and Information Technology. Non-parametric Bayesian modelling offers a principled way of avoiding model selection, such as pre-defining the number of modes in a mixture model or the optimal number of factors in factor analysis. Instead, Bayesian non-parametric methods allow the data to determine the complexity of the model. In particular, the hierarchical Dirichlet process (HDP) is used in a variety of applications to infer an arbitrary number of classes from a set of samples. Within the temporal modelling paradigm, Bayesian non-parametrics is used to model sequential data by integrating HDP priors into state-space models such as the HMM, constructing the HDP-HMM. In latent factor modelling and dimensionality reduction, the Indian buffet process (IBP) is a well-known method capable of sparse modelling and of selecting an arbitrary number of factors among often high-dimensional features.
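The IBP's generative process described above is easy to simulate directly: each "customer" (data point) samples existing "dishes" (features) in proportion to their popularity, then tries a Poisson-distributed number of new dishes. A minimal sketch (function and variable names are my own, not the thesis's):

```python
import numpy as np

def sample_ibp(num_customers, alpha, rng=None):
    """Draw a binary feature-assignment matrix Z from the Indian buffet process.

    Customer n takes existing dish k with probability m_k / n (m_k = how many
    earlier customers took it), then samples Poisson(alpha / n) new dishes.
    """
    rng = np.random.default_rng(rng)
    dish_counts = []          # m_k: number of customers per dish so far
    rows = []
    for n in range(1, num_customers + 1):
        # revisit existing dishes in proportion to popularity
        row = [rng.random() < m / n for m in dish_counts]
        for k, taken in enumerate(row):
            if taken:
                dish_counts[k] += 1
        # try a Poisson number of brand-new dishes
        new_dishes = rng.poisson(alpha / n)
        dish_counts.extend([1] * new_dishes)
        row.extend([True] * new_dishes)
        rows.append(row)
    # pad rows to a rectangular 0/1 matrix (later dishes are 0 for earlier rows)
    total_dishes = len(dish_counts)
    Z = np.zeros((num_customers, total_dishes), dtype=int)
    for i, row in enumerate(rows):
        Z[i, : len(row)] = row
    return Z
```

The number of columns (features) in the returned matrix is not fixed in advance; it grows with the data and with `alpha`, which is exactly the property the abstract exploits for selecting an arbitrary number of latent factors.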
In this PhD thesis, we apply the above methods to propose novel solutions to two prominent problems. The first model, named 'AdOn HDP-HMM', is an adaptive online system based on the HDP-HMM. It is capable of segmenting and classifying sequential data over an unlimited number of classes while meeting the memory and delay constraints of streaming contexts. The model is further enhanced by a set of learning rates that tune its adaptability by determining the extent to which the model retains its previous parameters or adapts to new data. Empirical results on several variants of synthetic and action-recognition data show remarkable performance, particularly when adaptive learning rates are used for evolving sequences.
The second proposed solution is an elaborate factor regression model, named non-parametric conditional factor regression (NCFR), which caters for multivariate prediction while preserving the correlations in the response layer. NCFR enhances factor regression by integrating the IBP to infer the optimal number of latent factors in a sparse model. Thanks to this data-driven approach, NCFR largely avoids over-fitting even when the ratio of available samples to dimensions is very low. Experimental results on three diverse datasets give evidence of its strong predictive performance, resilience to over-fitting, good mixing, and computational efficiency.
A Large-Scale Study of Modern Code Review and Security in Open Source Projects.
Taming Resource Heterogeneity In Distributed ML Training With Dynamic Batching
Current techniques and systems for distributed model training mostly assume that clusters are comprised of homogeneous servers with constant resource availability. However, cluster heterogeneity is pervasive in computing infrastructure, and is a fundamental characteristic of low-cost transient resources (such as EC2 spot instances). In this paper, we develop a dynamic batching technique for distributed data-parallel training that adjusts the mini-batch size on each worker based on its resource availability and throughput. Our mini-batch controller seeks to equalize iteration times across all workers, and facilitates training on clusters comprised of servers with different amounts of CPU and GPU resources. This variable mini-batch technique uses proportional control and ideas from PID controllers to find stable mini-batch sizes. Our empirical evaluation shows that dynamic batching can reduce model training times by more than 4x on heterogeneous clusters.
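The proportional-control idea in this abstract can be sketched in a few lines: estimate each worker's per-sample throughput from its last iteration, move its batch size a fraction of the way toward the size that would finish in the mean iteration time, then renormalize so the global mini-batch is unchanged. This is an illustrative sketch under my own naming, not the paper's implementation:

```python
def update_batch_sizes(batch_sizes, iter_times, global_batch, gain=0.5):
    """One proportional-control step toward equalized iteration times.

    batch_sizes: current per-worker mini-batch sizes
    iter_times:  measured per-worker iteration times (seconds)
    global_batch: total mini-batch size to preserve across workers
    gain: proportional gain in (0, 1]; smaller = more damped updates
    """
    mean_time = sum(iter_times) / len(iter_times)
    proposed = []
    for b, t in zip(batch_sizes, iter_times):
        rate = b / t                      # samples/sec on this worker
        target = rate * mean_time         # batch that would take the mean time
        stepped = b + gain * (target - b) # proportional step toward target
        proposed.append(max(1.0, stepped))
    # renormalize so the sum of per-worker batches stays at global_batch
    scale = global_batch / sum(proposed)
    return [max(1, round(b * scale)) for b in proposed]
```

Run once per iteration, slow workers shed samples and fast workers absorb them, so iteration times converge without changing the effective global batch size (the ingredient the paper highlights for keeping training statistics stable).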
Information handling: Concepts which emerged in practical situations and are analysed cybernetically
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University
The Design and Implementation of Low-Latency Prediction Serving Systems
Machine learning is being deployed in a growing number of applications which demand real-time, accurate, and cost-efficient predictions under heavy query load. These applications employ a variety of machine learning frameworks and models, often composing several models within the same application. However, most machine learning frameworks and systems are optimized for model training rather than deployment. In this thesis, I discuss three prediction serving systems designed to meet the needs of modern interactive machine learning applications. The key idea in this work is to use a decoupled, layered design that interposes serving systems on top of training frameworks to build low-latency, scalable serving systems. Velox introduced this decoupled architecture to enable fast online learning and model personalization in response to feedback. Clipper generalized this architecture to be framework-agnostic and introduced a set of optimizations to reduce and bound prediction latency and to improve prediction throughput, accuracy, and robustness without modifying the underlying machine learning frameworks. InferLine provisions and manages the individual stages of prediction pipelines to minimize cost while meeting end-to-end tail latency constraints.