Nonparametric Unsupervised Classification
Unsupervised classification methods learn a discriminative classifier from
unlabeled data, which has proven to be an effective way of simultaneously
clustering the data and training a classifier on it. Various unsupervised
classification methods obtain appealing results with classifiers learned in
this manner. However, with the exception of unsupervised SVM, existing methods
do not consider the misclassification error of the unsupervised classifiers,
so their performance is not fully evaluated.
In this work, we study the misclassification error of two popular classifiers,
i.e. the nearest neighbor classifier (NN) and the plug-in classifier, in the
setting of unsupervised classification.
Comment: Submitted to ALT 201
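The abstract gives no code; as a generic illustration of the two rules it studies, the 1-nearest-neighbour and plug-in classifiers can be sketched as below. The clustering step supplying the pseudo-labels, the kernel smoother, and all data here are hypothetical choices, not the paper's construction.

```python
import numpy as np

def nn_classify(X_train, y_train, x):
    """1-nearest-neighbour rule: copy the label of the closest training point."""
    d = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(d)]

def plugin_classify(X_train, y_train, x, h=1.0):
    """Plug-in rule: estimate eta(x) = P(Y=1 | X=x) with a kernel smoother,
    then threshold at 1/2."""
    w = np.exp(-np.linalg.norm(X_train - x, axis=1) ** 2 / (2 * h ** 2))
    eta = np.dot(w, y_train) / w.sum()
    return int(eta >= 0.5)

# In the unsupervised setting the "labels" come from a clustering step,
# e.g. assigning each point to the nearer of two centroids (illustrative).
X = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9]])
centroids = np.array([[0.1, 0.05], [3.05, 2.95]])
y = np.argmin(np.linalg.norm(X[:, None] - centroids[None], axis=2), axis=1)

print(nn_classify(X, y, np.array([0.1, 0.0])))    # lands in the first cluster
print(plugin_classify(X, y, np.array([2.9, 3.0])))
```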
SKYNET: an efficient and robust neural network training tool for machine learning in astronomy
We present the first public release of our generic neural network training
algorithm, called SkyNet. This efficient and robust machine learning tool is
able to train large and deep feed-forward neural networks, including
autoencoders, for use in a wide range of supervised and unsupervised learning
applications, such as regression, classification, density estimation,
clustering and dimensionality reduction. SkyNet uses a `pre-training' method to
obtain a set of network parameters that has empirically been shown to be close
to a good solution, followed by further optimisation using a regularised
variant of Newton's method, where the level of regularisation is determined and
adjusted automatically; the latter uses second-order derivative information to
improve convergence, but without the need to evaluate or store the full Hessian
matrix, by using a fast approximate method to calculate Hessian-vector
products. This combination of methods allows for the training of complicated
networks that are difficult to optimise using standard backpropagation
techniques. SkyNet employs convergence criteria that naturally prevent
overfitting, and also includes a fast algorithm for estimating the accuracy of
network outputs. The utility and flexibility of SkyNet are demonstrated by
application to a number of toy problems, and to astronomical problems focusing
on the recovery of structure from blurred and noisy images, the identification
of gamma-ray bursters, and the compression and denoising of galaxy images. The
SkyNet software, which is implemented in standard ANSI C and fully parallelised
using MPI, is available at http://www.mrao.cam.ac.uk/software/skynet/.
Comment: 19 pages, 21 figures, 7 tables; this version is a re-submission to
MNRAS in response to referee comments; software available at
http://www.mrao.cam.ac.uk/software/skynet
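The abstract notes that second-order information is used without forming or storing the Hessian, via fast approximate Hessian-vector products. SkyNet's exact scheme is not given here; a standard finite-difference approximation of Hv, which needs only two gradient evaluations, is sketched below as one common realisation of the idea.

```python
import numpy as np

def hessian_vector_product(grad_fn, theta, v, eps=1e-5):
    """Approximate H @ v by a central finite difference of gradients:
    Hv ~ (grad(theta + eps*v) - grad(theta - eps*v)) / (2*eps).
    The full Hessian is never evaluated or stored."""
    return (grad_fn(theta + eps * v) - grad_fn(theta - eps * v)) / (2 * eps)

# Toy check on f(theta) = 0.5 * theta^T A theta, whose Hessian is A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
grad = lambda th: A @ th
theta = np.array([0.5, -1.0])
v = np.array([1.0, 1.0])
print(hessian_vector_product(grad, theta, v))  # close to A @ v = [4., 3.]
```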
Model-based clustering via linear cluster-weighted models
A novel family of twelve mixture models with random covariates, nested in the
linear t cluster-weighted model (CWM), is introduced for model-based
clustering. The linear t CWM was recently presented as a robust alternative
to the better known linear Gaussian CWM. The proposed family of models provides
a unified framework that also includes the linear Gaussian CWM as a special
case. Maximum likelihood parameter estimation is carried out within the EM
framework, and both the BIC and the ICL are used for model selection. A simple
and effective hierarchical random initialization is also proposed for the EM
algorithm. The novel model-based clustering technique is illustrated in some
applications to real data. Finally, a simulation study for evaluating the
performance of the BIC and the ICL is presented.
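The BIC and ICL used for model selection above can be sketched generically. The -2 log L + k log n convention and the entropy-penalty form of the ICL are assumed here, and the responsibilities are made-up numbers for illustration; they are not from the paper.

```python
import numpy as np

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion (this convention: lower is better)."""
    return -2.0 * loglik + n_params * np.log(n_obs)

def icl(loglik, n_params, n_obs, resp):
    """ICL = BIC plus an entropy penalty on the soft cluster
    responsibilities `resp` (n_obs x K): fuzzier partitions are penalised."""
    r = np.clip(resp, 1e-12, 1.0)
    entropy = -np.sum(r * np.log(r))
    return bic(loglik, n_params, n_obs) + 2.0 * entropy

# Two candidate criteria for the same fitted model: pick the smaller value
# when comparing models.
resp = np.array([[0.9, 0.1], [0.2, 0.8], [0.95, 0.05]])
print(bic(-120.0, 8, 3))
print(icl(-120.0, 8, 3, resp))
```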
Optimal Spectrum Sensing Policy with Traffic Classification in RF-Powered CRNs
An orthogonal frequency division multiple access (OFDMA)-based primary user
(PU) network is considered, which provides different spectral access/energy
harvesting opportunities in RF-powered cognitive radio networks (CRNs). In this
scenario, we propose an optimal spectrum sensing policy for opportunistic
spectrum access/energy harvesting under both the PU collision and energy
causality constraints. PU subchannels can have different traffic patterns and
exhibit distinct idle/busy frequencies, due to which the spectral access/energy
harvesting opportunities are application specific. The secondary user (SU)
collects
traffic pattern information through observation of the PU subchannels and
classifies the idle/busy period statistics for each subchannel. Based on the
statistics, we invoke stochastic models for evaluating SU capacity by which the
energy detection threshold for spectrum sensing can be adjusted with higher
sensing accuracy. To this end, we employ a Markov decision process (MDP)
model obtained by quantizing the SU battery level, and a duty cycle model
obtained from the ratio of the average energy harvesting and consumption
rates.
We demonstrate the effectiveness of the proposed stochastic models through
comparison with the optimal one obtained from an exhaustive method.
Comment: 14 pages, 12 figures
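The duty-cycle model mentioned above is described only as a ratio of average harvested energy to energy consumption; its full form is not in the abstract. A minimal sketch of that idea, under the assumption that long-run energy causality (active time x consumption <= harvested energy) caps the SU's active fraction, is:

```python
def duty_cycle(avg_harvest_rate, consume_rate):
    """Fraction of time the SU can stay active if, on average, spent energy
    must not exceed harvested energy (energy causality constraint)."""
    if consume_rate <= 0:
        raise ValueError("consumption rate must be positive")
    return min(1.0, avg_harvest_rate / consume_rate)

# e.g. harvesting 2 mW on average while sensing/transmitting costs 10 mW
print(duty_cycle(2.0, 10.0))  # 0.2
```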
On the Consistency of Graph-based Bayesian Learning and the Scalability of Sampling Algorithms
A popular approach to semi-supervised learning proceeds by endowing the input
data with a graph structure in order to extract geometric information and
incorporate it into a Bayesian framework. We introduce new theory that gives
appropriate scalings of graph parameters that provably lead to a well-defined
limiting posterior as the size of the unlabeled data set grows. Furthermore, we
show that these consistency results have profound algorithmic implications.
When consistency holds, carefully designed graph-based Markov chain Monte Carlo
algorithms are proved to have a uniform spectral gap, independent of the number
of unlabeled inputs. Several numerical experiments corroborate both the
statistical consistency and the algorithmic scalability established by the
theory.
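The graph structure endowed on the input data is typically summarised by a graph Laplacian; the paper's results concern how its parameters (e.g. the connectivity radius) must scale with the data size. As a generic illustration only, an unnormalised Laplacian for a proximity graph can be built as follows:

```python
import numpy as np

def graph_laplacian(X, eps):
    """Unnormalised graph Laplacian L = D - W for a proximity graph:
    connect points within distance eps with unit weight."""
    n = len(X)
    d2 = np.sum((X[:, None] - X[None]) ** 2, axis=2)
    W = ((d2 <= eps ** 2) & ~np.eye(n, dtype=bool)).astype(float)
    D = np.diag(W.sum(axis=1))
    return D - W

X = np.array([[0.0], [0.1], [0.2], [5.0]])
L = graph_laplacian(X, eps=0.15)
print(L.sum())               # every row of a Laplacian sums to zero
print(np.allclose(L, L.T))   # and it is symmetric
```

In the consistency theory, eps would shrink at a prescribed rate as the number of unlabeled points grows; the fixed value here is purely illustrative.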
Mind the nuisance: Gaussian process classification using privileged noise
The learning with privileged information setting has recently attracted a lot
of attention within the machine learning community, as it allows the
integration of additional knowledge into the training process of a classifier,
even when this comes in the form of a data modality that is not available at
test time. Here, we show that privileged information can naturally be treated
as noise in the latent function of a Gaussian process classifier (GPC). That
is, in contrast to the standard GPC setting, the latent function is not just a
nuisance but a feature: it becomes a natural measure of confidence about the
training data by modulating the slope of the GPC probit likelihood function.
Extensive experiments on public datasets show that the proposed GPC method
using privileged noise, called GPC+, improves over a standard GPC without
privileged knowledge, and also over the current state-of-the-art SVM-based
method, SVM+. Moreover, we show that advanced neural networks and deep
learning methods can be compressed as privileged information.
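The key mechanism above, privileged noise modulating the slope of the probit likelihood, can be sketched in isolation: a per-example noise scale flattens or steepens Phi(f / sigma). This is only the likelihood term, with made-up values; the full GPC+ inference is not reproduced here.

```python
import math

def probit_likelihood(f, noise):
    """Probit likelihood Phi(f / sigma): a larger per-example noise scale
    flattens the slope, i.e. expresses less confidence in that example."""
    return 0.5 * (1.0 + math.erf(f / (noise * math.sqrt(2.0))))

f = 1.0
print(probit_likelihood(f, noise=0.5))  # steep slope: confident example
print(probit_likelihood(f, noise=3.0))  # flat slope: uncertain example
```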
Machine learning in acoustics: theory and applications
Acoustic data provide scientific and engineering insights in fields ranging
from biology and communications to ocean and Earth science. We survey the
recent advances and transformative potential of machine learning (ML),
including deep learning, in the field of acoustics. ML is a broad family of
techniques, which are often based in statistics, for automatically detecting
and utilizing patterns in data. Relative to conventional acoustics and signal
processing, ML is data-driven. Given sufficient training data, ML can discover
complex relationships between features and desired labels or actions, or
between features themselves. With large volumes of training data, ML can
discover models describing complex acoustic phenomena such as human speech and
reverberation. ML in acoustics is rapidly developing with compelling results
and significant future promise. We first introduce ML, then highlight ML
developments in four acoustics research areas: source localization in speech
processing, source localization in ocean acoustics, bioacoustics, and
environmental sounds in everyday scenes.
Comment: Published with free access in Journal of the Acoustical Society of
America, 27 Nov. 201
A bi-partite generative model framework for analyzing and simulating large scale multiple discrete-continuous travel behaviour data
The emergence of data-driven demand analysis has led to the increased use of
generative modelling to learn the probabilistic dependencies between random
variables. Although their use has mostly been limited to image recognition
and classification in recent years, generative machine learning algorithms
can be a powerful tool for travel behaviour research, as they can replicate
travel behaviour from the underlying properties of data structures. In this
paper, we examine the use of a generative machine learning approach for
analyzing
multiple discrete-continuous (MDC) travel behaviour data. We provide a
plausible perspective of how we can exploit the use of machine learning
techniques to interpret the underlying heterogeneities in the data. We show
that generative models are conceptually similar to the choice selection
behaviour process through information entropy and variational Bayesian
inference. Without loss of generality, we consider a restricted Boltzmann
machine (RBM) based algorithm with multiple discrete-continuous layers,
formulated as a variational Bayesian inference optimization problem. We
systematically describe the proposed machine learning algorithm and develop a
process of analyzing travel behaviour data from a generative learning
perspective. We show parameter stability from model analysis and simulation
tests on an open dataset with multiple discrete-continuous dimensions from a
data size of 293,330 observations. For interpretability, we derive the
conditional probabilities and elasticities, and perform statistical analysis
on the latent variables. We show that our model can generate statistically
similar
data distributions for travel forecasting and prediction and performs better
than purely discriminative methods in validation. Our results indicate that
latent constructs in generative models can accurately represent the joint
distribution consistently on MDC data.
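The paper's RBM has multiple discrete-continuous layers and is fit by variational Bayesian inference; those details are beyond the abstract. As a reference point only, one contrastive-divergence (CD-1) update for a plain binary RBM, the standard building block the paper generalises, looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM with
    visible bias b, hidden bias c, and weights W (n_vis x n_hid)."""
    ph0 = sigmoid(v0 @ W + c)                      # hidden probabilities
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ W.T + b)                    # reconstructed visibles
    ph1 = sigmoid(pv1 @ W + c)
    W += lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
    b += lr * (v0 - pv1)
    c += lr * (ph0 - ph1)
    return W, b, c

n_vis, n_hid = 6, 3
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b, c = np.zeros(n_vis), np.zeros(n_hid)
v = np.array([1.0, 1, 0, 0, 1, 0])  # one illustrative binary observation
W, b, c = cd1_step(v, W, b, c)
print(W.shape, b.shape, c.shape)
```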
Parsimonious Topic Models with Salient Word Discovery
We propose a parsimonious topic model for text corpora. In related models
such as Latent Dirichlet Allocation (LDA), all words are modeled
topic-specifically, even though many words occur with similar frequencies
across different topics. Our modeling determines salient words for each topic,
which have topic-specific probabilities, with the rest explained by a universal
shared model. Further, in LDA all topics are in principle present in every
document. By contrast our model gives sparse topic representation, determining
the (small) subset of relevant topics for each document. We derive a Bayesian
Information Criterion (BIC), balancing model complexity and goodness of fit.
Here, interestingly, we identify an effective sample size and corresponding
penalty specific to each parameter type in our model. We minimize BIC to
jointly determine our entire model -- the topic-specific words,
document-specific topics, all model parameter values, {\it and} the total
number of topics -- in a wholly unsupervised fashion. Results on three text
corpora and an image dataset show that our model achieves higher test set
likelihood and better agreement with ground-truth class labels, compared to LDA
and to a model designed to incorporate sparsity.
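The paper selects salient words by minimizing its BIC; that criterion is not reproduced here. Purely to illustrate the salient-versus-shared distinction, the hypothetical rule below flags words whose in-topic frequency deviates markedly from a universal background model, leaving the rest to be explained by the shared model.

```python
import numpy as np

def salient_words(topic_counts, background_probs, threshold=2.0):
    """Hypothetical illustration: flag words whose in-topic frequency
    differs from the shared background model by more than `threshold`x;
    only these would receive topic-specific probabilities."""
    topic_probs = topic_counts / topic_counts.sum()
    ratio = topic_probs / background_probs
    return np.where((ratio > threshold) | (ratio < 1.0 / threshold))[0]

counts = np.array([40.0, 5.0, 5.0, 50.0])        # word counts in one topic
background = np.array([0.25, 0.25, 0.25, 0.25])  # universal shared model
print(salient_words(counts, background))
```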
Application of Bayes' theorem for pulse shape discrimination
A Bayesian approach is proposed for pulse shape discrimination of photons and
neutrons in liquid organic scintillators. Instead of drawing a decision
boundary, each pulse is assigned a photon or neutron confidence probability.
This allows for photon and neutron classification on an event-by-event basis.
The sum of those confidence probabilities is used to estimate the number of
photon and neutron instances in the data. An iterative scheme, similar to an
expectation-maximization algorithm for Gaussian mixtures, is used to infer the
ratio of photons-to-neutrons in each measurement. Therefore, the probability
space adapts to data with varying photon-to-neutron ratios. A time-correlated
measurement of Am-Be and separate measurements of Cs, Co and
Th photon sources were used to construct libraries of neutrons and
photons. These libraries were then used to produce synthetic data sets with
varying ratios of photons-to-neutrons. The probability-weighted method that we
implemented was found to maintain a neutron acceptance rate of up to 90% at
photon-to-neutron ratios up to 2000, and performed 9% better than the
decision-boundary approach. Furthermore, the iterative approach appropriately
changed the
probability space with an increasing number of photons, which kept the neutron
population estimate from increasing unrealistically.
Comment: 8 pages, 9 figures
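The iterative scheme above, similar to EM for a mixture, can be sketched minimally: given each pulse's likelihood under the photon and neutron libraries, alternate between computing per-pulse photon confidences and re-estimating the photon fraction. The likelihood values below are made up; the sum of the confidences estimates the photon count, as in the abstract.

```python
import numpy as np

def em_mixing_ratio(lik_photon, lik_neutron, n_iter=100):
    """EM-style iteration for a two-component mixture: alternately compute
    per-pulse photon confidences (E-step) and the photon fraction (M-step)."""
    pi = 0.5  # initial photon fraction
    for _ in range(n_iter):
        p_photon = pi * lik_photon / (pi * lik_photon + (1 - pi) * lik_neutron)
        pi = p_photon.mean()
    return pi, p_photon

# hypothetical per-pulse likelihoods under the photon / neutron libraries
lik_p = np.array([0.9, 0.8, 0.1, 0.05])
lik_n = np.array([0.1, 0.2, 0.9, 0.95])
pi, conf = em_mixing_ratio(lik_p, lik_n)
print(round(pi, 2))            # estimated photon fraction
print(conf.sum())              # estimated number of photons in the data
```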