Nonparametric Unsupervised Classification
Unsupervised classification methods learn a discriminative classifier from
unlabeled data, which has proven to be an effective way of simultaneously
clustering the data and training a classifier on it. Various unsupervised
classification methods obtain appealing results with classifiers learned in
this manner. However, with the exception of unsupervised SVM, existing methods
do not consider the misclassification error of the unsupervised classifiers,
so their performance is not fully evaluated.
In this work, we study the misclassification error of two popular classifiers,
i.e. the nearest neighbor classifier (NN) and the plug-in classifier, in the
setting of unsupervised classification.
Comment: Submitted to ALT 201
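The abstract gives no code; as a generic illustration of the two rules it studies, the 1-nearest-neighbour and plug-in classifiers can be sketched as below. The clustering step supplying the pseudo-labels, the kernel smoother, and all data here are hypothetical choices, not the paper's construction.

```python
import numpy as np

def nn_classify(X_train, y_train, x):
    """1-nearest-neighbour rule: copy the label of the closest training point."""
    d = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(d)]

def plugin_classify(X_train, y_train, x, h=1.0):
    """Plug-in rule: estimate eta(x) = P(Y=1 | X=x) with a kernel smoother,
    then threshold at 1/2."""
    w = np.exp(-np.linalg.norm(X_train - x, axis=1) ** 2 / (2 * h ** 2))
    eta = np.dot(w, y_train) / w.sum()
    return int(eta >= 0.5)

# In the unsupervised setting the "labels" come from a clustering step,
# e.g. assigning each point to the nearer of two centroids (illustrative).
X = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9]])
centroids = np.array([[0.1, 0.05], [3.05, 2.95]])
y = np.argmin(np.linalg.norm(X[:, None] - centroids[None], axis=2), axis=1)

print(nn_classify(X, y, np.array([0.1, 0.0])))    # lands in the first cluster
print(plugin_classify(X, y, np.array([2.9, 3.0])))
```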
SKYNET: an efficient and robust neural network training tool for machine learning in astronomy
We present the first public release of our generic neural network training
algorithm, called SkyNet. This efficient and robust machine learning tool is
able to train large and deep feed-forward neural networks, including
autoencoders, for use in a wide range of supervised and unsupervised learning
applications, such as regression, classification, density estimation,
clustering and dimensionality reduction. SkyNet uses a `pre-training' method to
obtain a set of network parameters that has empirically been shown to be close
to a good solution, followed by further optimisation using a regularised
variant of Newton's method, where the level of regularisation is determined and
adjusted automatically; the latter uses second-order derivative information to
improve convergence, but without the need to evaluate or store the full Hessian
matrix, by using a fast approximate method to calculate Hessian-vector
products. This combination of methods allows for the training of complicated
networks that are difficult to optimise using standard backpropagation
techniques. SkyNet employs convergence criteria that naturally prevent
overfitting, and also includes a fast algorithm for estimating the accuracy of
network outputs. The utility and flexibility of SkyNet are demonstrated by
application to a number of toy problems, and to astronomical problems focusing
on the recovery of structure from blurred and noisy images, the identification
of gamma-ray bursters, and the compression and denoising of galaxy images. The
SkyNet software, which is implemented in standard ANSI C and fully parallelised
using MPI, is available at http://www.mrao.cam.ac.uk/software/skynet/.
Comment: 19 pages, 21 figures, 7 tables; this version is a re-submission to
MNRAS in response to referee comments; software available at
http://www.mrao.cam.ac.uk/software/skynet
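The abstract notes that second-order information is used without forming or storing the Hessian, via fast approximate Hessian-vector products. SkyNet's exact scheme is not given here; a standard finite-difference approximation of Hv, which needs only two gradient evaluations, is sketched below as one common realisation of the idea.

```python
import numpy as np

def hessian_vector_product(grad_fn, theta, v, eps=1e-5):
    """Approximate H @ v by a central finite difference of gradients:
    Hv ~ (grad(theta + eps*v) - grad(theta - eps*v)) / (2*eps).
    The full Hessian is never evaluated or stored."""
    return (grad_fn(theta + eps * v) - grad_fn(theta - eps * v)) / (2 * eps)

# Toy check on f(theta) = 0.5 * theta^T A theta, whose Hessian is A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
grad = lambda th: A @ th
theta = np.array([0.5, -1.0])
v = np.array([1.0, 1.0])
print(hessian_vector_product(grad, theta, v))  # close to A @ v = [4., 3.]
```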
Model-based clustering via linear cluster-weighted models
A novel family of twelve mixture models with random covariates, nested in the
linear t cluster-weighted model (CWM), is introduced for model-based
clustering. The linear t CWM was recently presented as a robust alternative
to the better known linear Gaussian CWM. The proposed family of models provides
a unified framework that also includes the linear Gaussian CWM as a special
case. Maximum likelihood parameter estimation is carried out within the EM
framework, and both the BIC and the ICL are used for model selection. A simple
and effective hierarchical random initialization is also proposed for the EM
algorithm. The novel model-based clustering technique is illustrated in some
applications to real data. Finally, a simulation study for evaluating the
performance of the BIC and the ICL is presented.
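The BIC and ICL used for model selection above can be sketched generically. The -2 log L + k log n convention and the entropy-penalty form of the ICL are assumed here, and the responsibilities are made-up numbers for illustration; they are not from the paper.

```python
import numpy as np

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion (this convention: lower is better)."""
    return -2.0 * loglik + n_params * np.log(n_obs)

def icl(loglik, n_params, n_obs, resp):
    """ICL = BIC plus an entropy penalty on the soft cluster
    responsibilities `resp` (n_obs x K): fuzzier partitions are penalised."""
    r = np.clip(resp, 1e-12, 1.0)
    entropy = -np.sum(r * np.log(r))
    return bic(loglik, n_params, n_obs) + 2.0 * entropy

# Two candidate criteria for the same fitted model: pick the smaller value
# when comparing models.
resp = np.array([[0.9, 0.1], [0.2, 0.8], [0.95, 0.05]])
print(bic(-120.0, 8, 3))
print(icl(-120.0, 8, 3, resp))
```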
Optimal Spectrum Sensing Policy with Traffic Classification in RF-Powered CRNs
An orthogonal frequency division multiple access (OFDMA)-based primary user
(PU) network is considered, which provides different spectral access/energy
harvesting opportunities in RF-powered cognitive radio networks (CRNs). In this
scenario, we propose an optimal spectrum sensing policy for opportunistic
spectrum access/energy harvesting under both the PU collision and energy
causality constraints. PU subchannels can have different traffic patterns and
exhibit distinct idle/busy frequencies, due to which the spectral access/energy
harvesting opportunities are application specific. The secondary user (SU)
collects
traffic pattern information through observation of the PU subchannels and
classifies the idle/busy period statistics for each subchannel. Based on the
statistics, we invoke stochastic models for evaluating SU capacity by which the
energy detection threshold for spectrum sensing can be adjusted with higher
sensing accuracy. To this end, we employ a Markov decision process (MDP)
model obtained by quantizing the SU battery level, and a duty cycle model
obtained from the ratio of the average energy harvesting and consumption
rates.
We demonstrate the effectiveness of the proposed stochastic models through
comparison with the optimal one obtained from an exhaustive method.
Comment: 14 pages, 12 figures
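The duty-cycle model mentioned above is described only as a ratio of average harvested energy to energy consumption; its full form is not in the abstract. A minimal sketch of that idea, under the assumption that long-run energy causality (active time x consumption <= harvested energy) caps the SU's active fraction, is:

```python
def duty_cycle(avg_harvest_rate, consume_rate):
    """Fraction of time the SU can stay active if, on average, spent energy
    must not exceed harvested energy (energy causality constraint)."""
    if consume_rate <= 0:
        raise ValueError("consumption rate must be positive")
    return min(1.0, avg_harvest_rate / consume_rate)

# e.g. harvesting 2 mW on average while sensing/transmitting costs 10 mW
print(duty_cycle(2.0, 10.0))  # 0.2
```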
On the Consistency of Graph-based Bayesian Learning and the Scalability of Sampling Algorithms
A popular approach to semi-supervised learning proceeds by endowing the input
data with a graph structure in order to extract geometric information and
incorporate it into a Bayesian framework. We introduce new theory that gives
appropriate scalings of graph parameters that provably lead to a well-defined
limiting posterior as the size of the unlabeled data set grows. Furthermore, we
show that these consistency results have profound algorithmic implications.
When consistency holds, carefully designed graph-based Markov chain Monte Carlo
algorithms are proved to have a uniform spectral gap, independent of the number
of unlabeled inputs. Several numerical experiments corroborate both the
statistical consistency and the algorithmic scalability established by the
theory.
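The graph structure endowed on the input data is typically summarised by a graph Laplacian; the paper's results concern how its parameters (e.g. the connectivity radius) must scale with the data size. As a generic illustration only, an unnormalised Laplacian for a proximity graph can be built as follows:

```python
import numpy as np

def graph_laplacian(X, eps):
    """Unnormalised graph Laplacian L = D - W for a proximity graph:
    connect points within distance eps with unit weight."""
    n = len(X)
    d2 = np.sum((X[:, None] - X[None]) ** 2, axis=2)
    W = ((d2 <= eps ** 2) & ~np.eye(n, dtype=bool)).astype(float)
    D = np.diag(W.sum(axis=1))
    return D - W

X = np.array([[0.0], [0.1], [0.2], [5.0]])
L = graph_laplacian(X, eps=0.15)
print(L.sum())               # every row of a Laplacian sums to zero
print(np.allclose(L, L.T))   # and it is symmetric
```

In the consistency theory, eps would shrink at a prescribed rate as the number of unlabeled points grows; the fixed value here is purely illustrative.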
Mind the nuisance: Gaussian process classification using privileged noise
The learning with privileged information setting has recently attracted a lot
of attention within the machine learning community, as it allows the
integration of additional knowledge into the training process of a classifier,
even when this comes in the form of a data modality that is not available at
test time. Here, we show that privileged information can naturally be treated
as noise in the latent function of a Gaussian process classifier (GPC). That
is, in contrast to the standard GPC setting, the latent function is not just a
nuisance but a feature: it becomes a natural measure of confidence about the
training data by modulating the slope of the GPC probit likelihood function.
Extensive experiments on public datasets show that the proposed GPC method
using privileged noise, called GPC+, improves over a standard GPC without
privileged knowledge, and also over the current state-of-the-art SVM-based
method, SVM+. Moreover, we show that advanced neural networks and deep
learning methods can be compressed as privileged information.
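The key mechanism above, privileged noise modulating the slope of the probit likelihood, can be sketched in isolation: a per-example noise scale flattens or steepens Phi(f / sigma). This is only the likelihood term, with made-up values; the full GPC+ inference is not reproduced here.

```python
import math

def probit_likelihood(f, noise):
    """Probit likelihood Phi(f / sigma): a larger per-example noise scale
    flattens the slope, i.e. expresses less confidence in that example."""
    return 0.5 * (1.0 + math.erf(f / (noise * math.sqrt(2.0))))

f = 1.0
print(probit_likelihood(f, noise=0.5))  # steep slope: confident example
print(probit_likelihood(f, noise=3.0))  # flat slope: uncertain example
```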
Machine learning in acoustics: theory and applications
Acoustic data provide scientific and engineering insights in fields ranging
from biology and communications to ocean and Earth science. We survey the
recent advances and transformative potential of machine learning (ML),
including deep learning, in the field of acoustics. ML is a broad family of
techniques, which are often based in statistics, for automatically detecting
and utilizing patterns in data. Relative to conventional acoustics and signal
processing, ML is data-driven. Given sufficient training data, ML can discover
complex relationships between features and desired labels or actions, or
between features themselves. With large volumes of training data, ML can
discover models describing complex acoustic phenomena such as human speech and
reverberation. ML in acoustics is rapidly developing with compelling results
and significant future promise. We first introduce ML, then highlight ML
developments in four acoustics research areas: source localization in speech
processing, source localization in ocean acoustics, bioacoustics, and
environmental sounds in everyday scenes.
Comment: Published with free access in Journal of the Acoustical Society of
America, 27 Nov. 201
A bi-partite generative model framework for analyzing and simulating large scale multiple discrete-continuous travel behaviour data
The emergence of data-driven demand analysis has led to the increased use of
generative modelling to learn the probabilistic dependencies between random
variables. Although their use has mostly been limited to image recognition
and classification in recent years, generative machine learning algorithms
can be a powerful tool for travel behaviour research, as they can replicate
travel behaviour from the underlying properties of data structures. In this
paper, we examine the use of a generative machine learning approach for
analyzing
multiple discrete-continuous (MDC) travel behaviour data. We provide a
plausible perspective of how we can exploit the use of machine learning
techniques to interpret the underlying heterogeneities in the data. We show
that generative models are conceptually similar to the choice selection
behaviour process through information entropy and variational Bayesian
inference. Without loss of generality, we consider a restricted Boltzmann
machine (RBM) based algorithm with multiple discrete-continuous layers,
formulated as a variational Bayesian inference optimization problem. We
systematically describe the proposed machine learning algorithm and develop a
process of analyzing travel behaviour data from a generative learning
perspective. We show parameter stability from model analysis and simulation
tests on an open dataset with multiple discrete-continuous dimensions from a
data size of 293,330 observations. For interpretability, we derive the
conditional probabilities and elasticities, and perform statistical analysis
on the latent variables. We show that our model can generate statistically
similar
data distributions for travel forecasting and prediction and performs better
than purely discriminative methods in validation. Our results indicate that
latent constructs in generative models can accurately represent the joint
distribution consistently on MDC data.
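The paper's RBM has multiple discrete-continuous layers and is fit by variational Bayesian inference; those details are beyond the abstract. As a reference point only, one contrastive-divergence (CD-1) update for a plain binary RBM, the standard building block the paper generalises, looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM with
    visible bias b, hidden bias c, and weights W (n_vis x n_hid)."""
    ph0 = sigmoid(v0 @ W + c)                      # hidden probabilities
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ W.T + b)                    # reconstructed visibles
    ph1 = sigmoid(pv1 @ W + c)
    W += lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
    b += lr * (v0 - pv1)
    c += lr * (ph0 - ph1)
    return W, b, c

n_vis, n_hid = 6, 3
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b, c = np.zeros(n_vis), np.zeros(n_hid)
v = np.array([1.0, 1, 0, 0, 1, 0])  # one illustrative binary observation
W, b, c = cd1_step(v, W, b, c)
print(W.shape, b.shape, c.shape)
```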
Parsimonious Topic Models with Salient Word Discovery
We propose a parsimonious topic model for text corpora. In related models
such as Latent Dirichlet Allocation (LDA), all words are modeled
topic-specifically, even though many words occur with similar frequencies
across different topics. Our modeling determines salient words for each topic,
which have topic-specific probabilities, with the rest explained by a universal
shared model. Further, in LDA all topics are in principle present in every
document. By contrast our model gives sparse topic representation, determining
the (small) subset of relevant topics for each document. We derive a Bayesian
Information Criterion (BIC), balancing model complexity and goodness of fit.
Here, interestingly, we identify an effective sample size and corresponding
penalty specific to each parameter type in our model. We minimize BIC to
jointly determine our entire model -- the topic-specific words,
document-specific topics, all model parameter values, {\it and} the total
number of topics -- in a wholly unsupervised fashion. Results on three text
corpora and an image dataset show that our model achieves higher test set
likelihood and better agreement with ground-truth class labels, compared to LDA
and to a model designed to incorporate sparsity.
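The paper selects salient words by minimizing its BIC; that criterion is not reproduced here. Purely to illustrate the salient-versus-shared distinction, the hypothetical rule below flags words whose in-topic frequency deviates markedly from a universal background model, leaving the rest to be explained by the shared model.

```python
import numpy as np

def salient_words(topic_counts, background_probs, threshold=2.0):
    """Hypothetical illustration: flag words whose in-topic frequency
    differs from the shared background model by more than `threshold`x;
    only these would receive topic-specific probabilities."""
    topic_probs = topic_counts / topic_counts.sum()
    ratio = topic_probs / background_probs
    return np.where((ratio > threshold) | (ratio < 1.0 / threshold))[0]

counts = np.array([40.0, 5.0, 5.0, 50.0])        # word counts in one topic
background = np.array([0.25, 0.25, 0.25, 0.25])  # universal shared model
print(salient_words(counts, background))
```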
Application of Bayes' theorem for pulse shape discrimination
A Bayesian approach is proposed for pulse shape discrimination of photons and
neutrons in liquid organic scintillators. Instead of drawing a decision
boundary, each pulse is assigned a photon or neutron confidence probability.
This allows for photon and neutron classification on an event-by-event basis.
The sum of those confidence probabilities is used to estimate the number of
photon and neutron instances in the data. An iterative scheme, similar to an
expectation-maximization algorithm for Gaussian mixtures, is used to infer the
ratio of photons-to-neutrons in each measurement. Therefore, the probability
space adapts to data with varying photon-to-neutron ratios. A time-correlated
measurement of Am-Be and separate measurements of Cs, Co and
Th photon sources were used to construct libraries of neutrons and
photons. These libraries were then used to produce synthetic data sets with
varying ratios of photons-to-neutrons. The probability-weighted method that we
implemented was found to maintain a neutron acceptance rate of up to 90% at
photon-to-neutron ratios up to 2000, and performed 9% better than the
decision-boundary approach. Furthermore, the iterative approach appropriately
changed the
probability space with an increasing number of photons, which kept the neutron
population estimate from increasing unrealistically.
Comment: 8 pages, 9 figures
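The iterative scheme above, similar to EM for a mixture, can be sketched minimally: given each pulse's likelihood under the photon and neutron libraries, alternate between computing per-pulse photon confidences and re-estimating the photon fraction. The likelihood values below are made up; the sum of the confidences estimates the photon count, as in the abstract.

```python
import numpy as np

def em_mixing_ratio(lik_photon, lik_neutron, n_iter=100):
    """EM-style iteration for a two-component mixture: alternately compute
    per-pulse photon confidences (E-step) and the photon fraction (M-step)."""
    pi = 0.5  # initial photon fraction
    for _ in range(n_iter):
        p_photon = pi * lik_photon / (pi * lik_photon + (1 - pi) * lik_neutron)
        pi = p_photon.mean()
    return pi, p_photon

# hypothetical per-pulse likelihoods under the photon / neutron libraries
lik_p = np.array([0.9, 0.8, 0.1, 0.05])
lik_n = np.array([0.1, 0.2, 0.9, 0.95])
pi, conf = em_mixing_ratio(lik_p, lik_n)
print(round(pi, 2))            # estimated photon fraction
print(conf.sum())              # estimated number of photons in the data
```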