AutoEncoder by Forest
Auto-encoding is an important task which is typically realized by deep neural
networks (DNNs) such as convolutional neural networks (CNNs). In this paper, we
propose EncoderForest (abbrv. eForest), the first tree ensemble based
auto-encoder. We present a procedure for enabling forests to do backward
reconstruction by utilizing the equivalent classes defined by decision paths of
the trees, and demonstrate its usage in both supervised and unsupervised
settings. Experiments show that, compared with DNN autoencoders, eForest is able
to obtain lower reconstruction error with fast training speed, while the model
itself is reusable and damage-tolerable.
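The backward reconstruction described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the path encoding `(feature, threshold, goes_left)`, the function names, and the choice of the interval midpoint as the reconstructed value are all assumptions made for illustration. The idea shown is that each tree's decision path constrains some features, and intersecting the constraints from all trees narrows down where the original input could have been.

```python
import math

def intersect_paths(paths, n_features):
    """Intersect the constraints of several decision paths.

    Each path is a list of (feature, threshold, goes_left) tuples
    (a hypothetical encoding): goes_left means x[feature] <= threshold,
    otherwise x[feature] > threshold.
    Returns one (low, high) interval per feature.
    """
    bounds = [[-math.inf, math.inf] for _ in range(n_features)]
    for path in paths:
        for feature, threshold, goes_left in path:
            if goes_left:
                bounds[feature][1] = min(bounds[feature][1], threshold)
            else:
                bounds[feature][0] = max(bounds[feature][0], threshold)
    return [tuple(b) for b in bounds]

def reconstruct(rule, feature_ranges):
    """Pick one point inside the rule (assumption: interval midpoint).

    feature_ranges holds the (min, max) of each feature, e.g. as seen in
    training data, and is used to clip unbounded intervals.
    """
    point = []
    for (low, high), (fmin, fmax) in zip(rule, feature_ranges):
        low, high = max(low, fmin), min(high, fmax)
        point.append((low + high) / 2.0)
    return point

# Two trees, two features: tree 1 says x0 <= 5 and x1 > 2, tree 2 says x0 > 3.
paths = [[(0, 5.0, True), (1, 2.0, False)], [(0, 3.0, False)]]
rule = intersect_paths(paths, n_features=2)
estimate = reconstruct(rule, feature_ranges=[(0.0, 10.0), (0.0, 4.0)])
```

With more trees, the intersected intervals generally become tighter, which is consistent with the low reconstruction error reported in the abstract.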
A recurrent neural network for classification of unevenly sampled variable stars
Astronomical surveys of celestial sources produce streams of noisy time
series measuring flux versus time ("light curves"). Unlike in many other
physical domains, however, large (and source-specific) temporal gaps in data
arise naturally due to intranight cadence choices as well as diurnal and
seasonal constraints. With nightly observations of millions of variable stars
and transients from upcoming surveys, efficient and accurate discovery and
classification techniques on noisy, irregularly sampled data must be employed
with minimal human-in-the-loop involvement. Machine learning for inference
tasks on such data traditionally requires the laborious hand-coding of
domain-specific numerical summaries of raw data ("features"). Here we present a
novel unsupervised autoencoding recurrent neural network (RNN) that makes
explicit use of sampling times and known heteroskedastic noise properties. When
trained on optical variable star catalogs, this network produces supervised
classification models that rival other best-in-class approaches. We find that
autoencoded features learned on one time-domain survey perform nearly as well
when applied to another survey. These networks can continue to learn from new
unlabeled observations and may be used in other unsupervised tasks such as
forecasting and anomaly detection.
Comment: 23 pages, 14 figures. The published version is at Nature Astronomy
(https://www.nature.com/articles/s41550-017-0321-z). Source code for models,
experiments, and figures at
https://github.com/bnaul/IrregularTimeSeriesAutoencoderPaper (Zenodo Code
DOI: 10.5281/zenodo.1045560)
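As a rough illustration of what "explicit use of sampling times and known heteroskedastic noise properties" could look like in practice, the sketch below prepares an irregularly sampled light curve as (time gap, value) pairs and scores a reconstruction with an inverse-variance weighted error. Both function names and the exact weighting scheme are assumptions for illustration, not taken from the paper's code; the authors' repository linked above is the authoritative source.

```python
def to_network_inputs(times, values):
    """Pair each measurement with the time elapsed since the previous one.

    The first gap is set to 0.0 (an assumption; other conventions exist).
    """
    gaps = [0.0] + [t1 - t0 for t0, t1 in zip(times, times[1:])]
    return list(zip(gaps, values))

def weighted_mse(values, reconstructed, sigmas):
    """Mean squared error with each point weighted by 1 / sigma^2.

    Noisier measurements (larger sigma) contribute less to the loss.
    """
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(w * (v - r) ** 2
                for w, v, r in zip(weights, values, reconstructed))
    return total / sum(weights)

# A toy light curve with uneven sampling and per-point uncertainties.
inputs = to_network_inputs([0.0, 1.5, 4.0], [10.0, 11.0, 12.0])
loss = weighted_mse([1.0, 2.0], [1.0, 4.0], [1.0, 2.0])
```

Feeding time gaps rather than absolute times keeps the inputs independent of the survey's start date, which is one reason this encoding is a common choice for irregular time series.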
Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features
One-class support vector machine (OC-SVM) has long been one of the most
effective anomaly detection methods and is extensively adopted in both
research and industrial applications. The biggest remaining issue for OC-SVM is
its limited capability to operate on large and high-dimensional datasets due to
optimization complexity. These problems might be mitigated via dimensionality
reduction techniques such as manifold learning or autoencoders. However,
previous work often treats representation learning and anomaly prediction
separately. In this paper, we propose an autoencoder-based one-class support
vector machine (AE-1SVM) that brings OC-SVM, with the aid of random Fourier
features to approximate the radial basis kernel, into the deep learning context
by combining it with a representation learning architecture and jointly
exploiting stochastic gradient descent to obtain end-to-end training.
Interestingly, this also opens up the possible use of gradient-based
attribution methods to explain the decision making for anomaly detection, which
has long been challenging as a result of the implicit mappings between the
input space and the kernel space.
To the best of our knowledge, this is the first work to study the
interpretability of deep learning in anomaly detection. We evaluate our method
on a wide range of unsupervised anomaly detection tasks in which our end-to-end
training architecture achieves a performance significantly better than the
previous work using separate training.
Comment: Accepted at European Conference on Machine Learning and Principles
and Practice of Knowledge Discovery in Databases (ECML-PKDD) 201
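The random Fourier features mentioned in this abstract replace the implicit radial basis kernel with an explicit finite-dimensional map, which is what makes gradient-based training possible. The sketch below shows the standard construction for the kernel exp(-gamma * ||x - y||^2) using only the Python standard library. It is a generic illustration, not the AE-1SVM code: the function names, the `gamma` parameterization, and the feature count are assumptions.

```python
import math
import random

def make_rff_map(dim, n_features, gamma, seed=0):
    """Build z(x) such that dot(z(x), z(y)) approximates
    exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 2 * gamma), the Fourier transform
    # of this RBF kernel; phases are uniform on [0, 2 * pi).
    std = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(dim)]
               for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def feature_map(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, phases)]

    return feature_map

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rbf_kernel(x, y, gamma):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

# Compare the explicit-feature inner product with the exact kernel value.
z = make_rff_map(dim=3, n_features=4000, gamma=0.5, seed=0)
x, y = [0.1, -0.2, 0.3], [0.4, 0.0, -0.1]
approx = dot(z(x), z(y))
exact = rbf_kernel(x, y, 0.5)
```

Because `feature_map` is an explicit, differentiable function of its input, a linear one-class objective on top of it can be trained with stochastic gradient descent, and gradients can be propagated back to the input for attribution. The approximation error shrinks roughly with the square root of `n_features`.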