Multivariate Approaches to Classification in Extragalactic Astronomy
Clustering objects into synthetic groups is a natural activity of any
science. Astrophysics is not an exception and is now facing a deluge of data.
For galaxies, the century-old Hubble classification and the Hubble tuning
fork are still largely in use, together with numerous mono- or bivariate
classifications most often made by eye. However, a classification must be
driven by the data, and sophisticated multivariate statistical tools are used
more and more often. In this paper we review these different approaches in
order to situate them in the general context of unsupervised and supervised
learning. We emphasize the astrophysical outcomes of these studies to show that
multivariate analyses provide an obvious path toward a renewal of our
classification of galaxies and are invaluable tools to investigate the physics
and evolution of galaxies.
Comment: Open Access paper.
http://www.frontiersin.org/milky_way_and_galaxies/10.3389/fspas.2015.00003/abstract
DOI: 10.3389/fspas.2015.00003
Deep learning for time series classification: a review
Time Series Classification (TSC) is an important and challenging problem in
data mining. With the increase of time series data availability, hundreds of
TSC algorithms have been proposed. Among these methods, only a few have
considered Deep Neural Networks (DNNs) to perform this task. This is surprising
as deep learning has seen very successful applications in recent years. DNNs
have indeed revolutionized the field of computer vision, especially with the
advent of novel deeper architectures such as Residual and Convolutional Neural
Networks. Apart from images, sequential data such as text and audio can also be
processed with DNNs to reach state-of-the-art performance for document
classification and speech recognition. In this article, we study the current
state-of-the-art performance of deep learning algorithms for TSC by presenting
an empirical study of the most recent DNN architectures for TSC. We give an
overview of the most successful deep learning applications in various time
series domains under a unified taxonomy of DNNs for TSC. We also provide an
open source deep learning framework to the TSC community where we implemented
each of the compared approaches and evaluated them on a univariate TSC
benchmark (the UCR/UEA archive) and 12 multivariate time series datasets. By
training 8,730 deep learning models on 97 time series datasets, we propose the
most exhaustive study of DNNs for TSC to date.
Comment: Accepted at Data Mining and Knowledge Discover
The pseudotemporal bootstrap for predicting glaucoma from cross-sectional visual field data
Progressive loss of the field of vision is characteristic of a number of eye diseases such as glaucoma, a leading cause of irreversible blindness in the world. Recently, there has been an explosion in the amount of data being stored on patients who suffer from visual deterioration, including visual field (VF) tests, retinal images, and frequent intraocular pressure measurements. Like the progression of many biological and medical processes, VF progression is inherently temporal in nature. However, many datasets associated with the study of such processes are cross-sectional, and the time dimension is not measured because of the expense of such studies. In this paper, we address this issue by developing a method to build artificial time series, which we call pseudo time series, from cross-sectional data. This involves building trajectories through all of the data that can then, in turn, be used to build temporal models for forecasting (which would otherwise be impossible without longitudinal data). Glaucoma, like many diseases, is a family of conditions, and it is therefore likely that there will be a number of key trajectories that are important in understanding the disease. To deal with such situations, we extend the idea of pseudo time series by using resampling techniques to build multiple sequences prior to model building. This approach naturally handles outliers and multiple possible disease trajectories. We demonstrate some key properties of our approach on synthetic data and present very promising results on VF data for predicting glaucoma.
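The idea of resampling cross-sectional data and ordering each sample into a trajectory can be illustrated with a minimal sketch. It is not the authors' algorithm: the greedy nearest-neighbour ordering, the function names `build_pseudo_series` and `pseudotemporal_bootstrap`, the fixed starting point, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def build_pseudo_series(X, start_idx, sample_size, rng):
    """Order one bootstrap sample of rows of X into a pseudo time series.

    Assumption: the trajectory is built greedily, always stepping to the
    nearest (Euclidean) not-yet-visited sampled point. The abstract does
    not specify how trajectories are constructed.
    """
    # Bootstrap resample of row indices (with replacement).
    remaining = list(rng.choice(len(X), size=sample_size, replace=True))
    order = [start_idx]  # every series starts at the same reference point
    while remaining:
        last = X[order[-1]]
        dists = np.linalg.norm(X[remaining] - last, axis=1)
        order.append(int(remaining.pop(int(np.argmin(dists)))))
    return np.array(order)

def pseudotemporal_bootstrap(X, start_idx, n_series, sample_size, seed=0):
    """Build several pseudo time series, one per bootstrap sample."""
    rng = np.random.default_rng(seed)
    return [build_pseudo_series(X, start_idx, sample_size, rng)
            for _ in range(n_series)]

# Synthetic cross-sectional data: noisy points along a line, where row 0
# plays the role of a hypothetical "healthy" starting state.
data_rng = np.random.default_rng(42)
severity = np.sort(data_rng.uniform(0.0, 1.0, size=100))
X = np.column_stack([severity, 2.0 * severity]) + data_rng.normal(0, 0.02, (100, 2))

series = pseudotemporal_bootstrap(X, start_idx=0, n_series=5, sample_size=30, seed=1)
print(len(series), "pseudo time series of length", len(series[0]))
```

Each returned array is a sequence of row indices into `X`; a temporal model could then be fitted to the rows in that order. Building many sequences rather than one is what lets outliers and alternative trajectories show up in only some of the samples.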
Clustering Method for Time-Series Images Using Quantum-Inspired Computing Technology
Time-series clustering serves as a powerful data mining technique for
time-series data in the absence of prior knowledge about clusters. A large
amount of large time-series data has been acquired and used in
various research fields. Hence, a clustering method with low computational cost
is required. Given that a quantum-inspired computing technology, such as a
simulated annealing machine, surpasses conventional computers in solving
combinatorial optimization problems quickly and accurately, it holds promise
for accomplishing clustering tasks that are challenging to achieve using
existing methods. This study proposes a novel time-series clustering method
that leverages an annealing machine. The proposed method facilitates an even
classification of time-series data into clusters close to each other while
maintaining robustness against outliers. Moreover, its applicability extends to
time-series images. We compared the proposed method with a standard existing
method for clustering an online distributed dataset. In the existing method,
the distances between data points are calculated using the Euclidean distance
metric, and the clustering is performed using the k-means++ method. We found
that both methods yielded comparable results. Furthermore, the proposed method
was applied to a flow measurement image dataset containing noticeable noise
with a signal-to-noise ratio of approximately 1. Despite a small signal
variation of approximately 2%, the proposed method effectively classified the
data without any overlap among the clusters. In contrast, the clustering
results by the standard existing method and the conditional image sampling
(CIS) method, a specialized technique for flow measurement data, displayed
overlapping clusters. Consequently, the proposed method provides better results
than the other two methods, demonstrating its potential as a superior
clustering method.
Comment: 13 pages, 4 figure
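The baseline described in this abstract (Euclidean distances between series, clustered with k-means++) can be sketched as follows. This is an illustrative sketch only, not the paper's code and not the proposed annealing-based method: it assumes each time series is treated as a fixed-length vector, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def kmeans_pp_init(X, k, rng):
    """k-means++ seeding: each new center is drawn with probability
    proportional to its squared Euclidean distance to the nearest
    center chosen so far."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        diffs = X[:, None, :] - np.array(centers)[None, :, :]
        d2 = (diffs ** 2).sum(axis=2).min(axis=1)
        total = d2.sum()
        # Fall back to uniform sampling if all points coincide with a center.
        probs = d2 / total if total > 0 else np.full(len(X), 1.0 / len(X))
        centers.append(X[rng.choice(len(X), p=probs)])
    return np.array(centers)

def assign(X, centers):
    """Label each series with the index of its nearest center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=50, seed=0):
    """Lloyd's iterations started from k-means++ seeds."""
    rng = np.random.default_rng(seed)
    centers = kmeans_pp_init(X, k, rng)
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers

# Synthetic example: two groups of noisy sine waves with different offsets.
data_rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
group_a = np.sin(2 * np.pi * t) + data_rng.normal(0, 0.1, (20, 50))
group_b = 3.0 + np.sin(2 * np.pi * t) + data_rng.normal(0, 0.1, (20, 50))
X = np.vstack([group_a, group_b])

labels, centers = kmeans(X, k=2)
print(labels)
```

Treating each series as a flat vector means the Euclidean distance compares values time step by time step, which is simple but sensitive to noise in every sample.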