New distance measures for classifying X-ray astronomy data into stellar classes
The classification of X-ray sources into classes (such as extragalactic
sources, background stars, etc.) is an essential task in astronomy. Typically,
one of the classes corresponds to extragalactic radiation, whose photon
emission behaviour is well characterized by a homogeneous Poisson process. We
propose to use normalized versions of the Wasserstein and Zolotarev distances
to quantify the deviation of the distribution of photon interarrival times from
the exponential class. Our main motivation is the analysis of a massive dataset
from X-ray astronomy obtained by the Chandra Orion Ultradeep Project (COUP).
This project yielded a large catalog of 1616 X-ray cosmic sources in the Orion
Nebula region, with their series of photon arrival times and associated
energies. We consider the plug-in estimators of these metrics, determine their
asymptotic distributions, and illustrate their finite-sample performance with a
Monte Carlo study. We estimate these metrics for each COUP source from three
different classes. We conclude that our proposal provides a striking amount of
information on the nature of the photon-emitting sources. Moreover, these
metrics can identify X-ray sources that were previously miscatalogued. Notably,
we show that some sources previously classified as extragalactic emissions are
far more likely to be young stars in the Orion Nebula. Comment: 29 pages
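The core idea of the abstract can be illustrated as follows: under a homogeneous Poisson process, photon interarrival times are exponential, so a normalized Wasserstein distance between the observed gaps and a fitted exponential flags non-Poisson (stellar) behaviour. This is a minimal sketch with made-up data, not the paper's plug-in estimators; the function name and the flaring toy model are illustrative assumptions.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

def exp_deviation(arrival_times, n_ref=10_000):
    """Scale-free 1-Wasserstein distance between observed photon interarrival
    times and an exponential law with the same mean (illustrative only; the
    paper works with normalized plug-in estimators and their asymptotics)."""
    gaps = np.diff(np.sort(arrival_times))
    gaps = gaps[gaps > 0]
    # Reference sample from the fitted exponential (homogeneous Poisson).
    ref = rng.exponential(scale=gaps.mean(), size=n_ref)
    # Dividing by the mean makes the statistic independent of the count rate.
    return wasserstein_distance(gaps / gaps.mean(), ref / ref.mean())

# A homogeneous Poisson source scores near zero, while a "flaring" source
# with bursty interarrival gaps deviates markedly from the exponential class.
poisson_arrivals = np.cumsum(rng.exponential(1.0, size=2000))
flaring_arrivals = np.cumsum(rng.choice([0.05, 5.0], size=2000))
```

Sources whose statistic exceeds that of the Poisson baseline are candidates for reclassification, in the spirit of the COUP analysis above.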
A Bi-level Nonlinear Eigenvector Algorithm for Wasserstein Discriminant Analysis
Much like classical Fisher linear discriminant analysis, Wasserstein
discriminant analysis (WDA) is a supervised linear dimensionality reduction
method that seeks a projection matrix maximizing the dispersion between
different data classes while minimizing the dispersion within each class.
Unlike Fisher's method, however, WDA can account for both global and local
inter-connections between data classes through a regularized Wasserstein
distance. WDA is formulated as a
bi-level nonlinear trace ratio optimization. In this paper, we present a
bi-level nonlinear eigenvector (NEPv) algorithm, called WDA-nepv. The inner
kernel of WDA-nepv for computing the optimal transport matrix of the
regularized Wasserstein distance is formulated as an NEPv, and meanwhile the
outer kernel for the trace ratio optimization is also formulated as another
NEPv. Consequently, both kernels can be computed efficiently via
self-consistent-field iterations and modern solvers for linear eigenvalue
problems. Compared with existing algorithms for WDA, WDA-nepv is
derivative-free and surrogate-model-free. The computational efficiency and
classification accuracy of WDA-nepv are demonstrated on synthetic and
real-life datasets.
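The inner problem named above, computing the optimal transport matrix of the regularized Wasserstein distance, is solved in the paper as an NEPv; the same entropic transport plan is classically obtained by Sinkhorn's self-consistent scaling iterations, sketched below. The function name and toy data are illustrative, not the paper's implementation.

```python
import numpy as np

def sinkhorn_transport(C, a, b, reg=0.5, n_iter=1000):
    """Entropy-regularized optimal transport plan between histograms a and b
    under cost matrix C, via classical Sinkhorn scaling iterations.
    (WDA-nepv instead solves this inner problem as an NEPv.)"""
    K = np.exp(-C / reg)                  # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                 # rescale columns toward marginal b
        u = a / (K @ v)                   # rescale rows toward marginal a
    return u[:, None] * K * v[None, :]    # plan T = diag(u) K diag(v)

# Toy example: transport between uniform histograms on two point clouds.
rng = np.random.default_rng(1)
X, Y = rng.normal(size=(4, 2)), rng.normal(size=(5, 2))
C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
a, b = np.full(4, 0.25), np.full(5, 0.2)
T = sinkhorn_transport(C, a, b)           # marginals of T match a and b
```

Each pass alternately enforces the row and column marginals, which is exactly the self-consistent-field flavour of iteration the abstract refers to.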
A Stable Multi-Scale Kernel for Topological Machine Learning
Topological data analysis offers a rich source of valuable information to
study vision problems. Yet, so far we lack a theoretically sound connection to
popular kernel-based learning techniques, such as kernel SVMs or kernel PCA. In
this work, we establish such a connection by designing a multi-scale kernel for
persistence diagrams, a stable summary representation of topological features
in data. We show that this kernel is positive definite and prove its stability
with respect to the 1-Wasserstein distance. Experiments on two benchmark
datasets for 3D shape classification/retrieval and texture recognition show
considerable performance gains of the proposed method over an alternative
approach based on the recently introduced persistence landscapes.
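A minimal sketch of a multi-scale kernel of this kind on persistence diagrams, assuming the closed form in which each diagram point is compared against the other diagram's points and their mirror images across the diagonal; the function name and example diagrams are illustrative.

```python
import numpy as np

def pssk(F, G, sigma=1.0):
    """Multi-scale (scale-space) kernel between persistence diagrams F and G,
    each an array of (birth, death) points.  Subtracting the term for points
    mirrored across the diagonal makes the kernel vanish on the diagonal,
    which is what underlies stability w.r.t. the 1-Wasserstein distance."""
    F = np.atleast_2d(np.asarray(F, float))
    G = np.atleast_2d(np.asarray(G, float))
    G_mirror = G[:, ::-1]                  # swap (birth, death) -> (death, birth)
    d  = ((F[:, None, :] - G[None, :, :]) ** 2).sum(axis=-1)
    dm = ((F[:, None, :] - G_mirror[None, :, :]) ** 2).sum(axis=-1)
    s = 8.0 * sigma
    return (np.exp(-d / s) - np.exp(-dm / s)).sum() / (np.pi * s)

D1 = np.array([[0.0, 1.0], [0.2, 0.9]])    # two topological features
D2 = np.array([[0.0, 1.1]])
k11, k12 = pssk(D1, D1), pssk(D1, D2)      # Gram entries for a kernel SVM / PCA
```

Because the kernel is symmetric and positive definite, the resulting Gram matrix can be plugged directly into any kernel method, which is the connection to kernel SVMs and kernel PCA that the abstract establishes.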