
    New distance measures for classifying X-ray astronomy data into stellar classes

    The classification of X-ray sources into classes (such as extragalactic sources, background stars, ...) is an essential task in astronomy. Typically, one of the classes corresponds to extragalactic radiation, whose photon emission behaviour is well characterized by a homogeneous Poisson process. We propose to use normalized versions of the Wasserstein and Zolotarev distances to quantify the deviation of the distribution of photon interarrival times from the exponential class. Our main motivation is the analysis of a massive dataset from X-ray astronomy obtained by the Chandra Orion Ultradeep Project (COUP). This project yielded a large catalog of 1616 X-ray cosmic sources in the Orion Nebula region, with their series of photon arrival times and associated energies. We consider the plug-in estimators of these metrics, determine their asymptotic distributions, and illustrate their finite-sample performance with a Monte Carlo study. We estimate these metrics for each COUP source from three different classes. We conclude that our proposal provides a striking amount of information on the nature of the photon-emitting sources. Furthermore, these variables can identify X-ray sources that were previously miscatalogued. In particular, we show that some sources, previously classified as extragalactic emissions, have a much higher probability of being young stars in the Orion Nebula.
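    The core quantity here, the deviation of photon interarrival times from the exponential class, can be sketched with a plain empirical Wasserstein-1 distance. This is only an illustration: the paper uses specific normalized plug-in estimators with derived asymptotics, and also a Zolotarev variant, neither of which is reproduced here; `exp_deviation` is a hypothetical helper name.

```python
import numpy as np
from scipy.stats import expon, wasserstein_distance

def exp_deviation(arrival_times, ref_size=100_000, seed=0):
    """Empirical Wasserstein-1 distance between mean-scaled interarrival
    times and the unit-rate exponential law.

    For a homogeneous Poisson process, interarrivals are i.i.d.
    exponential, so dividing by the sample mean maps them near Exp(1)
    and the distance should be small.
    """
    gaps = np.diff(np.sort(np.asarray(arrival_times, dtype=float)))
    scaled = gaps / gaps.mean()
    rng = np.random.default_rng(seed)
    ref = expon.rvs(size=ref_size, random_state=rng)
    return wasserstein_distance(scaled, ref)

rng = np.random.default_rng(1)
# Homogeneous Poisson process: cumulative sums of exponential gaps.
poisson_arrivals = np.cumsum(rng.exponential(scale=2.0, size=5000))
# A clearly non-Poisson process: nearly regular "ticking" arrivals.
regular_arrivals = np.cumsum(1.0 + 0.01 * rng.standard_normal(5000))

d_poisson = exp_deviation(poisson_arrivals)
d_regular = exp_deviation(regular_arrivals)
print(d_poisson, d_regular)  # d_poisson should be much smaller
```

For near-regular arrivals the scaled gaps concentrate at 1, and the W1 distance to Exp(1) approaches roughly 2/e ≈ 0.74, far above the sampling noise of the Poisson case.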

    A Bi-level Nonlinear Eigenvector Algorithm for Wasserstein Discriminant Analysis

    Much like the classical Fisher linear discriminant analysis (LDA), Wasserstein discriminant analysis (WDA) is a supervised linear dimensionality reduction method that seeks a projection matrix to maximize the dispersion between different data classes and minimize the dispersion within the same class. In contrast to LDA, however, WDA can account for both global and local inter-connections between data classes using a regularized Wasserstein distance. WDA is formulated as a bi-level nonlinear trace ratio optimization. In this paper, we present a bi-level nonlinear eigenvector (NEPv) algorithm, called WDA-nepv. The inner kernel of WDA-nepv, which computes the optimal transport matrix of the regularized Wasserstein distance, is formulated as an NEPv, and the outer kernel for the trace ratio optimization is formulated as another NEPv. Consequently, both kernels can be computed efficiently via self-consistent-field iterations and modern solvers for linear eigenvalue problems. Compared with existing algorithms for WDA, WDA-nepv is derivative-free and surrogate-model-free. The computational efficiency and classification accuracy of WDA-nepv are demonstrated on synthetic and real-life datasets.
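    The inner problem in WDA, computing the optimal transport matrix of the regularized Wasserstein distance between projected class samples, is classically solved by Sinkhorn fixed-point iterations. A minimal sketch of that classical formulation follows (the WDA-nepv paper instead recasts this inner problem as an NEPv solved via SCF; `sinkhorn_plan` and its parameters are illustrative, not the paper's API):

```python
import numpy as np

def sinkhorn_plan(X, Y, reg=0.1, n_iter=500):
    """Entropy-regularized optimal transport plan between two point
    clouds, via classical Sinkhorn matrix-scaling iterations.
    """
    # Pairwise squared Euclidean costs between the two samples.
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    K = np.exp(-C / reg)          # Gibbs kernel
    a = np.full(len(X), 1.0 / len(X))  # uniform source marginal
    b = np.full(len(Y), 1.0 / len(Y))  # uniform target marginal
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)         # match column marginal
        u = a / (K @ v)           # match row marginal
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))
Y = rng.standard_normal((7, 2))
T = sinkhorn_plan(X, Y)
print(T.sum())        # ≈ 1: T is a joint distribution
print(T.sum(axis=1))  # ≈ uniform marginal over the 5 source points
```

Since the final update rescales `u`, the row marginals of the returned plan match the source distribution to machine precision; the column marginals converge as the iterations proceed.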

    A Stable Multi-Scale Kernel for Topological Machine Learning

    Topological data analysis offers a rich source of valuable information to study vision problems. Yet, so far we lack a theoretically sound connection to popular kernel-based learning techniques, such as kernel SVMs or kernel PCA. In this work, we establish such a connection by designing a multi-scale kernel for persistence diagrams, a stable summary representation of topological features in data. We show that this kernel is positive definite and prove its stability with respect to the 1-Wasserstein distance. Experiments on two benchmark datasets for 3D shape classification/retrieval and texture recognition show considerable performance gains of the proposed method compared to an alternative approach based on the recently introduced persistence landscapes.
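    The multi-scale kernel has a simple closed form: each pair of points from the two persistence diagrams contributes a Gaussian term at scale sigma, and the mirror image of each point across the diagonal contributes with a negative sign, which is what yields positive definiteness and stability. A minimal sketch, representing diagrams as (birth, death) arrays (`pssk` is an illustrative name, not a library function):

```python
import numpy as np

def pssk(F, G, sigma=1.0):
    """Persistence scale-space kernel between two persistence diagrams.

    Each diagram is an (n, 2) array of (birth, death) points. Points of
    G mirrored across the diagonal contribute negatively-weighted
    Gaussian terms, enforcing the diagram boundary condition.
    """
    F, G = np.atleast_2d(F), np.atleast_2d(G)
    Gbar = G[:, ::-1]  # mirror (birth, death) -> (death, birth)
    d_pos = ((F[:, None, :] - G[None, :, :]) ** 2).sum(-1)
    d_neg = ((F[:, None, :] - Gbar[None, :, :]) ** 2).sum(-1)
    s = np.exp(-d_pos / (8 * sigma)) - np.exp(-d_neg / (8 * sigma))
    return s.sum() / (8 * np.pi * sigma)

D1 = np.array([[0.0, 1.0], [0.2, 0.9]])
D2 = np.array([[0.0, 1.1]])
print(pssk(D1, D2))  # similarity between the two diagrams
print(pssk(D1, D1))  # self-similarity; positive for off-diagonal points
```

The Gram matrix this kernel produces can be fed directly to kernel SVMs or kernel PCA, which is precisely the connection the abstract describes.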