
    Multimapper: Data Density Sensitive Topological Visualization

    Mapper is an algorithm that summarizes the topological information contained in a dataset and provides an insightful visualization. It takes as input a point cloud, which is possibly high-dimensional, a filter function on it, and an open cover of the range of the function. It returns the nerve simplicial complex of the pullback of the cover. Mapper can be considered a discrete approximation of the topological construct called Reeb space, as analysed in the 1-dimensional case by [Carriere et al., 2018]. Despite its success in obtaining insights in various fields, such as in [Kamruzzaman et al., 2016], Mapper is an ad hoc technique requiring lots of parameter tuning. There is also no measure to quantify the goodness of the resulting visualization, which often deviates from the Reeb space in practice. In this paper, we introduce a new cover selection scheme for data that reduces the obscuration of topological information at both the computation and visualisation steps. To achieve this, we replace global scale selection of cover with a scale selection scheme sensitive to local density of data points. We also propose a method to detect some deviations in Mapper from Reeb space via computation of persistence features on the Mapper graph. Comment: Accepted at ICDM
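The pipeline this abstract describes (point cloud, filter function, cover, pullback, nerve) can be illustrated with a minimal sketch of standard Mapper using a fixed, global cover. It is not the density-sensitive cover selection proposed in the paper. The function names, the uniform overlapping intervals, the `eps` single-linkage clustering, and the restriction to the 1-skeleton of the nerve are all assumptions made for illustration.

```python
import numpy as np

def _single_linkage(points, idx, eps):
    """Connected components of the eps-neighbourhood graph on points[idx]."""
    remaining = set(idx.tolist())
    clusters = []
    while remaining:
        stack = [remaining.pop()]
        comp = []
        while stack:
            p = stack.pop()
            comp.append(p)
            near = [q for q in remaining
                    if np.linalg.norm(points[p] - points[q]) <= eps]
            for q in near:
                remaining.remove(q)
                stack.append(q)
        clusters.append(tuple(sorted(comp)))
    return clusters

def mapper_graph(points, filter_values, n_intervals=4, overlap=0.25, eps=0.5):
    """Return (nodes, edges): the 1-skeleton of the nerve of the pullback cover.

    nodes: list of tuples of point indices, one tuple per cluster.
    edges: set of (i, j) node-index pairs whose clusters share a point.
    """
    points = np.asarray(points, dtype=float)
    f = np.asarray(filter_values, dtype=float)
    lo, hi = f.min(), f.max()
    length = (hi - lo) / n_intervals
    pad = overlap * length  # each interval is widened by `pad` on both sides

    nodes = []
    for k in range(n_intervals):
        a = lo + k * length - pad
        b = lo + (k + 1) * length + pad
        idx = np.flatnonzero((f >= a) & (f <= b))  # pullback of one cover element
        nodes.extend(_single_linkage(points, idx, eps))

    edges = set()
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if set(nodes[i]) & set(nodes[j]):
                edges.add((i, j))
    return nodes, edges

# Hypothetical example: 40 points on a circle, filtered by the x-coordinate.
angles = 2 * np.pi * np.arange(40) / 40
circle = np.column_stack([np.cos(angles), np.sin(angles)])
nodes, edges = mapper_graph(circle, circle[:, 0], n_intervals=4, overlap=0.25, eps=0.3)
print(len(nodes), len(edges))
```

On this toy circle the graph has as many edges as nodes and is connected, so it contains a single loop, which matches the topology of the input. The interval count, overlap, and `eps` all had to be chosen by hand, which is the kind of global parameter tuning the abstract criticizes.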

    Data augmentation for galaxy density map reconstruction

    Matter density is key knowledge for today's cosmology, as many phenomena are linked to matter fluctuations. However, this density is not directly available; it is estimated through lensing maps or galaxy surveys. In this article, we focus on galaxy surveys, which are incomplete and noisy observations of the galaxy density: incomplete, as part of the sky is unobserved or unreliable, and noisy, as they are count maps degraded by Poisson noise. Using a data augmentation method, we propose a two-step method for recovering the density map, with one step for inferring the missing data and one for estimating the density. The results show that the missing areas are efficiently inferred and the statistical properties of the maps are very well preserved.

    Recursive kernel density estimators under missing data

    In this paper we propose an automatic bandwidth selection for recursive kernel density estimators with missing data, in the context of global and local density estimation. We show that, using the selected bandwidth and a special stepsize, the proposed recursive estimators outperform the nonrecursive one in terms of estimation error in the case of global estimation. Moreover, the recursive estimators are much better in terms of computational cost. We corroborate these theoretical results through simulation studies, on the simulated data of the Aquitaine cohort of HIV-1 infected patients, and on the Coriell cell lines using chromosome number 11. Comment: to appear in Communications in Statistics - Theory and Methods
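To make the idea of a recursive kernel density estimator concrete, here is a minimal sketch of the generic stochastic-approximation update, in which the estimate on a grid is refreshed once per observation instead of being recomputed from all past data. The stepsize `1/n`, the bandwidth `n ** -0.2`, the Gaussian kernel, and the treatment of missing values (simply skipping `NaN` entries) are illustrative assumptions. They are not the bandwidth selection, stepsize, or missing-data mechanism studied in the paper.

```python
import numpy as np

def recursive_kde(samples, grid, bandwidth_exponent=0.2):
    """Recursive Gaussian kernel density estimate evaluated on `grid`.

    Update rule: f_n = (1 - gamma_n) * f_{n-1} + gamma_n * K_h(grid - X_n),
    with gamma_n = 1/n and h_n = n ** (-bandwidth_exponent) (both assumed here).
    Missing observations (NaN) are skipped and do not advance n.
    """
    grid = np.asarray(grid, dtype=float)
    estimate = np.zeros_like(grid)
    n = 0
    for x in samples:
        if np.isnan(x):
            continue  # assumption: missing values are ignored
        n += 1
        gamma = 1.0 / n
        h = n ** (-bandwidth_exponent)
        kernel = np.exp(-0.5 * ((grid - x) / h) ** 2) / np.sqrt(2 * np.pi)
        estimate = (1 - gamma) * estimate + gamma * kernel / h
    return estimate

# Hypothetical example: standard normal data with every tenth value missing.
rng = np.random.default_rng(0)
data = rng.standard_normal(500)
data[::10] = np.nan
grid = np.linspace(-8.0, 8.0, 1601)
density = recursive_kde(data, grid)
mass = density.sum() * (grid[1] - grid[0])  # Riemann sum, should be close to 1
```

Each new observation costs one pass over the grid, independent of how many observations came before, which is one way to read the computational advantage the abstract reports.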

    Electron density retrieval from truncated Radio Occultation GNSS data

    This paper summarizes the definition and validation of two complementary new strategies to invert incomplete Global Navigation Satellite System Radio-Occultation (RO) ionospheric measurements, such as the ones to be provided by the future EUMETSAT Polar System Second Generation. It will provide RO measurements with impact parameters well below the Low Earth Orbiters' height (817 km): from approximately 500 km downward. The first method presented to invert truncated RO data, denoted Abel-VaryChap Hybrid modeling from topside Incomplete Global Navigation Satellite System RO data, is based on simple First Principles, very precise, and well suited for postprocessing. The second method, denoted Simple Estimation of Electron density profiles from topside Incomplete RO data, is less precise but yields very fast estimations, suitable for Near Real-Time determination. Both techniques are described and assessed with a set of 546 representative COSMIC/FORMOSAT-3 ROs, with relative errors of 7% and 11% and computational times of 20 min and 15 s per occultation on our Intel I7 PC for Abel-VaryChap Hybrid modeling from topside Incomplete Global Navigation Satellite System RO data and Simple Estimation of Electron density profiles from topside Incomplete RO data, respectively. Peer Reviewed. Postprint (published version)

    Optimal Bayes Classifiers for Functional Data and Density Ratios

    Bayes classifiers for functional data pose a challenge. This is because probability density functions do not exist for functional data. As a consequence, the classical Bayes classifier using density quotients needs to be modified. We propose to use density ratios of projections on a sequence of eigenfunctions that are common to the groups to be classified. The density ratios can then be factored into density ratios of individual functional principal components, whence the classification problem is reduced to a sequence of nonparametric one-dimensional density estimates. This is an extension to functional data of some of the very earliest nonparametric Bayes classifiers that were based on simple density ratios in the one-dimensional case. By means of the factorization of the density quotients, the curse of dimensionality that would otherwise severely affect Bayes classifiers for functional data can be avoided. We demonstrate that in the case of Gaussian functional data, the proposed functional Bayes classifier reduces to a functional version of the classical quadratic discriminant. A study of the asymptotic behavior of the proposed classifiers in the large sample limit shows that under certain conditions the misclassification rate converges to zero, a phenomenon that has been referred to as "perfect classification". The proposed classifiers also perform favorably in finite sample applications, as we demonstrate in comparisons with other functional classifiers in simulations and various data applications, including wine spectral data, functional magnetic resonance imaging (fMRI) data for attention deficit hyperactivity disorder (ADHD) patients, and yeast gene expression data.
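The factorization described in this abstract can be sketched as a sum of one-dimensional log density ratios, one per principal component score. The sketch below assumes the projections onto the common eigenfunctions have already been computed and are passed in as score matrices. The Gaussian kernel, the normal-reference bandwidth `1.06 * sd * n ** -0.2`, the sample-proportion priors, and all function names are illustrative assumptions, not the estimators used in the paper.

```python
import numpy as np

def kde_1d(train, query):
    """Gaussian kernel density estimate with a normal-reference bandwidth."""
    train = np.asarray(train, dtype=float)
    query = np.asarray(query, dtype=float)
    h = 1.06 * train.std(ddof=1) * len(train) ** (-0.2)
    z = (query[:, None] - train[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (len(train) * h * np.sqrt(2 * np.pi))

def log_density_ratio(scores0, scores1, new_scores, floor=1e-300):
    """Log prior ratio plus the sum over components j of log(f_j1 / f_j0).

    scores0, scores1: (n_k, J) arrays of training scores for classes 0 and 1.
    new_scores: (m, J) array of scores to classify; positive output favours class 1.
    """
    scores0 = np.asarray(scores0, dtype=float)
    scores1 = np.asarray(scores1, dtype=float)
    new_scores = np.asarray(new_scores, dtype=float)
    # Assumption: class priors are estimated by the training sample proportions.
    log_q = np.full(new_scores.shape[0], np.log(len(scores1) / len(scores0)))
    for j in range(new_scores.shape[1]):
        f1 = kde_1d(scores1[:, j], new_scores[:, j])
        f0 = kde_1d(scores0[:, j], new_scores[:, j])
        # `floor` guards against log(0) far from the training data.
        log_q += np.log(np.maximum(f1, floor)) - np.log(np.maximum(f0, floor))
    return log_q

# Hypothetical example: three independent Gaussian scores per curve.
rng = np.random.default_rng(1)
train0 = rng.normal(0.0, 1.0, size=(200, 3))
train1 = rng.normal(2.0, 1.0, size=(200, 3))
test0 = rng.normal(0.0, 1.0, size=(100, 3))
test1 = rng.normal(2.0, 1.0, size=(100, 3))
pred0 = log_density_ratio(train0, train1, test0) > 0
pred1 = log_density_ratio(train0, train1, test1) > 0
accuracy = (np.sum(~pred0) + np.sum(pred1)) / 200
```

Only one-dimensional densities are ever estimated, so adding components adds terms to the sum instead of raising the dimension of the density estimation problem.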