Multimapper: Data Density Sensitive Topological Visualization
Mapper is an algorithm that summarizes the topological information contained
in a dataset and provides an insightful visualization. It takes as input a
point cloud which is possibly high-dimensional, a filter function on it and an
open cover on the range of the function. It returns the nerve simplicial
complex of the pullback of the cover. Mapper can be considered a discrete
approximation of the topological construct called Reeb space, as analysed in
the 1-dimensional case by [Carriere et al., 2018]. Despite its success in
obtaining insights in various fields, for example in [Kamruzzaman et al., 2016],
Mapper is an ad hoc technique requiring extensive parameter tuning. There is also
no measure to quantify the goodness of the resulting visualization, which often
deviates from the Reeb space in practice. In this paper, we introduce a new
cover selection scheme for data that reduces the obscuration of topological
information at both the computation and visualisation steps. To achieve this,
we replace global scale selection of cover with a scale selection scheme
sensitive to local density of data points. We also propose a method to detect
some deviations in Mapper from Reeb space via computation of persistence
features on the Mapper graph.
Comment: Accepted at ICDM
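The basic Mapper pipeline described in this abstract (filter function, open cover of its range, clustering of each preimage, nerve of the result) can be illustrated with a minimal sketch. This is not the Multimapper scheme itself: it uses a single global interval length, and the helper names (`interval_cover`, `connected_components`, `mapper_graph`), the single-linkage clustering rule, and the `eps` threshold are assumptions made only for illustration.

```python
import math
from itertools import combinations

def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with equal-length intervals; `overlap` is the shared fraction."""
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]

def connected_components(indices, points, eps):
    """Single-linkage clusters: points closer than eps are linked (assumed clusterer)."""
    remaining, clusters = set(indices), []
    while remaining:
        stack = [remaining.pop()]
        component = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in remaining if math.dist(points[i], points[j]) < eps}
            remaining -= near
            component |= near
            stack.extend(near)
        clusters.append(frozenset(component))
    return clusters

def mapper_graph(points, filter_values, cover, eps):
    """Return the 1-skeleton of the nerve: clusters as nodes, shared points as edges."""
    nodes = []
    for a, b in cover:
        preimage = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(connected_components(preimage, points, eps))
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]}
    return nodes, edges

# Example: 60 points on the unit circle, filtered by their x-coordinate.
angles = [2 * math.pi * (k + 0.5) / 60 for k in range(60)]
points = [(math.cos(t), math.sin(t)) for t in angles]
filter_values = [p[0] for p in points]
cover = interval_cover(-1.0, 1.0, n_intervals=4, overlap=0.25)
nodes, edges = mapper_graph(points, filter_values, cover, eps=0.25)
print(len(nodes), len(edges))
```

On this example the graph has six nodes and six edges forming a single cycle, which matches the Reeb graph of the circle under the x-coordinate filter. Changing `n_intervals`, `overlap`, or `eps` can break that correspondence, which is the parameter sensitivity the abstract refers to.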
Data augmentation for galaxy density map reconstruction
Matter density is a key quantity in modern cosmology, as many
phenomena are linked to matter fluctuations. However, this density is not
directly available, but estimated through lensing maps or galaxy surveys. In
this article, we focus on galaxy surveys, which are incomplete and noisy
observations of the galaxy density: incomplete, as part of the sky is
unobserved or unreliable, and noisy, as they are count maps degraded by Poisson
noise. Using a data augmentation method, we propose a two-step method for
recovering the density map, one step for inferring missing data and one for
estimating the density. The results show that the missing areas are
efficiently inferred and the statistical properties of the maps are very well
preserved.
Recursive kernel density estimators under missing data
In this paper we propose an automatic bandwidth selection method for recursive
kernel density estimators with missing data in the context of global and local
density estimation. We showed that, using the selected bandwidth and a special
stepsize, the proposed recursive estimators outperformed the nonrecursive one
in terms of estimation error in the case of global estimation. Moreover, the
recursive estimators are much better in terms of computational costs. We
corroborated these theoretical results through simulation studies, on the
simulated data of the Aquitaine cohort of HIV-1 infected patients, and on the
Coriell cell lines using chromosome 11.
Comment: to appear in Communications in Statistics - Theory and Methods
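To make the idea of a recursive kernel density estimator concrete, here is a minimal sketch of the generic stochastic-approximation update f_n(x) = (1 - γ_n) f_{n-1}(x) + γ_n K((x - X_n)/h_n) / h_n. It is not the paper's bandwidth selector: the stepsize γ_n = 1/n, the bandwidth h_n = n^(-1/5), the Gaussian kernel, and the rule of simply skipping missing observations are all assumptions chosen for illustration, and `RecursiveKDE` is a hypothetical class name.

```python
import math
import random

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

class RecursiveKDE:
    """Density estimate on a fixed grid, updated one observation at a time."""

    def __init__(self, grid):
        self.grid = list(grid)
        self.density = [0.0] * len(self.grid)
        self.n = 0  # number of observed (non-missing) values seen so far

    def update(self, x):
        if x is None:
            return  # missing value: skipped (assumed complete-case handling)
        self.n += 1
        gamma = 1.0 / self.n   # assumed stepsize
        h = self.n ** (-0.2)   # assumed bandwidth sequence
        self.density = [
            (1.0 - gamma) * f + gamma * gaussian_kernel((g - x) / h) / h
            for f, g in zip(self.density, self.grid)
        ]

# Example: standard normal data with roughly 20% of the values missing.
rng = random.Random(0)
grid = [-6.0 + 0.1 * i for i in range(121)]
kde = RecursiveKDE(grid)
for _ in range(2000):
    x = rng.gauss(0.0, 1.0)
    kde.update(None if rng.random() < 0.2 else x)
mass = 0.1 * sum(kde.density)  # Riemann sum of the estimate over the grid
print(kde.n, round(mass, 3))
```

Each update costs one pass over the grid and never revisits earlier observations, which is the source of the computational advantage over a nonrecursive estimator that must be recomputed from the full sample.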
Electron density retrieval from truncated Radio Occultation GNSS data
This paper summarizes the definition and validation of two complementary new strategies for inverting incomplete Global Navigation Satellite System Radio-Occultation (RO) ionospheric measurements, such as those to be provided by the future EUMETSAT Polar System Second Generation. That system will provide RO measurements with impact parameters well below the Low Earth Orbiters' height (817 km): from approximately 500 km downward. The first method for inverting truncated RO data, denoted Abel-VaryChap Hybrid modeling from topside Incomplete Global Navigation Satellite System RO data, is based on simple first principles, is very precise, and is well suited for postprocessing. The second method, denoted Simple Estimation of Electron density profiles from topside Incomplete RO data, is less precise but yields very fast estimations, suitable for near-real-time determination. Both techniques are described and assessed with a set of 546 representative COSMIC/FORMOSAT-3 ROs, giving relative errors of 7% and 11% and computational times per occultation of 20 min and 15 s on our Intel i7 PC, for the first and second methods, respectively.
Peer Reviewed. Postprint (published version)
Optimal Bayes Classifiers for Functional Data and Density Ratios
Bayes classifiers for functional data pose a challenge. This is because
probability density functions do not exist for functional data. As a
consequence, the classical Bayes classifier using density quotients needs to be
modified. We propose to use density ratios of projections on a sequence of
eigenfunctions that are common to the groups to be classified. The density
ratios can then be factored into density ratios of individual functional
principal components, whence the classification problem is reduced to a sequence
of nonparametric one-dimensional density estimates. This is an extension to
functional data of some of the very earliest nonparametric Bayes classifiers
that were based on simple density ratios in the one-dimensional case. By means
of the factorization of the density quotients the curse of dimensionality that
would otherwise severely affect Bayes classifiers for functional data can be
avoided. We demonstrate that in the case of Gaussian functional data, the
proposed functional Bayes classifier reduces to a functional version of the
classical quadratic discriminant. A study of the asymptotic behavior of the
proposed classifiers in the large sample limit shows that under certain
conditions the misclassification rate converges to zero, a phenomenon that has
been referred to as "perfect classification". The proposed classifiers also
perform favorably in finite sample applications, as we demonstrate in
comparisons with other functional classifiers in simulations and various data
applications, including wine spectral data, functional magnetic resonance
imaging (fMRI) data for attention deficit hyperactivity disorder (ADHD)
patients, and yeast gene expression data.
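The factorization described in this abstract, where the overall density ratio becomes a product of one-dimensional ratios over projection scores, can be sketched as follows. The sketch assumes the projection scores onto common eigenfunctions have already been computed and skips that step entirely. The function names, the fixed Gaussian-kernel bandwidth `h`, and the use of group sizes as prior probabilities are illustrative assumptions, not the authors' implementation.

```python
import math
import random

def kde_1d(sample, h):
    """One-dimensional Gaussian kernel density estimate with fixed bandwidth h."""
    n = len(sample)
    norm = n * h * math.sqrt(2.0 * math.pi)
    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample) / norm
    return density

def fit_density_ratio_classifier(scores0, scores1, h=0.5):
    """scores0, scores1: lists of score vectors (one per curve) for groups 0 and 1."""
    n_components = len(scores0[0])
    dens0 = [kde_1d([s[j] for s in scores0], h) for j in range(n_components)]
    dens1 = [kde_1d([s[j] for s in scores1], h) for j in range(n_components)]
    log_prior_ratio = math.log(len(scores1) / len(scores0))
    tiny = 1e-300  # guards against log(0) far from the data

    def classify(score):
        # Sum of per-component log density ratios, plus the log prior ratio.
        log_q = log_prior_ratio + sum(
            math.log(dens1[j](score[j]) + tiny) - math.log(dens0[j](score[j]) + tiny)
            for j in range(n_components)
        )
        return 1 if log_q > 0 else 0

    return classify

# Example with synthetic scores: the groups differ only in the first component.
rng = random.Random(1)
group0 = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
group1 = [[rng.gauss(2, 1)] + [rng.gauss(0, 1) for _ in range(2)] for _ in range(200)]
classify = fit_density_ratio_classifier(group0, group1)
print(classify([2.5, 0.0, 0.0]), classify([-1.0, 0.0, 0.0]))
```

Because each factor is a one-dimensional estimate, adding more components adds more terms to the sum without ever requiring a density estimate in a high-dimensional space.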
