Declutter and Resample: Towards parameter free denoising

Authors: Buchet Mickaël; Dey Tamal K.; Wang Jiayuan; Wang Yusu
Publication date: 01/01/2017

In many data analysis applications the following scenario is commonplace: we are given a point set that is supposed to sample a hidden ground truth $K$ in a metric space, but it got corrupted with noise so that some of the data points lie far away from $K$, creating outliers also termed ambient noise. One of the main goals of denoising algorithms is to eliminate such noise so that the curated data lie within a bounded Hausdorff distance of $K$. Popular denoising approaches such as deconvolution and thresholding often require the user to set several parameters and/or to choose an appropriate noise model while guaranteeing only asymptotic convergence. Our goal is to lighten this burden as much as possible while ensuring theoretical guarantees in all cases. Specifically, first, we propose a simple denoising algorithm that requires only a single parameter but provides a theoretical guarantee on the quality of the output on general input points. We argue that this single parameter cannot be avoided. We next present a simple algorithm that avoids even this parameter by paying for it with a slight strengthening of the sampling condition on the input points which is not unrealistic. We also provide some preliminary empirical evidence that our algorithms are effective in practice.
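
The quality measure named in this abstract, the Hausdorff distance between the curated points and $K$, can be illustrated for finite point sets. The following is a minimal sketch only: it is not the paper's denoising algorithm, the function names are hypothetical, and the sample data (a sampled segment plus one outlier) are an assumption made for illustration.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite, non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical data: samples of the unit segment plus one far-away outlier.
ground_truth = [(x / 10, 0.0) for x in range(11)]
noisy = ground_truth + [(0.5, 3.0)]

print(hausdorff(noisy, ground_truth))         # dominated by the single outlier
print(hausdorff(ground_truth, ground_truth))  # identical sets
```

A single outlier is enough to make the distance large, which is why removing ambient noise matters for a Hausdorff-type guarantee.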

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Journal of Computational Geometry (JoCG - Carleton University, Computational Geometry Lab)

Approximating Loops in a Shortest Homology Basis from Point Data

Authors: Dey Tamal K.; Sun Jian; Wang Yusu
Publication date: 02/12/2009

Inference of topological and geometric attributes of a hidden manifold from its point data is a fundamental problem arising in many scientific studies and engineering applications. In this paper we present an algorithm to compute a set of loops from point data that presumably sample a smooth manifold $M\subset \mathbb{R}^d$. These loops approximate a shortest basis of the one-dimensional homology group $H_1(M)$ over coefficients in the finite field $\mathbb{Z}_2$. Previous results addressed the issue of computing the rank of the homology groups from point data, but there is no result on approximating the shortest basis of a manifold from its point sample. In arriving at our result, we also present a polynomial-time algorithm for computing a shortest basis of $H_1(K)$ for any finite simplicial complex $K$ whose edges have non-negative weights.
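
As background for the abstract above, the rank of $H_1(K)$ over $\mathbb{Z}_2$ can be computed from the boundary matrices of a finite simplicial complex. The sketch below computes only that rank, not a shortest basis, and it is not the authors' algorithm. The function names and the bitmask encoding of matrix rows are assumptions chosen for brevity.

```python
from itertools import combinations


def rank_z2(rows):
    """Rank over Z_2 of a matrix whose rows are given as integer bitmasks."""
    pivots = {}  # leading bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]  # cancel the leading bit and keep reducing
    return len(pivots)


def betti_1(vertices, edges, triangles):
    """dim H_1 over Z_2 = (|E| - rank d1) - rank d2."""
    v_index = {v: i for i, v in enumerate(vertices)}
    e_index = {frozenset(e): i for i, e in enumerate(edges)}
    # Each edge is encoded by its two endpoints, each triangle by its three edges.
    d1 = [(1 << v_index[a]) | (1 << v_index[b]) for a, b in edges]
    d2 = [sum(1 << e_index[frozenset(pair)] for pair in combinations(t, 2))
          for t in triangles]
    return len(edges) - rank_z2(d1) - rank_z2(d2)


hollow = betti_1([0, 1, 2], [(0, 1), (1, 2), (0, 2)], [])
filled = betti_1([0, 1, 2], [(0, 1), (1, 2), (0, 2)], [(0, 1, 2)])
print(hollow, filled)
```

The hollow triangle carries one independent loop, and filling it with a 2-simplex removes that loop.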

arXiv.org e-Print Archive

CiteSeerX

Crossref

Towards Persistence-Based Reconstruction in Euclidean Spaces

Authors: Chazal Frédéric; Oudot Steve
Publication date: 01/01/2007

Manifold reconstruction has been extensively studied for the last decade or so, especially in two and three dimensions. Recently, significant improvements were made in higher dimensions, leading to new methods to reconstruct large classes of compact subsets of Euclidean space $\mathbb{R}^d$. However, the complexities of these methods scale up exponentially with $d$, which makes them impractical in medium or high dimensions, even for handling low-dimensional submanifolds. In this paper, we introduce a novel approach that stands in between classical reconstruction and topological estimation, and whose complexity scales up with the intrinsic dimension of the data. Specifically, when the data points are sufficiently densely sampled from a smooth $m$-submanifold of $\mathbb{R}^d$, our method retrieves the homology of the submanifold in time at most $c(m)n^5$, where $n$ is the size of the input and $c(m)$ is a constant depending solely on $m$. It can also provably handle a wide range of compact subsets of $\mathbb{R}^d$, though with worse complexities. Along the way to proving the correctness of our algorithm, we obtain new results on Čech, Rips, and witness complex filtrations in Euclidean spaces.
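
For readers unfamiliar with the Rips complexes mentioned above, the sketch below builds the edges and triangles of a Vietoris-Rips complex at a single scale. It is an illustration only, not the reconstruction method of the paper. The function name is hypothetical, and the convention that two points are joined when their distance is at most `r` is an assumption (some texts use `2r`).

```python
import math
from itertools import combinations


def rips_complex(points, r):
    """Edges and triangles of the Vietoris-Rips complex at scale r.

    Assumed convention: an edge joins two points at distance <= r, and a
    triangle is included when all three of its edges are present.
    """
    n = len(points)
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if math.dist(points[i], points[j]) <= r]
    edge_set = set(edges)
    triangles = [t for t in combinations(range(n), 3)
                 if all(pair in edge_set for pair in combinations(t, 2))]
    return edges, triangles


# Hypothetical input: the four corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
small_edges, small_triangles = rips_complex(square, 1.0)
large_edges, large_triangles = rips_complex(square, 1.5)
print(len(small_edges), len(small_triangles))
print(len(large_edges), len(large_triangles))
```

At the smaller scale only the sides of the square appear, so the complex is a loop. At the larger scale the diagonals enter and every triple of corners spans a triangle.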

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Manifold Fitting

Authors: Li Bingjie; Su Jiaji; Yao Zhigang; Yau Shing-Tung
Publication date: 12/08/2023

While classical data analysis has addressed observations that are real numbers or elements of a real vector space, at present many statistical problems of high interest in the sciences address the analysis of data that consist of more complex objects, taking values in spaces that are naturally not (Euclidean) vector spaces but which still feature some geometric structure. Manifold fitting is a long-standing problem, and has finally been addressed in recent years by Fefferman et al. (2020, 2021a). We develop a method with a theoretical guarantee that fits a $d$-dimensional underlying manifold from noisy observations sampled in the ambient space $\mathbb{R}^D$. The new approach uses geometric structures to obtain the manifold estimator in the form of image sets via a two-step mapping approach. We prove that, under certain mild assumptions and with a sample size $N=\mathcal{O}(\sigma^{(-d+3)})$, these estimators are true $d$-dimensional smooth manifolds whose estimation error, as measured by the Hausdorff distance, is bounded by $\mathcal{O}(\sigma^2\log(1/\sigma))$ with high probability. Compared with the existing approaches proposed in Fefferman et al. (2018, 2021b); Genovese et al. (2014); Yao and Xia (2019), our method exhibits superior efficiency while attaining very low error rates with a significantly reduced sample size, which scales polynomially in $\sigma^{-1}$ and exponentially in $d$. Extensive simulations are performed to validate our theoretical results. Our findings are relevant to various fields involving high-dimensional data in machine learning. Furthermore, our method opens up new avenues for existing non-Euclidean statistical methods in the sense that it has the potential to unify them to analyze data on manifolds in the ambient space domain.

Comment: 60 pages

arXiv.org e-Print Archive

What's the Situation with Intelligent Mesh Generation: A Survey and Perspectives

Authors: Gu Xianfeng; Lei Na; Li Ying; Li Zezeng; Xu Zebin
Publication date: 23/05/2023

Intelligent Mesh Generation (IMG) represents a novel and promising field of research, utilizing machine learning techniques to generate meshes. Despite its relative infancy, IMG has significantly broadened the adaptability and practicality of mesh generation techniques, delivering numerous breakthroughs and unveiling potential future pathways. However, a noticeable void exists in the contemporary literature concerning comprehensive surveys of IMG methods. This paper endeavors to fill this gap by providing a systematic and thorough survey of the current IMG landscape. With a focus on 113 preliminary IMG methods, we undertake a meticulous analysis from various angles, encompassing core algorithm techniques and their application scope, agent learning objectives, data types, targeted challenges, as well as advantages and limitations. We have curated and categorized the literature, proposing three unique taxonomies based on key techniques, output mesh unit elements, and relevant input data types. This paper also underscores several promising future research directions and challenges in IMG. To augment reader accessibility, a dedicated IMG project page is available at https://github.com/xzb030/IMG_Survey.

arXiv.org e-Print Archive