Modern machine learning systems are increasingly trained on large amounts of
data embedded in high-dimensional spaces. Often this is done without analyzing
the structure of the dataset. In this work, we propose a framework to study the
geometric structure of the data. We make use of our recently introduced
non-negative kernel (NNK) regression graphs to estimate the point density,
intrinsic dimension, and the linearity of the data manifold (curvature). We
further generalize the graph construction and geometric estimation to multiple
scales by iteratively merging neighborhoods in the input data. Our experiments
demonstrate that our approach is more effective than the baselines at
estimating the local geometry of data manifolds on synthetic and real
datasets.