
    An Experimental Study of Old and New Depth Measures

    Data depth is a statistical analysis method that assigns a numeric value to a point based on its centrality relative to a data set. Examples include the half-space depth (also known as Tukey depth), convex-hull peeling depth, and L1 depth. Data depth has significant potential as a data analysis tool; the lack of efficient computational tools for depth-based analysis of large high-dimensional data sets, however, has prevented it from coming into widespread use. We provide an experimental evaluation of several existing depth measures on different types of data sets, identify problems with the existing measures, and suggest modifications. Specifically, we show that L1 depth contours are not indicative of the shape of the data and suggest a PCA-based scaling that handles this problem; we demonstrate that most existing depth measures are unable to cope with multimodal data sets and show how the newly suggested proximity graph depth addresses this issue; and we explore how depth measures perform when the underlying distribution is not elliptic. Our experimental tool is of independent interest: it is an interactive software tool for generating data sets and visualizing the performance of multiple depth measures. The tool uses a hierarchical render pipeline to allow for diverse data sets and fine control of the visual result. With this tool, new ideas in the field of data depth can be evaluated visually and quickly, allowing researchers to assess and adjust current depth functions.
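
    To make the L1 depth and the proposed PCA-based scaling concrete, here is a minimal sketch (not the authors' tool; the function names and the whitening-style rescaling are our own assumptions) using one common definition of L1 (spatial) depth:

```python
import numpy as np

def l1_depth(q, X):
    """L1 (spatial) depth of q w.r.t. the rows of X: 1 - ||mean unit vector||.

    Central points score near 1; points far outside the cloud score near 0.
    """
    diffs = X - q                          # vectors from q to each data point
    norms = np.linalg.norm(diffs, axis=1)
    keep = norms > 1e-12                   # skip data points coinciding with q
    units = diffs[keep] / norms[keep, None]
    return 1.0 - np.linalg.norm(units.mean(axis=0))

def pca_whiten(X):
    """Return a map into PCA-whitened coordinates (unit variance per
    principal axis) -- one plausible reading of the PCA-based scaling
    the abstract suggests for shape-aware L1 depth contours."""
    mu = X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(X - mu, rowvar=False))
    return lambda P: (P - mu) @ vecs / np.sqrt(vals)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) * np.array([5.0, 1.0])  # elongated Gaussian cloud
q = np.array([4.0, 0.0])                              # lies along the long axis
w = pca_whiten(X)
print(l1_depth(q, X))          # depth in raw coordinates
print(l1_depth(w(q), w(X)))    # depth after PCA-based rescaling
```

    Because the same affine map is applied to the data and the query point, comparing the two printed values shows how much the rescaling changes the depth ordering for elongated clouds.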

    Dynamic ham-sandwich cuts in the plane

    We design efficient data structures for dynamically maintaining a ham-sandwich cut of two point sets in the plane subject to insertions and deletions of points in either set. A ham-sandwich cut is a line that simultaneously bisects the cardinality of both point sets. For general point sets, our first data structure supports each operation in O(n^(1/3+ε)) amortized time and O(n^(4/3+ε)) space. Our second data structure performs faster when each point set decomposes into a small number k of subsets in convex position: it supports insertions and deletions in O(log n) time and ham-sandwich queries in O(k log^4 n) time. In addition, if each point set has convex peeling depth k, then we can maintain the decomposition automatically using O(k log n) time per insertion and deletion. Alternatively, we can view each convex point set as a convex polygon, and we show how to find a ham-sandwich cut that bisects the total areas or total perimeters of these polygons in O(k log^4 n) time plus the O((kb) polylog(kb)) time required to approximate the root of a polynomial of degree O(k) up to b bits of precision. We also show how to maintain a partition of the plane by two lines into four regions, each containing a quarter of the total point count, area, or perimeter, in polylogarithmic time.
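
    For readers new to the problem, the sketch below illustrates the static definition the abstract builds on; it is a brute-force baseline, not the paper's dynamic data structure, and all names are our own. It tries every line through two input points, which suffices because a standard perturbation argument shows some such line always bisects both sets; the search takes O(n^3) time.

```python
from itertools import combinations

def side(p, q, r):
    """Sign of the cross product: which side of line pq the point r lies on."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def bisects(p, q, pts):
    """Line pq bisects pts if each open half-plane holds at most half of them
    (points on the line itself count for neither side)."""
    left = sum(1 for r in pts if side(p, q, r) > 0)
    right = sum(1 for r in pts if side(p, q, r) < 0)
    return left <= len(pts) // 2 and right <= len(pts) // 2

def ham_sandwich_cut(red, blue):
    """Return two points spanning a line that bisects both sets."""
    for p, q in combinations(red + blue, 2):
        if bisects(p, q, red) and bisects(p, q, blue):
            return p, q
    return None  # unreachable for finite point sets, by the argument above

red = [(0, 0), (2, 1), (4, 0), (1, 3)]
blue = [(0, 2), (3, 3), (5, 1), (2, -1)]
print(ham_sandwich_cut(red, blue))
```

    The point of the paper is to replace this cubic-time recomputation with data structures that maintain such a cut under insertions and deletions at the much lower costs quoted in the abstract.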