
    Deterministic Sampling and Range Counting in Geometric Data Streams

    We present memory-efficient deterministic algorithms for constructing epsilon-nets and epsilon-approximations of streams of geometric data. Unlike probabilistic approaches, these deterministic samples provide guaranteed bounds on their approximation factors. We show how our deterministic samples can be used to answer approximate online iceberg geometric queries on data streams. We use these techniques to approximate several robust statistics of geometric data streams, including Tukey depth, simplicial depth, regression depth, the Theil-Sen estimator, and the least median of squares. Our algorithms use only a polylogarithmic amount of memory, provided the desired approximation factors are inverse-polylogarithmic. We also include a lower bound for non-iceberg geometric queries.
    Comment: 12 pages, 1 figure
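The paper's streaming construction (polylogarithmic space) is beyond the scope of an abstract, but the guarantee itself can be illustrated offline in one dimension, where taking every k-th element of the sorted order is a simple deterministic epsilon-approximation for interval ranges: each interval endpoint contributes at most k counting error. The function names below are illustrative, not from the paper:

```python
def eps_approximation_1d(points, eps):
    """Deterministic epsilon-approximation of a 1-D point set w.r.t.
    interval ranges: every k-th element of the sorted order, with
    k ~ eps * n / 2 so the two interval endpoints together contribute
    at most eps * n counting error."""
    n = len(points)
    k = max(1, int(eps * n / 2))
    srt = sorted(points)
    return srt[k - 1::k], k

def approx_range_count(sample, k, lo, hi):
    """Scaled sample count estimating |{x in P : lo <= x <= hi}|."""
    return k * sum(lo <= x <= hi for x in sample)

pts = list(range(1000))                        # toy static point set
sample, k = eps_approximation_1d(pts, 0.05)    # error target: 50 points
est = approx_range_count(sample, k, 100, 399)  # true count is 300
```

The deterministic sample here has 40 points instead of 1000, yet any interval count is recovered to within eps * n; the paper's contribution is maintaining such samples over a stream rather than a stored set.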

    On Strong Centerpoints

    Let $P$ be a set of $n$ points in $\mathbb{R}^d$ and $\mathcal{F}$ be a family of geometric objects. We call a point $x \in P$ a strong centerpoint of $P$ w.r.t. $\mathcal{F}$ if $x$ is contained in every $F \in \mathcal{F}$ that contains more than $cn$ points from $P$, where $c$ is a fixed constant. In general, a strong centerpoint does not exist, even when $\mathcal{F}$ is the family of halfspaces in the plane. We prove the existence of strong centerpoints, with exact constants, for convex polytopes defined by a fixed set of orientations. We also prove the existence of strong centerpoints for abstract set systems with bounded intersection.

    On interference among moving sensors and related problems

    We show that for any set of $n$ points moving along "simple" trajectories (i.e., each coordinate is described by a polynomial of bounded degree) in $\Re^d$ and any parameter $2 \le k \le n$, one can select a fixed non-empty subset of the points of size $O(k \log k)$, such that the Voronoi diagram of this subset is "balanced" at any given time (i.e., it contains $O(n/k)$ points per cell). We also show that the bound $O(k \log k)$ is near optimal, even for the one-dimensional case in which points move linearly in time. As applications, we show that one can assign communication radii to the sensors of a network of $n$ moving sensors so that at any given time their interference is $O(\sqrt{n \log n})$. We also show some results in kinetic approximate range counting and kinetic discrepancy. In order to obtain these results, we extend well-known results from $\varepsilon$-net theory to kinetic environments.
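Setting the kinetic machinery aside, the "balanced" property can be sketched statically in one dimension: choosing $k$ points at evenly spaced ranks of the sorted order yields nearest-neighbor (Voronoi) cells each holding roughly $n/k$ of the original points. A minimal sketch with illustrative names, not the paper's construction:

```python
def balanced_subset_1d(points, k):
    """Pick k points at evenly spaced ranks of the sorted order."""
    srt = sorted(points)
    n = len(srt)
    return [srt[min(n - 1, int((i + 0.5) * n / k))] for i in range(k)]

def voronoi_cell_sizes(points, centers):
    """Assign each point to its nearest center (1-D Voronoi cells)
    and return the resulting cell populations."""
    sizes = [0] * len(centers)
    for x in points:
        j = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
        sizes[j] += 1
    return sizes

pts = list(range(499, -1, -1))         # toy static point set
centers = balanced_subset_1d(pts, 10)  # k = 10, so ~n/k = 50 per cell
sizes = voronoi_cell_sizes(pts, centers)
```

The hard part of the paper is that the points move: a fixed subset must keep every cell at $O(n/k)$ for all times, which is where the $O(k \log k)$ size and the kinetic $\varepsilon$-net arguments come in.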

    The Cross-Validated Adaptive Epsilon-Net Estimator

    Suppose that we observe a sample of independent and identically distributed realizations of a random variable. Assume that the parameter of interest can be defined as the minimizer, over a suitably defined parameter space, of the expectation (with respect to the distribution of the random variable) of a particular (loss) function of a candidate parameter value and the random variable. Examples of commonly used loss functions are the squared error loss function in regression and the negative log-density loss function in density estimation. Minimizing the empirical risk (i.e., the empirical mean of the loss function) over the entire parameter space typically results in ill-defined or too-variable estimators of the parameter of interest (i.e., the risk minimizer for the true data-generating distribution). In this article, we propose a cross-validated epsilon-net estimation methodology that covers a broad class of estimation problems, including multivariate outcome prediction and multivariate density estimation. An epsilon-net sieve of a subspace of the parameter space is defined as a collection of finite sets of points, the epsilon-nets indexed by epsilon, which approximate the subspace up to a resolution of epsilon. Given a collection of subspaces of the parameter space, one constructs an epsilon-net sieve for each of the subspaces. For each choice of subspace and each value of the resolution epsilon, one defines a candidate estimator as the minimizer of the empirical risk over the corresponding epsilon-net. The cross-validated epsilon-net estimator is then defined as the candidate estimator corresponding to the choice of subspace and epsilon-value minimizing the cross-validated empirical risk. We derive a finite sample inequality which proves that the proposed estimator achieves the adaptive optimal minimax rate of convergence, where the adaptivity is achieved by considering epsilon-net sieves for various subspaces.
    We also address the implementation of the cross-validated epsilon-net estimation procedure. In the context of a linear regression model, we present results of a preliminary simulation study comparing the cross-validated epsilon-net estimator to the cross-validated L^1-penalized least squares estimator (LASSO) and the least angle regression estimator (LARS). Finally, we discuss generalizations of the proposed estimation methodology to censored data structures.
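As a toy illustration of the procedure described above, consider estimating a scalar mean under squared-error loss over the parameter space [0, 1]: each epsilon indexes a finite grid (the epsilon-net), the candidate estimator for each grid is its empirical risk minimizer, and cross-validation selects among resolutions before refitting on all the data. This is a hypothetical sketch with a single subspace, not the authors' implementation:

```python
import random

def eps_net(lo, hi, eps):
    """Finite grid approximating [lo, hi] up to resolution eps."""
    m = round((hi - lo) / eps)
    return [lo + i * eps for i in range(m + 1)]

def emp_risk(theta, data):
    """Empirical squared-error risk of a candidate parameter value."""
    return sum((x - theta) ** 2 for x in data) / len(data)

def cv_eps_net_estimator(data, lo, hi, eps_values, n_folds=5):
    """Select eps by cross-validated risk, then return the empirical
    risk minimizer over the chosen eps-net, refit on all the data."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    best_eps, best_cv = None, float("inf")
    for eps in eps_values:
        net = eps_net(lo, hi, eps)
        cv = 0.0
        for v in range(n_folds):
            train = [x for i, f in enumerate(folds) if i != v for x in f]
            theta_v = min(net, key=lambda t: emp_risk(t, train))
            cv += emp_risk(theta_v, folds[v]) / n_folds
        if cv < best_cv:
            best_cv, best_eps = cv, eps
    net = eps_net(lo, hi, best_eps)
    return min(net, key=lambda t: emp_risk(t, data)), best_eps

random.seed(0)
sample = [random.gauss(0.4, 0.1) for _ in range(200)]
theta_hat, eps_hat = cv_eps_net_estimator(sample, 0.0, 1.0, [0.2, 0.05, 0.01])
```

In the article's general setting the same selector additionally ranges over a collection of subspaces, each with its own epsilon-net sieve, which is what yields the adaptive minimax rate.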