    Querying Probabilistic Neighborhoods in Spatial Data Sets Efficiently

    In this paper we define the notion of a probabilistic neighborhood in spatial data: Let a set $P$ of $n$ points in $\mathbb{R}^d$, a query point $q \in \mathbb{R}^d$, a distance metric $\operatorname{dist}$, and a monotonically decreasing function $f : \mathbb{R}^+ \rightarrow [0,1]$ be given. Then a point $p \in P$ belongs to the probabilistic neighborhood $N(q, f)$ of $q$ with respect to $f$ with probability $f(\operatorname{dist}(p,q))$. We envision applications in facility location, sensor networks, and other scenarios where a connection between two entities becomes less likely with increasing distance. A straightforward query algorithm would determine a probabilistic neighborhood in $\Theta(n \cdot d)$ time by probing each point in $P$. To answer the query in sublinear time for the planar case, we augment a quadtree suitably and design a corresponding query algorithm. Our theoretical analysis shows that, for certain distributions of planar $P$, our algorithm answers a query in $O((|N(q,f)| + \sqrt{n})\log n)$ time with high probability (whp). This matches up to a logarithmic factor the cost induced by quadtree-based algorithms for deterministic queries and is asymptotically faster than the straightforward approach whenever $|N(q,f)| \in o(n / \log n)$. As practical proofs of concept we use two applications, one in the Euclidean and one in the hyperbolic plane. In particular, our results yield the first generator for random hyperbolic graphs with arbitrary temperatures in subquadratic time. Moreover, our experimental data show the usefulness of our algorithm even if the point distribution is unknown or not uniform: the running time savings over the pairwise probing approach constitute at least one order of magnitude already for a modest number of points and queries.

    Comment: The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-44543-4_3
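
    As a point of reference, the straightforward probing algorithm described in the abstract is easy to state in code. Below is a minimal Python sketch of that $\Theta(n \cdot d)$ baseline; function and variable names are ours, not the paper's, and the paper's actual contribution is the sublinear quadtree-based algorithm, which is considerably more involved.

        import math
        import random

        def probabilistic_neighborhood(P, q, f):
            # Naive Theta(n * d) query: probe every point p in P and accept
            # it into N(q, f) independently with probability f(dist(p, q)).
            neighborhood = []
            for p in P:
                d = math.dist(p, q)          # Euclidean metric here; any metric works
                if random.random() < f(d):   # keep p with probability f(dist(p, q))
                    neighborhood.append(p)
            return neighborhood

        # Example: acceptance probability decays exponentially with distance.
        P = [(random.random(), random.random()) for _ in range(1000)]
        N = probabilistic_neighborhood(P, (0.5, 0.5), lambda d: math.exp(-5.0 * d))
        print(len(N))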

    Discrete denoising of heterogeneous two-dimensional data

    We consider discrete denoising of two-dimensional data with characteristics that may vary abruptly between regions. Using a quadtree decomposition technique and space-filling curves, we extend the recently developed S-DUDE (Shifting Discrete Universal DEnoiser), which was tailored to one-dimensional data, to the two-dimensional case. Our scheme competes with a genie that has access, in addition to the noisy data, also to the underlying noiseless data, and can employ $m$ different two-dimensional sliding window denoisers along $m$ distinct regions obtained by a quadtree decomposition with $m$ leaves, in a way that minimizes the overall loss. We show that, regardless of what the underlying noiseless data may be, the two-dimensional S-DUDE performs essentially as well as this genie, provided that the number of distinct regions satisfies $m = o(n)$, where $n$ is the total size of the data. The resulting algorithm complexity is still linear in both $n$ and $m$, as in the one-dimensional case. Our experimental results show that the two-dimensional S-DUDE can be effective when the characteristics of the underlying clean image vary across different regions in the data.

    Comment: 16 pages, submitted to IEEE Transactions on Information Theory
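
    To illustrate the role of the space-filling curve, the following Python sketch linearizes a 2-D array along a Z-order (Morton) curve so that a one-dimensional denoiser can process it while roughly preserving spatial locality. The abstract does not specify which curve is used, so morton_code and linearize are hypothetical names for one common choice.

        def morton_code(x, y, bits=16):
            # Interleave the bits of x and y into a Z-order (Morton) index.
            code = 0
            for i in range(bits):
                code |= ((x >> i) & 1) << (2 * i)
                code |= ((y >> i) & 1) << (2 * i + 1)
            return code

        def linearize(image):
            # Flatten a 2-D array into a 1-D sequence along the Z-order curve,
            # so spatially close pixels tend to stay close in the sequence.
            h, w = len(image), len(image[0])
            coords = sorted(((x, y) for y in range(h) for x in range(w)),
                            key=lambda c: morton_code(c[0], c[1]))
            return [image[y][x] for (x, y) in coords]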

    Visual-hint Boundary to Segment Algorithm for Image Segmentation

    Image segmentation has been a very active research topic in the area of image analysis. Currently, most image segmentation algorithms are designed on the premise that an image is partitioned into a set of regions that are homogeneous within themselves and inhomogeneous between one another. However, human visual intuition does not always follow this pattern. A new image segmentation method named Visual-Hint Boundary to Segment (VHBS) is introduced, which is more consistent with human perception. VHBS abides by two visual-hint rules based on human perception: (i) global-scale boundaries tend to be the real boundaries of objects; (ii) two adjacent regions with quite different colors or textures tend to produce a real boundary between them. Experiments demonstrate that, compared with traditional image segmentation methods, VHBS performs better while preserving high computational efficiency.

    Comment: 45 pages

    Implementation and application of adaptive mesh refinement for thermochemical mantle convection studies

    Numerical modeling of mantle convection is challenging. Owing to the multiscale nature of mantle dynamics, high resolution is often required in localized regions, with coarser resolution being sufficient elsewhere. When investigating thermochemical mantle convection, high resolution is required to resolve sharp and often discontinuous boundaries between distinct chemical components. In this paper, we present a 2-D finite element code with adaptive mesh refinement techniques for simulating compressible thermochemical mantle convection. By comparing model predictions with a range of analytical and previously published benchmark solutions, we demonstrate the accuracy of our code. By refining and coarsening the mesh according to certain criteria and dynamically adjusting the number of particles in each element, our code can simulate such problems efficiently, dramatically reducing the computational requirements (in terms of memory and CPU time) when compared to a fixed, uniform mesh simulation. The resolving capabilities of the technique are further highlighted by examining plume-induced entrainment in a thermochemical mantle convection simulation.
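
    The refine-where-needed pattern underlying adaptive mesh refinement can be sketched with a toy quadtree in Python. The paper's 2-D finite element code and its actual refinement criteria are far more elaborate; the indicator argument below is a hypothetical stand-in for any per-cell error or gradient estimate.

        class Cell:
            # A quadtree cell for a toy adaptive-mesh-refinement loop.
            def __init__(self, x0, y0, size, depth=0):
                self.x0, self.y0, self.size, self.depth = x0, y0, size, depth
                self.children = []

            def refine(self, indicator, threshold, max_depth):
                # Split into four children wherever the supplied error
                # indicator exceeds the threshold, up to max_depth levels.
                if self.depth >= max_depth or indicator(self) <= threshold:
                    return
                half = self.size / 2.0
                self.children = [Cell(self.x0 + dx, self.y0 + dy, half, self.depth + 1)
                                 for dx in (0.0, half) for dy in (0.0, half)]
                for child in self.children:
                    child.refine(indicator, threshold, max_depth)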

    A limit process for partial match queries in random quadtrees and 2-d trees

    We consider the problem of recovering items matching a partially specified pattern in multidimensional trees (quadtrees and $k$-d trees). We assume the traditional model where the data consist of independent and uniform points in the unit square. For this model, in a structure on $n$ points, it is known that the number of nodes $C_n(\xi)$ to visit in order to report the items matching a random query $\xi$, independent and uniformly distributed on $[0,1]$, satisfies $\mathbf{E}[C_n(\xi)] \sim \kappa n^{\beta}$, where $\kappa$ and $\beta$ are explicit constants. We develop an approach based on the analysis of the cost $C_n(s)$ of any fixed query $s \in [0,1]$, and give precise estimates for the variance and limit distribution of the cost $C_n(x)$. Our results permit us to describe a limit process for the costs $C_n(x)$ as $x$ varies in $[0,1]$; one of the consequences is that $\mathbf{E}[\max_{x \in [0,1]} C_n(x)] \sim \gamma n^{\beta}$; this settles a question of Devroye [Pers. Comm., 2000].

    Comment: Published at http://dx.doi.org/10.1214/12-AAP912 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org). arXiv admin note: text overlap with arXiv:1107.223
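
    For intuition, the cost $C_n(s)$ can be made concrete with a small Python sketch of a partial match query $(s, *)$ in a 2-d tree: levels that discriminate on the specified coordinate follow one subtree, while levels that discriminate on the unspecified coordinate must follow both. The class and function names are ours, not the paper's.

        class Node:
            def __init__(self, point, left=None, right=None):
                self.point, self.left, self.right = point, left, right

        def partial_match_cost(node, s, depth=0):
            # Count the nodes visited by the query (s, *); this count
            # corresponds to the cost C_n(s) analysed in the abstract.
            if node is None:
                return 0
            if depth % 2 == 0:
                # This level discriminates on the specified coordinate.
                child = node.left if s < node.point[0] else node.right
                return 1 + partial_match_cost(child, s, depth + 1)
            # This level discriminates on the unspecified coordinate:
            # both subtrees may contain matches, so visit both.
            return (1 + partial_match_cost(node.left, s, depth + 1)
                      + partial_match_cost(node.right, s, depth + 1))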

    Down the Rabbit Hole: Robust Proximity Search and Density Estimation in Sublinear Space

    For a set of $n$ points in $\Re^d$, and parameters $k$ and $\varepsilon$, we present a data structure that answers $(1+\varepsilon, k)$-ANN queries in logarithmic time. Surprisingly, the space used by the data structure is $\widetilde{O}(n/k)$; that is, the space used is sublinear in the input size if $k$ is sufficiently large. Our approach provides a novel way to summarize geometric data, such that meaningful proximity queries on the data can be carried out using this sketch. Using this, we provide a sublinear space data structure that can estimate the density of a point set under various measures, including: (i) the sum of distances of the $k$ closest points to the query point, and (ii) the sum of squared distances of the $k$ closest points to the query point. Our approach generalizes to other distance-based density estimates of similar flavor. We also study the problem of approximating some of these quantities when using sampling. In particular, we show that a sample of size $\widetilde{O}(n/k)$ is sufficient, in some restricted cases, to estimate the above quantities. Remarkably, the sample size has only linear dependency on the dimension.
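
    For reference, the two density measures in (i) and (ii) are simple to compute exactly by brute force; the point of the paper is that sublinear space, or a small sample, suffices to approximate them. Below is a minimal Python sketch of the exact, linear-space baseline (the function name is ours).

        import math

        def knn_density(P, q, k):
            # Brute-force reference for the two measures in the abstract:
            # the sum of distances and the sum of squared distances from q
            # to its k closest points in P. O(n log n) time, linear space;
            # the paper approximates these quantities in sublinear space.
            dists = sorted(math.dist(p, q) for p in P)[:k]
            return sum(dists), sum(d * d for d in dists)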