
    Parallel Algorithms for Geometric Graph Problems

    We give algorithms for geometric graph problems in the modern parallel models inspired by MapReduce. For example, for the Minimum Spanning Tree (MST) problem over a set of points in two-dimensional space, our algorithm computes a $(1+\epsilon)$-approximate MST. Our algorithms work in a constant number of rounds of communication, while using total space and communication proportional to the size of the data (linear space and near-linear time algorithms). In contrast, for general graphs, achieving the same result for MST (or even connectivity) remains a challenging open problem, despite drawing significant attention in recent years. We develop a general algorithmic framework that, besides MST, also applies to Earth-Mover Distance (EMD) and the transportation cost problem. Our algorithmic framework has implications beyond the MapReduce model. For example, it yields a new algorithm for computing EMD cost in the plane in near-linear time, $n^{1+o_\epsilon(1)}$. We note that while recently Sharathkumar and Agarwal developed a near-linear time algorithm for $(1+\epsilon)$-approximating EMD, our algorithm is fundamentally different, and, for example, also solves the transportation (cost) problem, raised as an open question in their work. Furthermore, our algorithm immediately gives a $(1+\epsilon)$-approximation algorithm with $n^{\delta}$ space in the streaming-with-sorting model with $1/\delta^{O(1)}$ passes. As such, it is tempting to conjecture that the parallel models may also constitute a concrete playground in the quest for efficient algorithms for EMD (and other similar problems) in the vanilla streaming model, a well-known open problem.
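
    A hedged illustration (not the paper's algorithm or framework): the following minimal, self-contained Python sketch shows the generic "partition points into grid cells, solve locally on each cell, then stitch cell representatives together" pattern that MapReduce-style geometric algorithms build on. The function names (prim_mst, gridded_approx_mst) and the cell_size parameter are illustrative assumptions, and the gridded result is only a rough approximation for well-spread inputs, not a $(1+\epsilon)$ guarantee.

```python
# Minimal sketch (NOT the paper's algorithm) of the "partition, solve
# locally, then stitch" pattern behind MapReduce-style geometric algorithms:
# points are bucketed into grid cells (one "machine" per cell), a local MST
# is built per cell, and cells are joined via one representative point each.
import math, random
from collections import defaultdict

def prim_mst(points):
    """Exact MST (Prim) on a small list of 2D points; returns (edges, weight)."""
    if len(points) <= 1:
        return [], 0.0
    in_tree = {0}
    edges, weight = [], 0.0
    best = {i: (math.dist(points[0], points[i]), 0) for i in range(1, len(points))}
    while best:
        j = min(best, key=lambda k: best[k][0])
        d, parent = best.pop(j)
        in_tree.add(j)
        edges.append((parent, j))
        weight += d
        for k in best:
            d2 = math.dist(points[j], points[k])
            if d2 < best[k][0]:
                best[k] = (d2, j)
    return edges, weight

def gridded_approx_mst(points, cell_size):
    """One 'round': solve MSTs inside grid cells, then stitch cell representatives."""
    cells = defaultdict(list)
    for p in points:
        cells[(int(p[0] // cell_size), int(p[1] // cell_size))].append(p)
    total, reps = 0.0, []
    for cell_points in cells.values():      # local work, independent per cell
        _, w = prim_mst(cell_points)
        total += w
        reps.append(cell_points[0])          # one representative per cell
    _, w_reps = prim_mst(reps)               # stitch the cells together
    return total + w_reps

if __name__ == "__main__":
    random.seed(0)
    pts = [(random.random(), random.random()) for _ in range(1000)]
    _, exact = prim_mst(pts)
    approx = gridded_approx_mst(pts, cell_size=0.1)
    print(f"exact MST weight ~ {exact:.3f}, gridded approximation ~ {approx:.3f}")
```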

    Error Metrics for Learning Reliable Manifolds from Streaming Data

    Spectral dimensionality reduction is frequently used to identify low-dimensional structure in high-dimensional data. However, learning manifolds, especially from streaming data, is computationally and memory expensive. In this paper, we argue that a stable manifold can be learned using only a fraction of the stream, and the remaining stream can be mapped to the manifold in a significantly less costly manner. Identifying the transition point at which the manifold becomes stable is the key step. We present error metrics that allow us to identify the transition point for a given stream by quantitatively assessing the quality of a manifold learned using Isomap. We further propose an efficient mapping algorithm, called S-Isomap, that can be used to map new samples onto the stable manifold. We describe experiments on a variety of data sets showing that the proposed approach is computationally efficient without sacrificing accuracy.
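
    A simplified sketch of the "learn a batch manifold once, then map the rest of the stream cheaply" idea, assuming scikit-learn's Isomap and a kNN-weighted out-of-sample mapping as a stand-in for the paper's S-Isomap and error metrics; the function names and the 1000-sample "stable prefix" are illustrative assumptions, not the paper's procedure.

```python
# Hedged sketch: fit Isomap on a prefix of the stream, then embed later
# samples as a distance-weighted average of their nearest training
# neighbours' coordinates (a cheap stand-in, not the paper's S-Isomap).
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap
from sklearn.neighbors import NearestNeighbors

def fit_batch_manifold(X_batch, n_neighbors=10, n_components=2):
    iso = Isomap(n_neighbors=n_neighbors, n_components=n_components)
    Y_batch = iso.fit_transform(X_batch)
    nn = NearestNeighbors(n_neighbors=n_neighbors).fit(X_batch)
    return Y_batch, nn

def map_stream_sample(x, Y_batch, nn, eps=1e-12):
    """Embed one new sample via its neighbours' low-dimensional coordinates."""
    dist, idx = nn.kneighbors(x.reshape(1, -1))
    w = 1.0 / (dist[0] + eps)
    w /= w.sum()
    return (w[:, None] * Y_batch[idx[0]]).sum(axis=0)

if __name__ == "__main__":
    X, _ = make_swiss_roll(n_samples=3000, random_state=0)
    X_batch, X_stream = X[:1000], X[1000:]    # assumed "stable" prefix vs. remaining stream
    Y_batch, nn = fit_batch_manifold(X_batch)
    Y_stream = np.array([map_stream_sample(x, Y_batch, nn) for x in X_stream])
    print("batch embedding:", Y_batch.shape, "streamed embedding:", Y_stream.shape)
```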

    Optimal lower bounds for universal relation, and for samplers and finding duplicates in streams

    In the communication problem $\mathbf{UR}$ (universal relation) [KRW95], Alice and Bob respectively receive $x, y \in \{0,1\}^n$ with the promise that $x \neq y$. The last player to receive a message must output an index $i$ such that $x_i \neq y_i$. We prove that the randomized one-way communication complexity of this problem in the public coin model is exactly $\Theta(\min\{n, \log(1/\delta)\log^2(\frac{n}{\log(1/\delta)})\})$ for failure probability $\delta$. Our lower bound holds even if promised $\mathop{support}(y) \subset \mathop{support}(x)$. As a corollary, we obtain optimal lower bounds for $\ell_p$-sampling in strict turnstile streams for $0 \le p < 2$, as well as for the problem of finding duplicates in a stream. Our lower bounds do not need to use large weights, and hold even if promised $x \in \{0,1\}^n$ at all points in the stream. We give two different proofs of our main result. The first proof demonstrates that any algorithm $\mathcal{A}$ solving sampling problems in turnstile streams in low memory can be used to encode subsets of $[n]$ of certain sizes into a number of bits below the information-theoretic minimum. Our encoder makes adaptive queries to $\mathcal{A}$ throughout its execution, but done carefully so as to not violate correctness. This is accomplished by injecting random noise into the encoder's interactions with $\mathcal{A}$, which is loosely motivated by techniques in differential privacy. Our second proof is via a novel randomized reduction from Augmented Indexing [MNSW98], which needs to interact with $\mathcal{A}$ adaptively. To handle the adaptivity, we identify certain likely interaction patterns and union bound over them to guarantee correct interaction on all of them. To guarantee correctness, it is important that the interaction hides some of its randomness from $\mathcal{A}$ in the reduction. Comment: merge of arXiv:1703.08139 and of work of Kapralov, Woodruff, and Yahyazadeh
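
    For context, a textbook-style $\ell_0$-sampler sketch for strict turnstile streams, the kind of sampling primitive this lower bound applies to; it is not taken from the paper, and the L0Sampler class, its idealized per-level hashing, and the fingerprint-based 1-sparse test are standard-technique assumptions used purely for illustration.

```python
# Hedged, textbook-style L0-sampler sketch for strict turnstile streams
# (NOT from the paper). Each level subsamples coordinates with probability
# 2^-j; a level whose surviving support is exactly 1-sparse reveals one
# index of the support. Hashing is idealized via seeded SHA-256 for clarity.
import hashlib, random

class L0Sampler:
    def __init__(self, n, seed=0):
        self.n = n
        self.levels = n.bit_length() + 1
        self.seed = seed
        # per level: sum of updates, index-weighted sum, fingerprint
        self.S = [0] * self.levels
        self.W = [0] * self.levels
        self.F = [0] * self.levels
        self.r = random.Random(seed).randrange(1 << 61, 1 << 62)  # fingerprint base
        self.p = (1 << 61) - 1                                    # Mersenne prime modulus

    def _keep(self, i, level):
        h = hashlib.sha256(f"{self.seed}:{level}:{i}".encode()).digest()
        return int.from_bytes(h[:8], "big") % (1 << level) == 0   # prob. 2^-level

    def update(self, i, delta):
        """Stream update x_i += delta (strict turnstile: coordinates stay >= 0)."""
        for lvl in range(self.levels):
            if self._keep(i, lvl):
                self.S[lvl] += delta
                self.W[lvl] += i * delta
                self.F[lvl] = (self.F[lvl] + delta * pow(self.r, i, self.p)) % self.p

    def sample(self):
        """Return an index in the support of x, or None if no level is 1-sparse."""
        for lvl in range(self.levels):
            s, w, f = self.S[lvl], self.W[lvl], self.F[lvl]
            if s != 0 and w % s == 0:
                i = w // s
                if 0 <= i < self.n and f == (s * pow(self.r, i, self.p)) % self.p:
                    return i       # this level survived with a single coordinate
        return None

if __name__ == "__main__":
    # Real samplers repeat the structure O(log(1/delta)) times to boost success;
    # here we simply retry a few independent seeds until some level is 1-sparse.
    for seed in range(5):
        sampler = L0Sampler(n=1 << 16, seed=seed)
        for i in (3, 70, 4095, 9999):
            sampler.update(i, +2)
        sampler.update(70, -2)                 # coordinate 70 returns to zero
        out = sampler.sample()
        if out is not None:
            print("sampled index from support {3, 4095, 9999}:", out)
            break
```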

    A Study of Actor and Action Semantic Retention in Video Supervoxel Segmentation

    Existing methods in the semantic computer vision community seem unable to deal with the explosion and richness of modern, open-source and social video content. Although sophisticated methods such as object detection or bag-of-words models have been well studied, they typically operate on low-level features and ultimately suffer from either scalability issues or a lack of semantic meaning. On the other hand, video supervoxel segmentation has recently been established and applied to large-scale data processing, and potentially serves as an intermediate representation for high-level video semantic extraction. The supervoxels are rich decompositions of the video content: they capture object shape and motion well. However, it is not yet known whether the supervoxel segmentation retains the semantics of the underlying video content. In this paper, we conduct a systematic study of how well the actor and action semantics are retained in video supervoxel segmentation. Our study has human observers watching supervoxel segmentation videos and trying to discriminate both actor (human or animal) and action (one of eight everyday actions). We gather and analyze a large set of 640 human perceptions over 96 videos at 3 different supervoxel scales. Furthermore, we conduct machine recognition experiments on a feature defined on the supervoxel segmentation, called supervoxel shape context, which is inspired by the higher-order processes in human perception. Our ultimate findings suggest that a significant amount of semantics has been well retained in the video supervoxel segmentation and can be used for further video analysis. Comment: This article is in review at the International Journal of Semantic Computing
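
    A generic sketch of the log-polar shape-context idea that the paper's supervoxel shape context builds on; the paper's feature is defined on video supervoxel segments, so the shape_context function, its 2D toy input, and its bin parameters here are simplified, illustrative assumptions only.

```python
# Hedged sketch of a classical 2D log-polar shape-context histogram: count
# boundary points around a center in log-radius x angle bins and normalize.
# This only illustrates the binning idea, not the paper's supervoxel feature.
import numpy as np

def shape_context(points, center, n_radial=5, n_angular=12, r_min=0.05, r_max=2.0):
    """Histogram of boundary points around `center` in log-radius x angle bins."""
    d = points - center
    r = np.linalg.norm(d, axis=1)
    theta = np.arctan2(d[:, 1], d[:, 0]) % (2 * np.pi)
    mask = r > 1e-9                               # drop the center point itself
    r, theta = r[mask], theta[mask]
    r_edges = np.logspace(np.log10(r_min), np.log10(r_max), n_radial + 1)
    a_edges = np.linspace(0, 2 * np.pi, n_angular + 1)
    hist, _, _ = np.histogram2d(r, theta, bins=[r_edges, a_edges])
    return hist.ravel() / max(hist.sum(), 1)      # normalized descriptor

if __name__ == "__main__":
    # Boundary of a unit circle as a toy "segment boundary".
    t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
    boundary = np.stack([np.cos(t), np.sin(t)], axis=1)
    desc = shape_context(boundary, center=np.array([0.0, 0.0]))
    print("descriptor length:", desc.shape[0], "sum:", round(desc.sum(), 3))
```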