27 research outputs found

    Data-Oblivious Graph Algorithms in Outsourced External Memory

    Full text link
    Motivated by privacy preservation for outsourced data, data-oblivious external memory is a computational framework where a client performs computations on data stored at a semi-trusted server in a way that does not reveal her data to the server. This approach facilitates collaboration and reliability over traditional frameworks, and it provides privacy protection, even though the server has full access to the data and he can monitor how it is accessed by the client. The challenge is that even if data is encrypted, the server can learn information based on the client data access pattern; hence, access patterns must also be obfuscated. We investigate privacy-preserving algorithms for outsourced external memory that are based on the use of data-oblivious algorithms, that is, algorithms where each possible sequence of data accesses is independent of the data values. We give new efficient data-oblivious algorithms in the outsourced external memory model for a number of fundamental graph problems. Our results include new data-oblivious external-memory methods for constructing minimum spanning trees, performing various traversals on rooted trees, answering least common ancestor queries on trees, computing biconnected components, and forming open ear decompositions. None of our algorithms make use of constant-time random oracles.Comment: 20 page

    Worst-Case Optimal Algorithms for Parallel Query Processing

    Get PDF
    In this paper, we study the communication complexity for the problem of computing a conjunctive query on a large database in a parallel setting with pp servers. In contrast to previous work, where upper and lower bounds on the communication were specified for particular structures of data (either data without skew, or data with specific types of skew), in this work we focus on worst-case analysis of the communication cost. The goal is to find worst-case optimal parallel algorithms, similar to the work of [18] for sequential algorithms. We first show that for a single round we can obtain an optimal worst-case algorithm. The optimal load for a conjunctive query qq when all relations have size equal to MM is O(M/p1/ψ∗)O(M/p^{1/\psi^*}), where ψ∗\psi^* is a new query-related quantity called the edge quasi-packing number, which is different from both the edge packing number and edge cover number of the query hypergraph. For multiple rounds, we present algorithms that are optimal for several classes of queries. Finally, we show a surprising connection to the external memory model, which allows us to translate parallel algorithms to external memory algorithms. This technique allows us to recover (within a polylogarithmic factor) several recent results on the I/O complexity for computing join queries, and also obtain optimal algorithms for other classes of queries

    On the Computational Complexity of MapReduce

    Full text link
    In this paper we study MapReduce computations from a complexity-theoretic perspective. First, we formulate a uniform version of the MRC model of Karloff et al. (2010). We then show that the class of regular languages, and moreover all of sublogarithmic space, lies in constant round MRC. This result also applies to the MPC model of Andoni et al. (2014). In addition, we prove that, conditioned on a variant of the Exponential Time Hypothesis, there are strict hierarchies within MRC so that increasing the number of rounds or the amount of time per processor increases the power of MRC. To the best of our knowledge we are the first to approach the MapReduce model with complexity-theoretic techniques, and our work lays the foundation for further analysis relating MapReduce to established complexity classes

    Communication Steps for Parallel Query Processing

    Full text link
    We consider the problem of computing a relational query qq on a large input database of size nn, using a large number pp of servers. The computation is performed in rounds, and each server can receive only O(n/p1−ε)O(n/p^{1-\varepsilon}) bits of data, where ε∈[0,1]\varepsilon \in [0,1] is a parameter that controls replication. We examine how many global communication steps are needed to compute qq. We establish both lower and upper bounds, in two settings. For a single round of communication, we give lower bounds in the strongest possible model, where arbitrary bits may be exchanged; we show that any algorithm requires ε≥1−1/τ∗\varepsilon \geq 1-1/\tau^*, where τ∗\tau^* is the fractional vertex cover of the hypergraph of qq. We also give an algorithm that matches the lower bound for a specific class of databases. For multiple rounds of communication, we present lower bounds in a model where routing decisions for a tuple are tuple-based. We show that for the class of tree-like queries there exists a tradeoff between the number of rounds and the space exponent ε\varepsilon. The lower bounds for multiple rounds are the first of their kind. Our results also imply that transitive closure cannot be computed in O(1) rounds of communication
    corecore