
    Fast evaluation of union-intersection expressions

    We show how to represent sets in a linear space data structure such that expressions involving unions and intersections of sets can be computed in a worst-case efficient way. This problem has applications in e.g. information retrieval and database systems. We mainly consider the RAM model of computation, and sets of machine words, but also state our results in the I/O model. On a RAM with word size $w$, a special case of our result is that the intersection of $m$ (preprocessed) sets, containing $n$ elements in total, can be computed in expected time $O(n (\log w)^2 / w + km)$, where $k$ is the number of elements in the intersection. If the first of the two terms dominates, this is a factor $w^{1-o(1)}$ faster than the standard solution of merging sorted lists. We show a cell probe lower bound of time $\Omega(n/(w m \log m) + (1-\tfrac{\log k}{w}) k)$, meaning that our upper bound is nearly optimal for small $m$. Our algorithm uses a novel combination of approximate set representations and word-level parallelism.
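    To make the word-level-parallelism idea concrete, here is a minimal, illustrative sketch (not the paper's data structure): if each preprocessed set is stored as a bitmap over a small hashed universe, intersecting $m$ sets reduces to AND-ing machine words, so each word operation checks many candidate positions at once. The universe size, function names, and the verification caveat are assumptions for illustration only.

# Illustrative sketch of word-level parallel set intersection via bitmaps.
# Not the paper's approximate representation; universe size is an assumption.

UNIVERSE_BITS = 1 << 16  # assumed hashed-universe size

def preprocess(elements):
    """Pack a set of machine-word elements into one bitmap (a Python int)."""
    bitmap = 0
    for x in elements:
        bitmap |= 1 << (hash(x) % UNIVERSE_BITS)
    return bitmap

def intersect_bitmaps(bitmaps):
    """AND the bitmaps; each word-sized AND tests many candidate positions at once."""
    result = ~0
    for b in bitmaps:
        result &= b
    return result & ((1 << UNIVERSE_BITS) - 1)

a = preprocess({3, 5, 7, 11})
b = preprocess({5, 11, 13})
both = intersect_bitmaps([a, b])
print(bin(both).count("1"))  # number of candidate positions, here 2

    Candidate positions would still have to be verified against the original sets, since hashing into a small universe can introduce false positives.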

    Flexible constrained sampling with guarantees for pattern mining

    Pattern sampling has been proposed as a potential solution to the infamous pattern explosion. Instead of enumerating all patterns that satisfy the constraints, individual patterns are sampled proportional to a given quality measure. Several sampling algorithms have been proposed, but each of them has its limitations when it comes to 1) flexibility in terms of quality measures and constraints that can be used, and/or 2) guarantees with respect to sampling accuracy. We therefore present Flexics, the first flexible pattern sampler that supports a broad class of quality measures and constraints, while providing strong guarantees regarding sampling accuracy. To achieve this, we leverage the perspective on pattern mining as a constraint satisfaction problem and build upon the latest advances in sampling solutions in SAT as well as existing pattern mining algorithms. Furthermore, the proposed algorithm is applicable to a variety of pattern languages, which allows us to introduce and tackle the novel task of sampling sets of patterns. We introduce and empirically evaluate two variants of Flexics: 1) a generic variant that addresses the well-known itemset sampling task and the novel pattern set sampling task as well as a wide range of expressive constraints within these tasks, and 2) a specialized variant that exploits existing frequent itemset techniques to achieve substantial speed-ups. Experiments show that Flexics is both accurate and efficient, making it a useful tool for pattern-based data exploration. (Accepted for publication in the Data Mining & Knowledge Discovery journal, ECML/PKDD 2017 journal track.)
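    The sampling task itself can be made concrete with a toy sketch. The brute-force enumerator below draws itemsets with probability proportional to a quality measure (support), which is the semantics Flexics guarantees; Flexics itself avoids enumeration by sampling over a constraint-satisfaction encoding with hashing-based techniques. The toy data and names are illustrative assumptions.

# Toy sketch of the pattern-sampling target: itemsets drawn proportionally to
# support. Brute-force enumeration is only to make the semantics concrete.
import itertools
import random

transactions = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"a", "c"}]
items = sorted(set().union(*transactions))

def support(itemset):
    """Quality measure: number of transactions containing the itemset."""
    return sum(1 for t in transactions if itemset <= t)

# Enumerate non-empty itemsets satisfying a constraint (here: support >= 1).
patterns = [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in itertools.combinations(items, r)
            if support(frozenset(c)) >= 1]

weights = [support(p) for p in patterns]
print(random.choices(patterns, weights=weights, k=5))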

    Tight bounds for parallel randomized load balancing

    Given a distributed system of $n$ balls and $n$ bins, how evenly can we distribute the balls to the bins while minimizing communication? The fastest non-adaptive and symmetric algorithm achieving a constant maximum bin load requires $\Theta(\log\log n)$ rounds, and any such algorithm running for $r \in O(1)$ rounds incurs a bin load of $\Omega((\log n/\log\log n)^{1/r})$. In this work, we explore the fundamental limits of the general problem. We present a simple adaptive symmetric algorithm that achieves a bin load of 2 in $\log^* n + O(1)$ communication rounds using $O(n)$ messages in total. Our main result, however, is a matching lower bound of $(1-o(1))\log^* n$ on the time complexity of symmetric algorithms that guarantee small bin loads. The essential preconditions of the proof are (i) a limit of $O(n)$ on the total number of messages sent by the algorithm and (ii) anonymity of bins, i.e., the port numberings of balls need not be globally consistent. In order to show that our technique indeed yields tight bounds, we provide for each assumption an algorithm violating it, in turn achieving a constant maximum bin load in constant time. Funding: German Research Foundation (DFG, reference number Le 3107/1-1); Society of Swiss Friends of the Weizmann Institute of Science; Swiss National Fun
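    The round structure of such algorithms can be illustrated with a toy simulation (not the paper's adaptive algorithm, whose messaging is more careful): in each round every unplaced ball contacts one uniformly random bin, and a bin accepts contacting balls only while its load stays below a cap of 2. Parameters and names below are assumptions for illustration.

# Toy round-based simulation of parallel randomized load balancing with a
# per-bin load cap of 2. Illustrative only; not the paper's algorithm.
import random

def simulate(n, cap=2, max_rounds=50):
    load = [0] * n
    unplaced = list(range(n))
    rounds = 0
    while unplaced and rounds < max_rounds:
        rounds += 1
        requests = {}
        for ball in unplaced:
            requests.setdefault(random.randrange(n), []).append(ball)
        placed = set()
        for bin_id, balls in requests.items():
            accepted = balls[: max(0, cap - load[bin_id])]
            load[bin_id] += len(accepted)
            placed.update(accepted)
        unplaced = [b for b in unplaced if b not in placed]
    return rounds, max(load)

print(simulate(100000))  # (rounds used, maximum bin load)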

    A Comprehensive Protocol Suite for Secure Two-Party Computation

    Secure multi-party computation allows a number of distrusting parties to collaborate in extracting new knowledge from their joint private data, without any party learning the other participants' secrets in the process. The efficient and mature Sharemind secure computation platform has relied on a three-party suite of protocols based on secret sharing for supporting large real-world applications. However, in some scenarios, a two-party model is a better fit when no natural third party is involved in the application. In this work, we design and implement a full protocol suite for two-party computations on Sharemind, providing an alternative and viable solution in such cases. We aim foremost for efficiency that is on par with the existing three-party protocols. To this end, we introduce more efficient techniques for the precomputation of Beaver triples using oblivious transfer extension, as the two-party protocols for arithmetic fundamentally rely on efficient triple generation. We reduce communication costs compared to existing methods by using 1-out-of-N oblivious transfer extension in a novel way, and provide insights into engineering challenges for efficiently implementing these methods. Furthermore, we show security of our constructions using strictly weaker assumptions than have been previously required by avoiding the random oracle model. We describe and implement a large number of integer operations and data conversion protocols that are competitive with the existing three-party protocols, providing an overall solid foundation for two-party computations on Sharemind.
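    As a rough illustration of why Beaver triples matter here, the sketch below shows the standard online phase of Beaver-triple multiplication on additive shares, with a trusted dealer standing in for the OT-extension precomputation described in the abstract. The 64-bit ring and variable names are illustrative assumptions, not Sharemind code.

# Sketch of Beaver-triple multiplication on additive shares in Z_{2^64}.
# A dealer simulates the precomputation that OT extension provides in practice.
import random

MOD = 2 ** 64  # arithmetic in the ring Z_{2^64} (assumed for illustration)

def share(v):
    """Split v into two additive shares."""
    r = random.randrange(MOD)
    return r, (v - r) % MOD

# Dealer: secret inputs and one Beaver triple (a, b, c) with c = a*b.
x, y = 1234, 5678
a, b = random.randrange(MOD), random.randrange(MOD)
c = (a * b) % MOD
x1, x2 = share(x); y1, y2 = share(y)
a1, a2 = share(a); b1, b2 = share(b); c1, c2 = share(c)

# Online phase: both parties open d = x - a and e = y - b.
d = (x1 - a1 + x2 - a2) % MOD
e = (y1 - b1 + y2 - b2) % MOD

# Each party locally computes its share of x*y; only party 1 adds d*e.
z1 = (c1 + d * b1 + e * a1 + d * e) % MOD
z2 = (c2 + d * b2 + e * a2) % MOD
assert (z1 + z2) % MOD == (x * y) % MOD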

    Faster Oblivious Transfer Extension and Its Impact on Secure Computation

    Secure two-party computation allows two parties to evaluate a function on their private inputs while keeping all information private except what can be inferred from the outputs. A major building block and the foundation for many efficient secure computation protocols is oblivious transfer (OT). In an OT protocol a sender inputs two messages $(x_0, x_1)$ while a receiver with choice bit $c$ wants to receive message $x_c$. The OT protocol execution guarantees that the sender learns no information about $c$ and the receiver learns no information about $x_{1-c}$. This thesis focuses on the efficient generation of OTs and their use in secure computation. In particular, we show how to compute OTs more efficiently, improve generic secure computation protocols which can be used to securely evaluate any functionality, and develop highly efficient special-purpose protocols for private set intersection (PSI). We outline our contributions in more detail next.

    More Efficient OT Extensions. The most efficient OT protocols are based on public-key cryptography and require a constant number of exponentiations per OT. However, for many practical applications where millions to billions of OTs need to be computed, these exponentiations become prohibitively slow. To enable these applications, OT extension protocols [Bea96, IKNP03] can be used, which extend a small number of public-key-based OTs to an arbitrarily large number using cheap symmetric-key cryptography only. We improve the computation and communication efficiency of OT extension protocols and show how to achieve security against malicious adversaries, which can arbitrarily deviate from the protocol, at low overhead. Our resulting protocols can compute several million OTs per second, and we show that, in contrast to previous belief, the local computation overhead for computing OTs is so low that the main bottleneck is the network bandwidth. Parts of these results are published in:
    • G. Asharov, Y. Lindell, T. Schneider, M. Zohner. More Efficient Oblivious Transfer and Extensions for Faster Secure Computation. In 20th ACM Conference on Computer and Communications Security (CCS’13).
    • G. Asharov, Y. Lindell, T. Schneider, M. Zohner. More Efficient Oblivious Transfer Extensions with Security for Malicious Adversaries. In 34th Advances in Cryptology – EUROCRYPT’15.
    • G. Asharov, Y. Lindell, T. Schneider, M. Zohner. More Efficient Oblivious Transfer Extensions. To appear in Journal of Cryptology. Online at http://eprint.iacr.org/2016/602.

    Communication-Efficient Generic Secure Two-Party Computation. Generic secure two-party computation techniques allow the evaluation of a function represented as a circuit of linear (XOR) and non-linear (AND) gates. One of the most prominent generic secure two-party computation protocols is Yao’s garbled circuits [Yao86], for which many optimizations have been proposed. Shortly after Yao’s protocol, the generic secure protocol by Goldreich-Micali-Wigderson (GMW) [GMW87] was introduced. The GMW protocol requires a large number of OTs and was believed to be less efficient for secure two-party computation than Yao’s protocol [HL10, CHK+12]. We improve the efficiency of the GMW protocol and show that it can outperform Yao’s garbled circuits protocol in settings with low bandwidth. Furthermore, we utilize the flexibility of OT and outline special-purpose constructions that can be used within the GMW protocol and which improve its efficiency even further. Parts of these results are published in:
    • T. Schneider, M. Zohner. GMW vs. Yao? Efficient Secure Two-Party Computation with Low Depth Circuits. In 17th International Conference on Financial Cryptography and Data Security (FC’13).
    • D. Demmler, T. Schneider, M. Zohner. ABY - A Framework for Efficient Mixed-Protocol Secure Two-Party Computation. In 22nd Network and Distributed System Security Symposium (NDSS’15).
    • G. Dessouky, F. Koushanfar, A.-R. Sadeghi, T. Schneider, S. Zeitouni, M. Zohner. Pushing the Communication Barrier in Secure Computation using Lookup Tables. In 24th Network and Distributed System Security Symposium (NDSS’17).

    Faster Private Set Intersection (PSI). PSI allows two parties to compute the intersection of their private sets without revealing any element that is not in the intersection. PSI is a well-studied problem in secure computation and many special-purpose protocols have been proposed. However, existing PSI protocols are several orders of magnitude slower than an insecure naive hashing solution that is used in practice. In addition, many protocols were compared in a biased fashion, which makes it difficult to identify the most promising solution for a particular scenario. We systematize the progress made on PSI protocols by reviewing, optimizing, and comparing existing PSI protocols. We then introduce a novel PSI protocol that is based on our efficiency improvements in OT extension protocols and which outperforms existing protocols by up to two orders of magnitude. Parts of these results are published in:
    • B. Pinkas, T. Schneider, M. Zohner. Faster Private Set Intersection Based on OT Extension. In 23rd USENIX Security Symposium (USENIX Security’14).
    • B. Pinkas, T. Schneider, G. Segev, M. Zohner. Phasing: Private Set Intersection using Permutation-based Hashing. In 24th USENIX Security Symposium (USENIX Security’15).
    • B. Pinkas, T. Schneider, M. Zohner. Scalable Private Set Intersection Based on OT Extension. Journal paper. In submission. Online at http://iacr.eprint.org/2016/930.
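    For context, the insecure naive hashing baseline mentioned above can be sketched in a few lines: each party hashes its elements and only the hashes are compared. It leaks information whenever elements come from a small or guessable domain, which is precisely the gap the OT-based PSI protocols close. Function and variable names are illustrative.

# Sketch of the insecure naive-hashing PSI baseline (for comparison only).
import hashlib

def hashes(elements):
    """Map each element's SHA-256 digest back to the element."""
    return {hashlib.sha256(e.encode()).hexdigest(): e for e in elements}

def naive_psi(set_a, set_b):
    ha, hb = hashes(set_a), hashes(set_b)
    return {ha[h] for h in ha.keys() & hb.keys()}

print(naive_psi({"alice", "bob", "carol"}, {"bob", "dave"}))  # {'bob'}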

    Distributed pre-computation for a cryptanalytic time-memory trade-off

    Cryptanalytic tables often play a critical role in decryption efforts for ciphers where the key is not known. Using a cryptanalytic table allows a time-memory trade-off attack in which disk space or physical memory is traded for a shorter decryption time. For any N-key cryptosystem, potential keys are generated and stored in a lookup table, thus reducing the time it takes to perform cryptanalysis of future keys and the space required to store them. The success rate of these lookup tables varies with the size of the key space, but can be calculated based on the number of keys and the length of the chains used within the table. The up-front cost of generating the tables is typically ignored when calculating cryptanalysis time, as the work is assumed to have already been performed. As computers move from 32 bit to 64 bit architectures and as key lengths increase, the time it takes to pre-compute these tables rises exponentially. In some cases, the pre-computation time can no longer be ignored because it becomes infeasible to pre-compute the tables due to the sheer size of the key space. This thesis focuses on parallel techniques for generating pre-computed cryptanalytic tables in a heterogeneous environment and presents a working parallel application that makes use of the Message Passing Interface (MPI). The parallel implementation is designed to divide the workload for pre-computing a single table across multiple heterogeneous nodes with minimal overhead incurred from message passing. The result is an increase in pre-computational speed that is close to that which can be achieved by adding the computational ability of all processors together.
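    The pre-computation step being parallelized can be sketched as follows: each chain starts from a key, repeatedly applies a one-way step followed by a position-dependent reduction back into the key space, and only the (start point, end point) pair is stored. Partitioning chain indices by rank and size mirrors how an MPI implementation splits work across nodes; the toy hash, chain length, and key-space size are assumptions for illustration, not the thesis code.

# Toy sketch of time-memory trade-off table pre-computation with a rank/size
# split standing in for MPI work distribution. Parameters are illustrative.
import hashlib

KEY_SPACE = 2 ** 24        # toy key space
CHAIN_LENGTH = 1000        # steps per chain

def step(key, position):
    """One-way step plus a position-dependent reduction back into the key space."""
    digest = hashlib.sha256(key.to_bytes(4, "big")).digest()
    return (int.from_bytes(digest[:4], "big") + position) % KEY_SPACE

def build_chains(num_chains, rank, size):
    """Build this node's share of the chains; rank/size mimic MPI rank and communicator size."""
    table = {}
    for start in range(rank, num_chains, size):
        key = start % KEY_SPACE
        for pos in range(CHAIN_LENGTH):
            key = step(key, pos)
        table[key] = start          # store only endpoint -> start point
    return table

print(len(build_chains(num_chains=200, rank=0, size=4)))  # roughly 50 distinct endpoints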