119 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume


    Dynamic Dictionary with Subconstant Wasted Bits per Key

    Dictionaries have been one of the central questions in data structures. A dictionary data structure maintains a set of key-value pairs under insertions and deletions such that given a query key, the data structure efficiently returns its value. The state-of-the-art dictionaries [Bender, Farach-Colton, Kuszmaul, Kuszmaul, Liu 2022] store $n$ key-value pairs with only $O(n \log^{(k)} n)$ bits of redundancy, and support all operations in $O(k)$ time, for $k \leq \log^* n$. It was recently shown to be optimal [Li, Liang, Yu, Zhou 2023b]. In this paper, we study the regime where the number of redundant bits is $R=o(n)$, and show that when $R$ is at least $n/\text{poly}\log n$, all operations can be supported in $O(\log^* n + \log (n/R))$ time, matching the lower bound in this regime [Li, Liang, Yu, Zhou 2023b]. We present two data structures based on which range $R$ is in. The data structure for $R<n/\log^{0.1} n$ utilizes a generalization of adapters studied in [Berger, Kuszmaul, Polak, Tidor, Wein 2022] and [Li, Liang, Yu, Zhou 2023a]. The data structure for $R \geq n/\log^{0.1} n$ is based on recursively hashing into buckets with logarithmic sizes. Comment: 46 pages; SODA 202
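
    As a quick sanity check of the stated time bound (a worked instance added for illustration, not taken from the paper), assume $R = n/\log^{c} n$ for some constant $c > 0$. Then

\[
\log(n/R) = \log\left(\log^{c} n\right) = c \log\log n,
\qquad
O(\log^* n + \log(n/R)) = O(\log^* n + \log\log n) = O(\log\log n),
\]

    where the last step uses $\log^* n = O(\log\log n)$. Under this assumption, both ranges of $R$ named in the abstract sit on either side of the case $c = 0.1$.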

    A Survey on Malware Detection with Graph Representation Learning

    Malware detection has become a major concern due to the increasing number and complexity of malware. Traditional detection methods based on signatures and heuristics are used for malware detection, but unfortunately, they suffer from poor generalization to unknown attacks and can be easily circumvented using obfuscation techniques. In recent years, Machine Learning (ML) and notably Deep Learning (DL) achieved impressive results in malware detection by learning useful representations from data and have become a solution preferred over traditional methods. More recently, the application of such techniques on graph-structured data has achieved state-of-the-art performance in various domains and demonstrates promising results in learning more robust representations from malware. Yet, no literature review focusing on graph-based deep learning for malware detection exists. In this survey, we provide an in-depth literature review to summarize and unify existing works under the common approaches and architectures. We notably demonstrate that Graph Neural Networks (GNNs) reach competitive results in learning robust embeddings from malware represented as expressive graph structures, leading to an efficient detection by downstream classifiers. This paper also reviews adversarial attacks that are utilized to fool graph-based detection methods. Challenges and future research directions are discussed at the end of the paper. Comment: Preprint, submitted to ACM Computing Surveys in March 2023. For any suggestions or improvements, please contact me directly by e-mai

    Unbalanced Private Set Intersection from Homomorphic Encryption and Nested Cuckoo Hashing

    Private Set Intersection (PSI) is a well-studied secure two-party computation problem in which a client and a server want to compute the intersection of their input sets without revealing additional information to the other party. With this work, we present nested Cuckoo hashing, a novel hashing approach that can be combined with additively homomorphic encryption (AHE) to construct an efficient PSI protocol for unbalanced input sets. We formally prove the security of our protocol against semi-honest adversaries in the standard model. Our protocol yields client computation and communication complexity that is sublinear in the server’s set size and is thus of interest to clients with limited resources. The implementation and empirical evaluation of our protocol using the exponential ElGamal and BGV/BFV encryption schemes attest to state-of-the-art practical performance.

    Persistent Memory File Systems: A Survey

    Persistent Memory (PM) is non-volatile byte-addressable memory that offers read and write latencies an order of magnitude lower than those of flash storage, such as SSDs. This survey discusses how file systems address the most prominent challenges in the implementation of file systems for Persistent Memory. First, we discuss how the properties of Persistent Memory change file system design. Second, we discuss work that aims to optimize small file I/O and the associated meta-data resolution. Third, we address how existing Persistent Memory file systems achieve (meta) data persistence and consistency.

    Applications

    Volume 3 describes how resource-aware machine learning methods and techniques are used to successfully solve real-world problems. The book provides numerous specific application examples: in health and medicine, for risk modelling, diagnosis, and treatment selection for diseases; in electronics, steel production and milling, for quality control during manufacturing processes; in traffic and logistics, for smart cities; and for mobile communications.

    LIPIcs, Volume 274, ESA 2023, Complete Volume


    Panacea: Non-interactive and Stateless Oblivious RAM

    Oblivious RAM (ORAM) allows a client to outsource storage to a remote server while hiding the data access pattern from the server. Many ORAM designs have been proposed to reduce the computational overhead and bandwidth blowup for the client. A recent work, Onion Ring ORAM (CCS'19), is able to achieve $O(1)$ bandwidth blowup in the online phase using fully homomorphic encryption (FHE) techniques, at the cost of a computationally expensive client-side offline phase. Furthermore, such a scheme can be categorized as a stateful construction, meaning that the client has to locally maintain a dynamic state representing the order of remote database elements. We present Panacea: a novel design of ORAM based on FHE techniques, that is non-interactive and stateless, achieves $O(1)$ bandwidth blowup, and does not require an expensive offline phase for the client to perform; in that sense, our design is the first of its kind among other ORAM designs. To provide the client with such performance benefits, our design delegates all expensive computation to the resourceful server. We additionally show how to boost the server performance significantly using probabilistic batch codes at the cost of only 1.5x in additional bandwidth blowup and 3x expansion in server storage, but less amortized bandwidth. Our experimental results show that our design, with the batching technique, is practical in terms of server computation overhead as well. Specifically, for a database size of $2^{19}$, it takes only $1.16$ seconds of amortized computation time for a server to respond to a query. As a result of the statelessness and low computational overhead on the client, and reasonable computational overhead on the server, our design is very suitable to be deployed as a cloud-based privacy-preserving storage outsourcing solution with a portable client running on a lightweight device.

    Cuckoo Hashing in Cryptography: Optimal Parameters, Robustness and Applications

    Cuckoo hashing is a powerful primitive that enables storing items using small space with efficient querying. At a high level, cuckoo hashing maps $n$ items into $b$ entries storing at most $\ell$ items such that each item is placed into one of $k$ randomly chosen entries. Additionally, there is an overflow stash that can store at most $s$ items. Many cryptographic primitives rely upon cuckoo hashing to privately embed and query data where it is integral to ensure small failure probability when constructing cuckoo hashing tables as it directly relates to the privacy guarantees. As our main result, we present a more query-efficient cuckoo hashing construction using more hash functions. For construction failure probability $\epsilon$, the query overhead of our scheme is $O(1 + \sqrt{\log(1/\epsilon)/\log n})$. Our scheme has quadratically smaller query overhead than prior works for any target failure probability $\epsilon$. We also prove lower bounds matching our construction. Our improvements come from a new understanding of the locality of cuckoo hashing failures for small sets of items. We also initiate the study of robust cuckoo hashing where the input set may be chosen with knowledge of the hash functions. We present a cuckoo hashing scheme using more hash functions with query overhead $\tilde{O}(\log \lambda)$ that is robust against poly$(\lambda)$ adversaries. Furthermore, we present lower bounds showing that this construction is tight and that extending previous approaches of large stashes or entries cannot obtain robustness except with $\Omega(n)$ query overhead. As applications of our results, we obtain improved constructions for batch codes and PIR. In particular, we present the most efficient explicit batch code and blackbox reduction from single-query PIR to batch PIR.
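
    The generic layout described in this abstract ($b$ entries of capacity $\ell$, $k$ candidate entries per item, and a stash of at most $s$ items) can be illustrated with a minimal Python sketch. This is a textbook-style random-walk cuckoo table, not the construction from the paper; the class name `CuckooTable`, the salted BLAKE2b hashing, and the `max_kicks` eviction limit are all assumptions made for the example.

```python
import hashlib
import random


class CuckooTable:
    """Toy cuckoo table: b entries of capacity ell, k choices per item, stash of size s."""

    def __init__(self, b, ell=1, k=2, s=4, max_kicks=100, seed=0):
        self.b, self.ell, self.k, self.s = b, ell, k, s
        self.max_kicks = max_kicks          # hypothetical cap on evictions per insert
        self.entries = [[] for _ in range(b)]
        self.stash = []
        self.rng = random.Random(seed)

    def _choices(self, item):
        # Derive k candidate entry indices from salted hashes (indices may coincide).
        out = []
        for i in range(self.k):
            digest = hashlib.blake2b(
                repr(item).encode(), salt=i.to_bytes(8, "little")
            ).digest()
            out.append(int.from_bytes(digest[:8], "little") % self.b)
        return out

    def contains(self, item):
        # A query probes the k candidate entries and then the stash.
        in_entries = any(item in self.entries[i] for i in self._choices(item))
        return in_entries or item in self.stash

    def insert(self, item):
        """Return True on success, False on construction failure.

        On failure the last evicted item is not stored anywhere, so a real
        implementation would rebuild the table with fresh hash functions.
        """
        if self.contains(item):
            return True
        current = item
        for _ in range(self.max_kicks):
            choices = self._choices(current)
            for idx in choices:
                if len(self.entries[idx]) < self.ell:
                    self.entries[idx].append(current)
                    return True
            # All candidate entries are full: evict a random occupant and retry with it.
            idx = self.rng.choice(choices)
            pos = self.rng.randrange(self.ell)
            current, self.entries[idx][pos] = self.entries[idx][pos], current
        if len(self.stash) < self.s:
            self.stash.append(current)
            return True
        return False

    def query_overhead(self):
        # Number of item slots a single query may have to inspect.
        return self.k * self.ell + self.s
```

    In this sketch the query overhead is simply $k\ell + s$, which makes the trade-off in the abstract concrete: adding hash functions, enlarging entries, or enlarging the stash each lowers the failure probability at the price of more slots inspected per query.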

    Scalable Hash Tables

    The term scalability with regard to this dissertation has two meanings: It means taking the best possible advantage of the provided resources (both computational and memory resources) and it also means scaling data structures in the literal sense, i.e., growing the capacity, by “rescaling” the table. Scaling well to computational resources implies constructing the fastest, best-performing algorithms and data structures. On today’s many-core machines the best performance is immediately associated with parallelism. Since CPU frequencies stopped growing about 10-15 years ago, parallelism is the only way to take advantage of growing computational resources. But for data structures in general, and hash tables in particular, performance is not only linked to faster computations. Most execution time is actually spent waiting for memory. Thus optimizing data structures to reduce the number of memory accesses or to take better advantage of the memory hierarchy, especially through predictable access patterns and prefetching, is just as important. In terms of scaling the size of hash tables we have identified three domains where scaling hash-based data structures has previously been lacking, i.e., space-efficient growing, concurrent hash tables, and Approximate Membership Query data structures (AMQ-filters). Throughout this dissertation, we describe the problems in these areas and develop efficient solutions. We highlight three different libraries that we have developed over the course of this dissertation, each containing multiple implementations that have shown throughout our testing to be among the best implementations in their respective domains. In this composition they offer a comprehensive toolbox that can be used to solve many kinds of hashing-related problems or to develop individual solutions for further ones. DySECT is a library for space-efficient hash tables, specifically growing space-efficient hash tables that scale with their input size. It contains the namesake DySECT data structure in addition to a number of different probing- and cuckoo-based implementations. Growt is a library for highly efficient concurrent hash tables. It contains a very fast base table and a number of extensions to adapt this table to match any purpose. All extensions can be combined to create a variety of different interfaces. In our extensive experimental evaluation, each adaptation has been shown to be among the best hash tables for its specific purpose. Lpqfilter is a library for concurrent approximate membership query (AMQ) data structures. It contains some original data structures, like the linear probing quotient filter, as well as some novel approaches to dynamically sized quotient filters.