271 research outputs found

    Average-Case Complexity

    Full text link
    We survey the average-case complexity of problems in NP. We discuss various notions of good-on-average algorithms, and present completeness results due to Impagliazzo and Levin. Such completeness results establish the fact that if a certain specific (but somewhat artificial) NP problem is easy-on-average with respect to the uniform distribution, then all problems in NP are easy-on-average with respect to all samplable distributions. Applying the theory to natural distributional problems remain an outstanding open question. We review some natural distributional problems whose average-case complexity is of particular interest and that do not yet fit into this theory. A major open question whether the existence of hard-on-average problems in NP can be based on the P≠\neqNP assumption or on related worst-case assumptions. We review negative results showing that certain proof techniques cannot prove such a result. While the relation between worst-case and average-case complexity for general NP problems remains open, there has been progress in understanding the relation between different ``degrees'' of average-case complexity. We discuss some of these ``hardness amplification'' results

    On the quantum versus classical learnability of discrete distributions

    Get PDF

    Machine Learning Models that Remember Too Much

    Full text link
    Machine learning (ML) is becoming a commodity. Numerous ML frameworks and services are available to data holders who are not ML experts but want to train predictive models on their data. It is important that ML models trained on sensitive inputs (e.g., personal images or documents) not leak too much information about the training data. We consider a malicious ML provider who supplies model-training code to the data holder, does not observe the training, but then obtains white- or black-box access to the resulting model. In this setting, we design and implement practical algorithms, some of them very similar to standard ML techniques such as regularization and data augmentation, that "memorize" information about the training dataset in the model yet the model is as accurate and predictive as a conventionally trained model. We then explain how the adversary can extract memorized information from the model. We evaluate our techniques on standard ML tasks for image classification (CIFAR10), face recognition (LFW and FaceScrub), and text analysis (20 Newsgroups and IMDB). In all cases, we show how our algorithms create models that have high predictive power yet allow accurate extraction of subsets of their training data

    MLPerf Inference Benchmark

    Full text link
    Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability.Comment: ISCA 202

    Delta bloom filter compression using stochastic learning-based weak estimation

    Get PDF
    Substantial research has been done, and sill continues, for reducing the bandwidth requirement and for reliable access to the data, stored and transmitted, in a space efficient manner. Bloom filters and their variants have achieved wide spread acceptability in various fields due to their ability to satisfy these requirements. As this need has increased, especially, for the applications which require heavy use of the transmission bandwidth, distributed computing environment for the databases or the proxy servers, and even the applications which are sensitive to the access to the information with frequent modifications, this thesis proposes a solution in the form of compressed delta Bloom filter. This thesis proposes delta Bloom filter compression, using stochastic learning-based weak estimation and prediction with partial matching to achieve the goal of lossless compression with high compression gain for reducing the large data transferred frequently

    Watermarking Cryptographic Capabilities

    Get PDF
    A watermarking scheme for programs embeds some information called a mark into a program while preserving its functionality. No adversary can remove the mark without damaging the functionality of the program. In this work, we study the problem of watermarking various cryptographic programs such as pseudorandom function (PRF) evaluation, decryption, and signing. For example, given a PRF F, we create a marked program C~ that evaluates F(). An adversary that gets C~ cannot come up with any program C* in which the mark is removed but which still evaluates the PRF correctly on even a small fraction of the inputs. The work of Barak, Goldreich, Impagliazzo, Rudich, Sahai, Vadhan, and Yang (CRYPTO\u2701 and Journal of ACM 59(2)) shows that, assuming indistinguishability obfuscation (iO), such watermarking is impossible if the marked program C~ evaluates the original program with perfect correctness. In this work we show that, assuming iO, such watermarking is possible if the marked program C~ is allowed to err with even a negligible probability, which would be undetectable to the user. Our watermarking schemes are public key, meaning that we use a secret marking key to embed marks in programs, and a public detection key that allows anyone to detect marks in programs. Our schemes are secure against chosen program attacks where the adversary is given oracle access to the marking functionality. We emphasize that our security notion of watermark non-removability considers arbitrary adversarial strategies to modify the marked program, in contrast to the prior works (Nishimaki, EUROCRYPT \u2713)

    On Improving Communication Complexity in Cryptography

    Get PDF
    Cryptography grew to be much more than "the study of secret writing". Modern cryptography is concerned with establishing properties such as privacy, integrity and authenticity in protocols for secure communication and computation. This comes at a price: Cryptographic tools usually introduce an overhead, both in terms of communication complexity (that is, number and size of messages transmitted) and computational efficiency (that is, time and memory required). As in many settings communication between the parties involved is the bottleneck, this thesis is concerned with improving communication complexity in cryptographic protocols. One direction towards this goal is scalable cryptography: In many cryptographic schemes currently deployed, the security degrades linearly with the number of instances (e.g. encrypted messages) in the system. As this number can be huge in contexts like cloud computing, the parameters of the scheme have to be chosen considerably larger - and in particular depending on the expected number of instances in the system - to maintain security guarantees. We advance the state-of-the-art regarding scalable cryptography by constructing schemes where the security guarantees are independent of the number of instances. This allows to choose smaller parameters, even when the expected number of instances is immense. - We construct the first scalable encryption scheme with security against active adversaries which has both compact public keys and ciphertexts. In particular, we significantly reduce the size of the public key to only about 3% of the key-size of the previously most efficient scalable encryption scheme. (Gay,Hofheinz, and Kohl, CRYPTO, 2017) - We present a scalable structure-preserving signature scheme which improves both in terms of public-key and signature size compared to the previously best construction to about 40% and 56% of the sizes, respectively. (Gay, Hofheinz, Kohl, and Pan, EUROCRYPT, 2018) Another important area of cryptography is secure multi-party computation, where the goal is to jointly evaluate some function while keeping each party’s input private. In traditional approaches towards secure multi-party computation either the communication complexity scales linearly in the size of the function, or the computational efficiency is poor. To overcome this issue, Boyle, Gilboa, and Ishai (CRYPTO, 2016) introduced the notion of homomorphic secret sharing. Here, inputs are shared between parties such that each party does not learn anything about the input, and such that the parties can locally evaluate functions on the shares. Homomorphic secret sharing implies secure computation where the communication complexity only depends on the size of the inputs, which is typically much smaller than the size of the function. A different approach towards efficient secure computation is to split the protocol into an input-independent preprocessing phase, where long correlated strings are generated, and a very efficient online phase. One example for a useful correlation are authenticated Beaver triples, which allow to perform efficient multiplications in the online phase such that privacy of the inputs is preserved and parties deviating the protocol can be detected. The currently most efficient protocols implementing the preprocessing phase require communication linear in the number of triples to be generated. This results typically in high communication costs, as the online phase requires at least one authenticated Beaver triple per multiplication. We advance the state-of-the art regarding efficient protocols for secure computation with low communication complexity as follows. - We construct the first homomorphic secret sharing scheme for computing arbitrary functions in NC 1 (that is, functions that are computably by circuits with logarithmic depth) which supports message spaces of arbitrary size, has only negligible correctness error, and does not require expensive multiplication on ciphertexts. (Boyle, Kohl, and Scholl, EUROCRYPT, 2019) - We introduce the notion of a pseudorandom correlation generator for general correlations. Pseudorandom correlation generators allow to locally extend short correlated seeds into long pseudorandom correlated strings. We show that pseudorandom correlation generators can replace the preprocessing phase in many protocols, leading to a preprocessing phase with sublinear communication complexity. We show connections to homomorphic secret sharing schemes and give the first instantiation of pseudorandom correlation generators for authenticated Beaver triples at reasonable computational efficiency. (Boyle, Couteau, Gilboa, Ishai, Kohl, and Scholl, CRYPTO, 2019
    • …
    corecore