319 research outputs found

    Max-stable sketches: estimation of Lp-norms, dominance norms and point queries for non-negative signals

    Full text link
    Max-stable random sketches can be computed efficiently on fast streaming positive data sets by using only sequential access to the data. They can be used to answer point and Lp-norm queries for the signal. There is an intriguing connection between the so-called p-stable (or sum-stable) and the max-stable sketches. Rigorous performance guarantees through error-probability estimates are derived and the algorithmic implementation is discussed

    Privacy via the Johnson-Lindenstrauss Transform

    Full text link
    Suppose that party A collects private information about its users, where each user's data is represented as a bit vector. Suppose that party B has a proprietary data mining algorithm that requires estimating the distance between users, such as clustering or nearest neighbors. We ask if it is possible for party A to publish some information about each user so that B can estimate the distance between users without being able to infer any private bit of a user. Our method involves projecting each user's representation into a random, lower-dimensional space via a sparse Johnson-Lindenstrauss transform and then adding Gaussian noise to each entry of the lower-dimensional representation. We show that the method preserves differential privacy---where the more privacy is desired, the larger the variance of the Gaussian noise. Further, we show how to approximate the true distances between users via only the lower-dimensional, perturbed data. Finally, we consider other perturbation methods such as randomized response and draw comparisons to sketch-based methods. While the goal of releasing user-specific data to third parties is more broad than preserving distances, this work shows that distance computations with privacy is an achievable goal.Comment: 24 page

    Bloom Filters in Adversarial Environments

    Get PDF
    Many efficient data structures use randomness, allowing them to improve upon deterministic ones. Usually, their efficiency and correctness are analyzed using probabilistic tools under the assumption that the inputs and queries are independent of the internal randomness of the data structure. In this work, we consider data structures in a more robust model, which we call the adversarial model. Roughly speaking, this model allows an adversary to choose inputs and queries adaptively according to previous responses. Specifically, we consider a data structure known as "Bloom filter" and prove a tight connection between Bloom filters in this model and cryptography. A Bloom filter represents a set SS of elements approximately, by using fewer bits than a precise representation. The price for succinctness is allowing some errors: for any xSx \in S it should always answer `Yes', and for any xSx \notin S it should answer `Yes' only with small probability. In the adversarial model, we consider both efficient adversaries (that run in polynomial time) and computationally unbounded adversaries that are only bounded in the number of queries they can make. For computationally bounded adversaries, we show that non-trivial (memory-wise) Bloom filters exist if and only if one-way functions exist. For unbounded adversaries we show that there exists a Bloom filter for sets of size nn and error ε\varepsilon, that is secure against tt queries and uses only O(nlog1ε+t)O(n \log{\frac{1}{\varepsilon}}+t) bits of memory. In comparison, nlog1εn\log{\frac{1}{\varepsilon}} is the best possible under a non-adaptive adversary

    Strong key derivation from noisy sources

    Get PDF
    A shared cryptographic key enables strong authentication. Candidate sources for creating such a shared key include biometrics and physically unclonable functions. However, these sources come with a substantial problem: noise in repeated readings. A fuzzy extractor produces a stable key from a noisy source. It consists of two stages. At enrollment time, the generate algorithm produces a key from an initial reading of the source. At authentication time, the reproduce algorithm takes a repeated but noisy reading of the source, yielding the same key when the two readings are close. For many sources of practical importance, traditional fuzzy extractors provide no meaningful security guarantee. This dissertation improves key derivation from noisy sources. These improvements stem from three observations about traditional fuzzy extractors. First, the only property of a source that standard fuzzy extractors use is the entropy in the original reading. We observe that additional structural information about the source can facilitate key derivation. Second, most fuzzy extractors work by first recovering the initial reading from the noisy reading (known as a secure sketch). This approach imposes harsh limitations on the length of the derived key. We observe that it is possible to produce a consistent key without recovering the original reading of the source. Third, traditional fuzzy extractors provide information-theoretic security. However, security against computationally bounded adversaries is sufficient. We observe fuzzy extractors providing computational security can overcome limitations of traditional approaches. The above observations are supported by negative results and constructions. As an example, we combine all three observations to construct a fuzzy extractor achieving properties that have eluded prior approaches. The construction remains secure even when the initial enrollment phase is repeated multiple times with noisy readings. Furthermore, for many practical sources, reliability demands that the tolerated noise is larger than the entropy of the original reading. The construction provides security for sources of this type by utilizing additional source structure, producing a consistent key without recovering the original reading, and providing computational security

    Functionally Private Approximations of Negligibly-Biased Estimators

    Get PDF
    We study functionally private approximations. An approximation function gg is {em functionally private} with respect to ff if, for any input xx, g(x)g(x) reveals no more information about xx than f(x)f(x). Our main result states that a function ff admits an efficiently-computable functionally private approximation gg if there exists an efficiently-computable and negligibly-biased estimator for ff. Contrary to previous generic results, our theorem is more general and has a wider application reach.We provide two distinct applications of the above result to demonstrate its flexibility. In the data stream model, we provide a functionally private approximation to the LpL_p-norm estimation problem, a quintessential application in streaming, using only polylogarithmic space in the input size. The privacy guarantees rely on the use of pseudo-random {em functions} (PRF) (a stronger cryptographic notion than pseudo-random generators) of which can be based on common cryptographic assumptions.The application of PRFs in this context appears to be novel and we expect other results to follow suit.Moreover, this is the first known functionally private streaming result for {em any} problem. Our second application result states that every problem in some subclasses of SP of hard counting problems admit efficient and functionally private approximation protocols. This result is based on a functionally private approximation for the SDNF problem (or estimating the number of satisfiable truth assignments to a Boolean formula in disjunctive normal form), which is an application of our main theorem and previously known results

    The Flajolet-Martin sketch itself preserves differential privacy: private counting with minimal space

    Full text link
    https://proceedings.neurips.cc/paper/2020/file/e3019767b1b23f82883c9850356b71d6-Paper.pd

    Adaptive learning and cryptography

    Get PDF
    Significant links exist between cryptography and computational learning theory. Cryptographic functions are the usual method of demonstrating significant intractability results in computational learning theory as they can demonstrate that certain problems are hard in a representation independent sense. On the other hand, hard learning problems have been used to create efficient cryptographic protocols such as authentication schemes, pseudo-random permutations and functions, and even public key encryption schemes.;Learning theory / coding theory also impacts cryptography in that it enables cryptographic primitives to deal with the issues of noise or bias in their inputs. Several different constructions of fuzzy primitives exist, a fuzzy primitive being a primitive which functions correctly even in the presence of noisy , or non-uniform inputs. Some examples of these primitives include error-correcting blockciphers, fuzzy identity based cryptosystems, fuzzy extractors and fuzzy sketches. Error correcting blockciphers combine both encryption and error correction in a single function which results in increased efficiency. Fuzzy identity based encryption allows the decryption of any ciphertext that was encrypted under a close enough identity. Fuzzy extractors and sketches are methods of reliably (re)-producing a uniformly random secret key given an imperfectly reproducible string from a biased source, through a public string that is called the sketch .;While hard learning problems have many qualities which make them useful in constructing cryptographic protocols, such as their inherent error tolerance and simple algebraic structure, it is often difficult to utilize them to construct very secure protocols due to assumptions they make on the learning algorithm. Due to these assumptions, the resulting protocols often do not have security against various types of adaptive adversaries. to help deal with this issue, we further examine the inter-relationships between cryptography and learning theory by introducing the concept of adaptive learning . Adaptive learning is a rather weak form of learning in which the learner is not expected to closely approximate the concept function in its entirety, rather it is only expected to answer a query of the learner\u27s choice about the target. Adaptive learning allows for a much weaker learner than in the standard model, while maintaining the the positive properties of many learning problems in the standard model, a fact which we feel makes problems that are hard to adaptively learn more useful than standard model learning problems in the design of cryptographic protocols. We argue that learning parity with noise is hard to do adaptively and use that assumption to construct a related key secure, efficient MAC as well as an efficient authentication scheme. In addition we examine the security properties of fuzzy sketches and extractors and demonstrate how these properties can be combined by using our related key secure MAC. We go on to demonstrate that our extractor can allow a form of related-key hardening for protocols in that, by affecting how the key for a primitive is stored it renders that protocol immune to related key attacks

    Lightweight Techniques for Private Heavy Hitters

    Full text link
    This paper presents a new protocol for solving the private heavy-hitters problem. In this problem, there are many clients and a small set of data-collection servers. Each client holds a private bitstring. The servers want to recover the set of all popular strings, without learning anything else about any client's string. A web-browser vendor, for instance, can use our protocol to figure out which homepages are popular, without learning any user's homepage. We also consider the simpler private subset-histogram problem, in which the servers want to count how many clients hold strings in a particular set without revealing this set to the clients. Our protocols use two data-collection servers and, in a protocol run, each client send sends only a single message to the servers. Our protocols protect client privacy against arbitrary misbehavior by one of the servers and our approach requires no public-key cryptography (except for secure channels), nor general-purpose multiparty computation. Instead, we rely on incremental distributed point functions, a new cryptographic tool that allows a client to succinctly secret-share the labels on the nodes of an exponentially large binary tree, provided that the tree has a single non-zero path. Along the way, we develop new general tools for providing malicious security in applications of distributed point functions. In an experimental evaluation with two servers on opposite sides of the U.S., the servers can find the 200 most popular strings among a set of 400,000 client-held 256-bit strings in 54 minutes. Our protocols are highly parallelizable. We estimate that with 20 physical machines per logical server, our protocols could compute heavy hitters over ten million clients in just over one hour of computation.Comment: To appear in IEEE Security & Privacy 202
    corecore