48,417 research outputs found

    Distributed Hypothesis Testing with Privacy Constraints

    Full text link
    We revisit the distributed hypothesis testing (or hypothesis testing with communication constraints) problem from the viewpoint of privacy. Instead of observing the raw data directly, the transmitter observes a sanitized or randomized version of it. We impose an upper bound on the mutual information between the raw and randomized data. Under this scenario, the receiver, which is also provided with side information, is required to make a decision on whether the null or alternative hypothesis is in effect. We first provide a general lower bound on the type-II exponent for an arbitrary pair of hypotheses. Next, we show that if the distribution under the alternative hypothesis is the product of the marginals of the distribution under the null (i.e., testing against independence), then the exponent is known exactly. Moreover, we show that the strong converse property holds. Using ideas from Euclidean information theory, we also provide an approximate expression for the exponent when the communication rate is low and the privacy level is high. Finally, we illustrate our results with a binary and a Gaussian example

    Encrypted statistical machine learning: new privacy preserving methods

    Full text link
    We present two new statistical machine learning methods designed to learn on fully homomorphic encrypted (FHE) data. The introduction of FHE schemes following Gentry (2009) opens up the prospect of privacy preserving statistical machine learning analysis and modelling of encrypted data without compromising security constraints. We propose tailored algorithms for applying extremely random forests, involving a new cryptographic stochastic fraction estimator, and na\"{i}ve Bayes, involving a semi-parametric model for the class decision boundary, and show how they can be used to learn and predict from encrypted data. We demonstrate that these techniques perform competitively on a variety of classification data sets and provide detailed information about the computational practicalities of these and other FHE methods.Comment: 39 page

    Distributed Binary Detection with Lossy Data Compression

    Full text link
    Consider the problem where a statistician in a two-node system receives rate-limited information from a transmitter about marginal observations of a memoryless process generated from two possible distributions. Using its own observations, this receiver is required to first identify the legitimacy of its sender by declaring the joint distribution of the process, and then depending on such authentication it generates the adequate reconstruction of the observations satisfying an average per-letter distortion. The performance of this setup is investigated through the corresponding rate-error-distortion region describing the trade-off between: the communication rate, the error exponent induced by the detection and the distortion incurred by the source reconstruction. In the special case of testing against independence, where the alternative hypothesis implies that the sources are independent, the optimal rate-error-distortion region is characterized. An application example to binary symmetric sources is given subsequently and the explicit expression for the rate-error-distortion region is provided as well. The case of "general hypotheses" is also investigated. A new achievable rate-error-distortion region is derived based on the use of non-asymptotic binning, improving the quality of communicated descriptions. Further improvement of performance in the general case is shown to be possible when the requirement of source reconstruction is relaxed, which stands in contrast to the case of general hypotheses.Comment: to appear on IEEE Trans. Information Theor

    Reasoning about Independence in Probabilistic Models of Relational Data

    Full text link
    We extend the theory of d-separation to cases in which data instances are not independent and identically distributed. We show that applying the rules of d-separation directly to the structure of probabilistic models of relational data inaccurately infers conditional independence. We introduce relational d-separation, a theory for deriving conditional independence facts from relational models. We provide a new representation, the abstract ground graph, that enables a sound, complete, and computationally efficient method for answering d-separation queries about relational models, and we present empirical results that demonstrate effectiveness.Comment: 61 pages, substantial revisions to formalisms, theory, and related wor

    Quantum Cryptography Based Solely on Bell's Theorem

    Full text link
    Information-theoretic key agreement is impossible to achieve from scratch and must be based on some - ultimately physical - premise. In 2005, Barrett, Hardy, and Kent showed that unconditional security can be obtained in principle based on the impossibility of faster-than-light signaling; however, their protocol is inefficient and cannot tolerate any noise. While their key-distribution scheme uses quantum entanglement, its security only relies on the impossibility of superluminal signaling, rather than the correctness and completeness of quantum theory. In particular, the resulting security is device independent. Here we introduce a new protocol which is efficient in terms of both classical and quantum communication, and that can tolerate noise in the quantum channel. We prove that it offers device-independent security under the sole assumption that certain non-signaling conditions are satisfied. Our main insight is that the XOR of a number of bits that are partially secret according to the non-signaling conditions turns out to be highly secret. Note that similar statements have been well-known in classical contexts. Earlier results had indicated that amplification of such non-signaling-based privacy is impossible to achieve if the non-signaling condition only holds between events on Alice's and Bob's sides. Here, we show that the situation changes completely if such a separation is given within each of the laboratories.Comment: 32 pages, v2: changed introduction, added reference

    A Survey of Symbolic Execution Techniques

    Get PDF
    Many security and software testing applications require checking whether certain properties of a program hold for any possible usage scenario. For instance, a tool for identifying software vulnerabilities may need to rule out the existence of any backdoor to bypass a program's authentication. One approach would be to test the program using different, possibly random inputs. As the backdoor may only be hit for very specific program workloads, automated exploration of the space of possible inputs is of the essence. Symbolic execution provides an elegant solution to the problem, by systematically exploring many possible execution paths at the same time without necessarily requiring concrete inputs. Rather than taking on fully specified input values, the technique abstractly represents them as symbols, resorting to constraint solvers to construct actual instances that would cause property violations. Symbolic execution has been incubated in dozens of tools developed over the last four decades, leading to major practical breakthroughs in a number of prominent software reliability applications. The goal of this survey is to provide an overview of the main ideas, challenges, and solutions developed in the area, distilling them for a broad audience. The present survey has been accepted for publication at ACM Computing Surveys. If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5FvcComment: This is the authors pre-print copy. If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5Fv
    corecore