Distributed Hypothesis Testing with Privacy Constraints
We revisit the distributed hypothesis testing (or hypothesis testing with
communication constraints) problem from the viewpoint of privacy. Instead of
observing the raw data directly, the transmitter observes a sanitized or
randomized version of it. We impose an upper bound on the mutual information
between the raw and randomized data. Under this scenario, the receiver, which
is also provided with side information, is required to make a decision on
whether the null or alternative hypothesis is in effect. We first provide a
general lower bound on the type-II exponent for an arbitrary pair of
hypotheses. Next, we show that if the distribution under the alternative
hypothesis is the product of the marginals of the distribution under the null
(i.e., testing against independence), then the exponent is known exactly.
Moreover, we show that the strong converse property holds. Using ideas from
Euclidean information theory, we also provide an approximate expression for the
exponent when the communication rate is low and the privacy level is high.
Finally, we illustrate our results with a binary and a Gaussian example.
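As a point of reference for the testing-against-independence case above: when the receiver observes both sequences directly, with no rate or privacy constraint, the Chernoff-Stein lemma gives the best type-II exponent as D(P_XY || P_X P_Y) = I(X;Y). The sketch below computes only that unconstrained benchmark for a doubly symmetric binary source. The crossover probability 0.1 and all function names are illustrative assumptions, not taken from the abstract, and the code does not implement the rate- and privacy-constrained exponent derived in the paper.

```python
import math


def binary_entropy(p):
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def dsbs(p):
    """Joint pmf of a doubly symmetric binary source with crossover p."""
    return [[(1 - p) / 2, p / 2],
            [p / 2, (1 - p) / 2]]


def mutual_information_bits(joint):
    """I(X;Y) in bits for a joint pmf given as a list of rows.

    This equals D(P_XY || P_X P_Y), the Stein exponent for testing
    against independence when both sequences are fully observed.
    """
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                total += pxy * math.log2(pxy / (px[i] * py[j]))
    return total


# Illustrative value only: crossover probability 0.1 is an assumption.
print(mutual_information_bits(dsbs(0.1)))
```

For this source the value coincides with 1 - h(p), which gives a quick sanity check on the computation. Any rate or privacy constraint can only lower the exponent relative to this benchmark.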
Encrypted statistical machine learning: new privacy preserving methods
We present two new statistical machine learning methods designed to learn on
fully homomorphic encrypted (FHE) data. The introduction of FHE schemes
following Gentry (2009) opens up the prospect of privacy preserving statistical
machine learning analysis and modelling of encrypted data without compromising
security constraints. We propose tailored algorithms for applying extremely
random forests, involving a new cryptographic stochastic fraction estimator,
and naïve Bayes, involving a semi-parametric model for the class decision
boundary, and show how they can be used to learn and predict from encrypted
data. We demonstrate that these techniques perform competitively on a variety
of classification data sets and provide detailed information about the
computational practicalities of these and other FHE methods.
Comment: 39 pages
Distributed Binary Detection with Lossy Data Compression
Consider the problem where a statistician in a two-node system receives
rate-limited information from a transmitter about marginal observations of a
memoryless process generated from two possible distributions. Using its own
observations, this receiver is required to first identify the legitimacy of its
sender by declaring the joint distribution of the process, and then depending
on such authentication it generates the adequate reconstruction of the
observations satisfying an average per-letter distortion. The performance of
this setup is investigated through the corresponding rate-error-distortion
region describing the trade-off between: the communication rate, the error
exponent induced by the detection and the distortion incurred by the source
reconstruction. In the special case of testing against independence, where the
alternative hypothesis implies that the sources are independent, the optimal
rate-error-distortion region is characterized. An application example to binary
symmetric sources is given subsequently and the explicit expression for the
rate-error-distortion region is provided as well. The case of "general
hypotheses" is also investigated. A new achievable rate-error-distortion region
is derived based on the use of non-asymptotic binning, improving the quality of
communicated descriptions. Further improvement of performance in the general
case is shown to be possible when the requirement of source reconstruction is
relaxed, which stands in contrast to the case of general hypotheses.
Comment: to appear in IEEE Trans. Information Theory
Reasoning about Independence in Probabilistic Models of Relational Data
We extend the theory of d-separation to cases in which data instances are not
independent and identically distributed. We show that applying the rules of
d-separation directly to the structure of probabilistic models of relational
data inaccurately infers conditional independence. We introduce relational
d-separation, a theory for deriving conditional independence facts from
relational models. We provide a new representation, the abstract ground graph,
that enables a sound, complete, and computationally efficient method for
answering d-separation queries about relational models, and we present
empirical results that demonstrate effectiveness.
Comment: 61 pages, substantial revisions to formalisms, theory, and related work
Quantum Cryptography Based Solely on Bell's Theorem
Information-theoretic key agreement is impossible to achieve from scratch and
must be based on some - ultimately physical - premise. In 2005, Barrett, Hardy,
and Kent showed that unconditional security can be obtained in principle based
on the impossibility of faster-than-light signaling; however, their protocol is
inefficient and cannot tolerate any noise. While their key-distribution scheme
uses quantum entanglement, its security only relies on the impossibility of
superluminal signaling, rather than the correctness and completeness of quantum
theory. In particular, the resulting security is device independent. Here we
introduce a new protocol which is efficient in terms of both classical and
quantum communication, and that can tolerate noise in the quantum channel. We
prove that it offers device-independent security under the sole assumption that
certain non-signaling conditions are satisfied. Our main insight is that the
XOR of a number of bits that are partially secret according to the
non-signaling conditions turns out to be highly secret. Note that similar
statements have been well-known in classical contexts. Earlier results had
indicated that amplification of such non-signaling-based privacy is impossible
to achieve if the non-signaling condition only holds between events on Alice's
and Bob's sides. Here, we show that the situation changes completely if such a
separation is given within each of the laboratories.
Comment: 32 pages, v2: changed introduction, added references
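The abstract notes that XOR-based amplification is well known in classical contexts. One classical instance is the piling-up lemma: if n independent bits each deviate from uniform by a bias ε_i, their XOR deviates by 2^(n-1) · ε_1 · … · ε_n, which shrinks quickly as n grows. The sketch below checks this numerically. It assumes independent bits and uses illustrative function names; it is not the non-signaling security argument of the paper, only the classical analogue.

```python
def xor_prob_one(p_ones):
    """P(XOR = 1) for independent bits, where p_ones[i] = P(bit_i = 1)."""
    p = 0.0  # the XOR of zero bits is 0 with certainty
    for q in p_ones:
        # The running XOR becomes 1 if exactly one of (running XOR, new bit) is 1.
        p = p * (1 - q) + (1 - p) * q
    return p


def piling_up_bias(biases):
    """Bias of the XOR predicted by the piling-up lemma.

    Each bias is |P(bit = 0) - 1/2|; the bits are assumed independent.
    """
    product = 1.0
    for eps in biases:
        product *= eps
    return 2 ** (len(biases) - 1) * product


# Illustrative numbers: each bit is guessed correctly with probability 0.75.
for n in (1, 2, 4, 8):
    exact = abs(xor_prob_one([0.75] * n) - 0.5)
    print(n, exact, piling_up_bias([0.25] * n))
```

Each printed row compares the exact bias of the XOR with the lemma's prediction; the two agree, and the bias halves with every additional bit in this example.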
A Survey of Symbolic Execution Techniques
Many security and software testing applications require checking whether
certain properties of a program hold for any possible usage scenario. For
instance, a tool for identifying software vulnerabilities may need to rule out
the existence of any backdoor to bypass a program's authentication. One
approach would be to test the program using different, possibly random inputs.
As the backdoor may only be hit for very specific program workloads, automated
exploration of the space of possible inputs is of the essence. Symbolic
execution provides an elegant solution to the problem, by systematically
exploring many possible execution paths at the same time without necessarily
requiring concrete inputs. Rather than taking on fully specified input values,
the technique abstractly represents them as symbols, resorting to constraint
solvers to construct actual instances that would cause property violations.
Symbolic execution has been incubated in dozens of tools developed over the
last four decades, leading to major practical breakthroughs in a number of
prominent software reliability applications. The goal of this survey is to
provide an overview of the main ideas, challenges, and solutions developed in
the area, distilling them for a broad audience.
The present survey has been accepted for publication at ACM Computing
Surveys. If you are considering citing this survey, we would appreciate it if you
could use the following BibTeX entry: http://goo.gl/Hf5Fvc
Comment: This is the authors' pre-print copy.
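The core loop described in the survey abstract (explore every path, collect the branch conditions along it, then ask a solver for inputs that satisfy them) can be sketched in a few lines. Everything below is a hypothetical toy: the program is a hand-written decision tree, the branch conditions are opaque Python predicates rather than symbolic expressions, and brute-force enumeration over a small integer domain stands in for a real constraint solver.

```python
from itertools import product

# Toy program as a decision tree. Inner nodes are
# ("if", predicate, then_branch, else_branch); leaves are outcome labels.
# The program, its variables, and the labels are made up for illustration.
PROGRAM = (
    "if", lambda env: env["user"] == 7,
    ("if", lambda env: env["pin"] * 3 + 1 == 40, "BACKDOOR", "denied"),
    ("if", lambda env: env["pin"] == env["user"] + 1, "granted", "denied"),
)


def explore(node, path=()):
    """Yield (label, path_constraints) for every root-to-leaf path.

    Each constraint is a pair (predicate, required_truth_value).
    """
    if isinstance(node, str):
        yield node, path
        return
    _, predicate, then_branch, else_branch = node
    yield from explore(then_branch, path + ((predicate, True),))
    yield from explore(else_branch, path + ((predicate, False),))


def solve(constraints, variables=("user", "pin"), domain=range(32)):
    """Brute-force stand-in for a constraint solver.

    Returns the first assignment satisfying all constraints, or None.
    """
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(pred(env) == wanted for pred, wanted in constraints):
            return env
    return None


def find_inputs(program, target):
    """Return concrete inputs that reach a leaf labelled `target`, if any."""
    for label, constraints in explore(program):
        if label == target:
            model = solve(constraints)
            if model is not None:
                return model
    return None


print(find_inputs(PROGRAM, "BACKDOOR"))
```

Random testing over the same 32 × 32 input space would hit the backdoor on only one input in 1,024, whereas the path-wise search targets it directly. Practical tools replace the enumeration with an SMT solver and must also cope with loops and path explosion, which this sketch ignores.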