Distributed Hypothesis Testing with Privacy Constraints
We revisit the distributed hypothesis testing (or hypothesis testing with
communication constraints) problem from the viewpoint of privacy. Instead of
observing the raw data directly, the transmitter observes a sanitized or
randomized version of it. We impose an upper bound on the mutual information
between the raw and randomized data. Under this scenario, the receiver, which
is also provided with side information, is required to make a decision on
whether the null or alternative hypothesis is in effect. We first provide a
general lower bound on the type-II exponent for an arbitrary pair of
hypotheses. Next, we show that if the distribution under the alternative
hypothesis is the product of the marginals of the distribution under the null
(i.e., testing against independence), then the exponent is known exactly.
Moreover, we show that the strong converse property holds. Using ideas from
Euclidean information theory, we also provide an approximate expression for the
exponent when the communication rate is low and the privacy level is high.
Finally, we illustrate our results with a binary and a Gaussian example.
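
As a rough numerical companion to the binary example described above, the sketch below (all parameter values and the choice of a binary symmetric privatization channel are illustrative assumptions, not the paper's construction) builds a doubly symmetric binary pair (X, Y), sanitizes X into Z through a BSC, reports the privacy leakage I(X;Z), and evaluates the Stein-type type-II exponent D(P_ZY || P_Z P_Y) for testing against independence when Z is conveyed to the receiver without further compression; the rate-limited and general-hypothesis aspects of the abstract are not modeled.

    import numpy as np

    def binary_entropy(p):
        # Binary entropy in bits; clipping guards the endpoints 0 and 1.
        p = np.clip(p, 1e-12, 1 - 1e-12)
        return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

    def kl(P, Q):
        # Kullback-Leibler divergence D(P || Q) in bits for discrete pmfs.
        mask = P > 0
        return float(np.sum(P[mask] * np.log2(P[mask] / Q[mask])))

    # Doubly symmetric binary source: X ~ Bern(1/2), Y = X xor N, N ~ Bern(p_xy).
    p_xy = 0.1      # crossover between X and the receiver's side information Y
    p_priv = 0.2    # crossover of the privatization channel Z = X xor W

    # Joint pmf of (Z, Y) under the null hypothesis (indices: z, y).
    P_zy = np.zeros((2, 2))
    for x in range(2):
        for z in range(2):
            for y in range(2):
                p_joint = 0.5
                p_joint *= p_priv if z != x else 1 - p_priv
                p_joint *= p_xy if y != x else 1 - p_xy
                P_zy[z, y] += p_joint

    P_z = P_zy.sum(axis=1)
    P_y = P_zy.sum(axis=0)

    # Privacy level: mutual information between the raw sample X and its
    # sanitized version Z (uniform binary input through a BSC(p_priv)).
    privacy_leakage = 1 - binary_entropy(p_priv)

    # Stein-exponent benchmark for testing against independence when Z^n is
    # available losslessly at the receiver: D(P_ZY || P_Z x P_Y).
    exponent = kl(P_zy.flatten(), np.outer(P_z, P_y).flatten())

    print(f"I(X;Z) = {privacy_leakage:.4f} bits (privacy leakage)")
    print(f"type-II exponent D(P_ZY || P_Z P_Y) = {exponent:.4f} bits")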
Mean Estimation from Adaptive One-bit Measurements
We consider the problem of estimating the mean of a normal distribution under
the following constraint: the estimator can access only a single bit from each
sample from this distribution. We study the squared error risk in this
estimation as a function of the number of samples and one-bit measurements n.
We consider an adaptive estimation setting where the single bit sent at step n
is a function of both the new sample and the previously acquired bits. For this
setting, we show that no estimator can attain an asymptotic mean squared error
smaller than π/(2n) times the variance. In other words, the one-bit restriction
increases the number of samples required for a prescribed accuracy of
estimation by a factor of at least π/2 compared to the unrestricted case. In
addition, we provide an explicit estimator that attains this asymptotic error,
showing that, rather surprisingly, only π/2 times more samples are required in
order to attain estimation performance equivalent to the unrestricted case.
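
A minimal Monte Carlo sketch of one adaptive one-bit scheme in this spirit (a sign-based stochastic-approximation recursion with a known standard deviation; the threshold rule and step size are illustrative choices, not necessarily the estimator constructed in the paper): each transmitted bit is the sign of the new sample relative to the current estimate, the estimate is nudged by a step of size sigma*sqrt(pi/2)/k, and the resulting empirical MSE tracks (pi/2)*sigma^2/n rather than sigma^2/n.

    import numpy as np

    rng = np.random.default_rng(0)

    mu, sigma = 1.3, 2.0     # true mean and (assumed known) standard deviation
    n_samples = 5000
    n_trials = 300

    def one_bit_estimate(n):
        # Stochastic-approximation recursion driven by one bit per sample:
        # b_k = sign(X_k - theta_k), theta_{k+1} = theta_k + sigma*sqrt(pi/2)/k * b_k.
        theta = 0.0
        for k in range(1, n + 1):
            x = rng.normal(mu, sigma)
            b = 1.0 if x >= theta else -1.0   # the single transmitted bit
            theta += sigma * np.sqrt(np.pi / 2) / k * b
        return theta

    errors = np.array([one_bit_estimate(n_samples) - mu for _ in range(n_trials)])
    mse = np.mean(errors ** 2)

    print(f"empirical MSE           : {mse:.3e}")
    print(f"(pi/2) * sigma^2 / n    : {np.pi / 2 * sigma**2 / n_samples:.3e}")
    print(f"unrestricted sigma^2 / n: {sigma**2 / n_samples:.3e}")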
Distributed Binary Detection with Lossy Data Compression
Consider the problem where a statistician in a two-node system receives
rate-limited information from a transmitter about marginal observations of a
memoryless process generated from two possible distributions. Using its own
observations, this receiver is required first to identify the legitimacy of its
sender by declaring the joint distribution of the process and then, depending
on this authentication, to generate an adequate reconstruction of the
observations satisfying an average per-letter distortion constraint. The
performance of this setup is investigated through the corresponding
rate-error-distortion region describing the trade-off between the communication
rate, the error exponent induced by the detection, and the distortion incurred
by the source
reconstruction. In the special case of testing against independence, where the
alternative hypothesis implies that the sources are independent, the optimal
rate-error-distortion region is characterized. An application example to binary
symmetric sources is given subsequently and the explicit expression for the
rate-error-distortion region is provided as well. The case of "general
hypotheses" is also investigated. A new achievable rate-error-distortion region
is derived based on the use of non-asymptotic binning, improving the quality of
communicated descriptions. Further improvement of performance in the general
case is shown to be possible when the requirement of source reconstruction is
relaxed, in contrast to the case of testing against independence.
Comment: to appear in IEEE Trans. Information Theory
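
For the binary symmetric example, a standard single-quantizer benchmark for testing against independence (no binning; an illustrative achievable trade-off under these assumptions, not the optimal region characterized in the paper) can be traced by sweeping a BSC test channel U = X xor Q with Q ~ Bern(q): rate 1 - h(q), type-II exponent 1 - h(q * p) where p is the X-Y crossover and * denotes binary convolution, and Hamming distortion q when U itself serves as the reconstruction.

    import numpy as np

    def h2(p):
        # Binary entropy in bits.
        p = np.clip(p, 1e-12, 1 - 1e-12)
        return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

    def conv(a, b):
        # Binary convolution a * b = a(1 - b) + b(1 - a).
        return a * (1 - b) + b * (1 - a)

    p = 0.1   # crossover between the transmitter's source X and the receiver's Y

    print(" q     rate 1-h(q)   exponent 1-h(q*p)   distortion q")
    for q in np.linspace(0.01, 0.5, 8):
        rate = 1 - h2(q)
        exponent = 1 - h2(conv(q, p))
        print(f"{q:4.2f}   {rate:11.4f}   {exponent:17.4f}   {q:12.2f}")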
On the Reliability Function of Distributed Hypothesis Testing Under Optimal Detection
The distributed hypothesis testing problem with full side-information is
studied. The trade-off (reliability function) between the two types of error
exponents under limited rate is studied in the following way. First, the
problem is reduced to the problem of determining the reliability function of
channel codes designed for detection (in analogy to a similar result which
connects the reliability function of distributed lossless compression and
ordinary channel codes). Second, a single-letter random-coding bound based on a
hierarchical ensemble, as well as a single-letter expurgated bound, are derived
for the reliability of channel-detection codes. Both bounds are derived for a
system which employs the optimal detection rule. We conjecture that the
resulting random-coding bound is ensemble-tight, and consequently optimal
within the class of quantization-and-binning schemes.
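
As an unconstrained (full-observation) point of reference for the trade-off between the two types of error exponents studied here, the sketch below traces the classical Hoeffding trade-off for a simple i.i.d. test between two fixed pmfs P0 and P1 on a finite alphabet: the tilted distributions Q_lambda proportional to P0^(1-lambda) P1^lambda parameterize the boundary, with type-I exponent D(Q_lambda || P0) and type-II exponent D(Q_lambda || P1). The distributions and the alphabet are purely illustrative; the rate-limited distributed setting of the paper is not modeled.

    import numpy as np

    def kl(P, Q):
        # D(P || Q) in nats for pmfs with full support on the same alphabet.
        return float(np.sum(P * np.log(P / Q)))

    # Two illustrative distributions on a ternary alphabet.
    P0 = np.array([0.5, 0.3, 0.2])
    P1 = np.array([0.2, 0.3, 0.5])

    print(" lambda   type-I exponent D(Q||P0)   type-II exponent D(Q||P1)")
    for lam in np.linspace(0.0, 1.0, 11):
        Q = P0 ** (1 - lam) * P1 ** lam
        Q /= Q.sum()    # tilted (geometric-mixture) distribution
        print(f"{lam:6.2f}   {kl(Q, P0):24.4f}   {kl(Q, P1):25.4f}")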
Justification of Logarithmic Loss via the Benefit of Side Information
We consider a natural measure of relevance: the reduction in optimal
prediction risk in the presence of side information. For any given loss
function, this relevance measure captures the benefit of side information for
performing inference on a random variable under this loss function. When such a
measure satisfies a natural data processing property, and the random variable
of interest has alphabet size greater than two, we show that it is uniquely
characterized by the mutual information, and the corresponding loss function
coincides with logarithmic loss. In doing so, our work provides a new
characterization of mutual information, and justifies its use as a measure of
relevance. When the alphabet is binary, we characterize the only admissible
forms the measure of relevance can assume while obeying the specified data
processing property. Our results naturally extend to measuring causal influence
between stochastic processes, where we unify different causal-inference
measures in the literature as instantiations of directed information.
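
A small numerical check of the headline identity discussed above (under logarithmic loss, the reduction in optimal prediction risk afforded by side information Y equals the mutual information I(X;Y)); the joint pmf below is arbitrary and purely illustrative.

    import numpy as np

    # Joint pmf of (X, Y) on a 3 x 2 alphabet (rows: x, columns: y).
    P_xy = np.array([[0.20, 0.10],
                     [0.05, 0.25],
                     [0.15, 0.25]])

    P_x = P_xy.sum(axis=1)
    P_y = P_xy.sum(axis=0)

    def entropy(p):
        # Shannon entropy in bits of a pmf (zero-probability entries ignored).
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))

    # Optimal log-loss prediction risk without side information is H(X); with
    # side information Y it is H(X|Y); the reduction is exactly I(X;Y).
    H_x = entropy(P_x)
    H_x_given_y = sum(P_y[j] * entropy(P_xy[:, j] / P_y[j]) for j in range(len(P_y)))
    mutual_info = entropy(P_x) + entropy(P_y) - entropy(P_xy.flatten())

    print(f"H(X)           = {H_x:.4f} bits")
    print(f"H(X|Y)         = {H_x_given_y:.4f} bits")
    print(f"risk reduction = {H_x - H_x_given_y:.4f} bits")
    print(f"I(X;Y)         = {mutual_info:.4f} bits")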