Covariate assisted screening and estimation
Consider a linear model $Y = X\beta + z$, where $X = X_{n,p}$ and $z \sim N(0, I_n)$.
The vector $\beta$ is unknown but is sparse in the sense that most of its
coordinates are $0$. The main interest is to separate its nonzero coordinates
from the zero ones (i.e., variable selection). Motivated by examples in
long-memory time series (Fan and Yao [Nonlinear Time Series: Nonparametric and
Parametric Methods (2003) Springer]) and the change-point problem (Bhattacharya
[In Change-Point Problems (South Hadley, MA, 1992) (1994) 28-56 IMS]), we are
primarily interested in the case where the Gram matrix $G = X'X$ is nonsparse but
sparsifiable by a finite order linear filter. We focus on the regime where
signals are both rare and weak so that successful variable selection is very
challenging but is still possible. We approach this problem by a new procedure
called the covariate assisted screening and estimation (CASE). CASE first uses
a linear filtering to reduce the original setting to a new regression model
where the corresponding Gram (covariance) matrix is sparse. The new covariance
matrix induces a sparse graph, which guides us to conduct multivariate
screening without visiting all the submodels. By interacting with the signal
sparsity, the graph enables us to decompose the original problem into many
separated small-size subproblems (if only we know where they are!). Linear
filtering also induces a so-called problem of information leakage, which can be
overcome by the newly introduced patching technique. Together, these give rise
to CASE, which is a two-stage screen and clean [Fan and Song Ann. Statist. 38
(2010) 3567-3604; Wasserman and Roeder Ann. Statist. 37 (2009) 2178-2201]
procedure, where we first identify candidates of these submodels by patching
and screening, and then re-examine each candidate to remove false positives.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1243 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
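As a rough illustration of what "nonsparse but sparsifiable by a finite order linear filter" can mean, the sketch below builds a toy Gram matrix and applies a second-order difference filter to it. The design matrix, the dimension `p`, and the filter `D` are assumptions chosen for illustration; this is not the CASE procedure itself, only the sparsification idea it starts from.

```python
import numpy as np

p = 8  # toy dimension, chosen only for illustration

# Hypothetical design: upper-triangular matrix of ones (a cumulative-sum style design).
X = np.triu(np.ones((p, p), dtype=int))
G = X.T @ X  # Gram matrix; G[i, j] = min(i, j) + 1 (0-based), so every entry is nonzero

# Second-order difference filter: -2 on the diagonal, 1 on the two neighbouring diagonals.
D = -2 * np.eye(p, dtype=int) + np.eye(p, k=1, dtype=int) + np.eye(p, k=-1, dtype=int)
DG = D @ G

nonzeros_before = np.count_nonzero(G)       # dense: p * p nonzero entries
nonzeros_after = np.count_nonzero(DG[:-1])  # all rows except the last (boundary) row
print(nonzeros_before, nonzeros_after)
```

In this toy case every row of the filtered matrix except the last has a single nonzero entry; the dense last row is a boundary effect of the simple filter used here.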
Upper Bounds on the Capacity of Binary Channels with Causal Adversaries
In this work we consider the communication of information in the presence of
a causal adversarial jammer. In the setting under study, a sender wishes to
communicate a message to a receiver by transmitting a codeword $x = (x_1, \ldots, x_n)$
bit-by-bit over a communication channel. The sender and the receiver do not
share common randomness. The adversarial jammer can view the transmitted bits
one at a time, and can change up to a $p$-fraction of them. However, the
decisions of the jammer must be made in a causal manner. Namely, for each bit
$x_i$ the jammer's decision on whether to corrupt it or not must depend only on
$x_j$ for $j \leq i$. This is in contrast to the "classical" adversarial
jamming situations in which the jammer has no knowledge of $x$, or
knows $x$ completely. In this work, we present upper bounds (that
hold under both the average and maximal probability of error criteria) on the
capacity that hold for both deterministic and stochastic encoding schemes.
Comment: To appear in the IEEE Transactions on Information Theory; shortened
version appeared at ISIT 201
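The causal constraint can be sketched in code as follows. This is a minimal simulation under stated assumptions: the jammer's decision for bit i may depend only on the bits up to and including bit i, and its flip budget is the integer part of p times the block length. The function names `transmit` and `flip_ones` and the example strategy are hypothetical and not taken from the paper.

```python
from typing import Callable, List, Sequence


def transmit(codeword: Sequence[int], p: float,
             decide: Callable[[Sequence[int], int], bool]) -> List[int]:
    """Pass a binary codeword through a causal jammer that may flip at most int(p * n) bits."""
    n = len(codeword)
    budget = int(p * n)
    received = []
    for i, bit in enumerate(codeword):
        prefix = codeword[: i + 1]  # the jammer never sees future bits
        flip = budget > 0 and decide(prefix, budget)
        if flip:
            budget -= 1
        received.append(bit ^ int(flip))
    return received


def flip_ones(prefix: Sequence[int], budget: int) -> bool:
    """Hypothetical strategy: flip the current bit whenever it equals 1."""
    return prefix[-1] == 1


x = [1, 0, 1, 1, 0, 1, 0, 0]
y = transmit(x, 0.25, flip_ones)
flips = sum(a != b for a, b in zip(x, y))
print(y, flips)
```

A jammer with no knowledge of the codeword would correspond to a `decide` that ignores `prefix`, while a jammer that knows the whole codeword cannot be expressed through this interface at all, which is the distinction the abstract draws.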
Tight Bounds on List-Decodable and List-Recoverable Zero-Rate Codes
In this work, we consider the list-decodability and list-recoverability of
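codes in the zero-rate regime. Briefly, a code $\mathcal{C} \subseteq [q]^n$ is
$(p,\ell,L)$-list-recoverable if for all tuples of input lists $(Y_1,\dots,Y_n)$
with each $Y_i \subseteq [q]$ and $|Y_i| = \ell$ the number of
codewords $c \in \mathcal{C}$ such that $c_i \notin Y_i$ for at most $pn$
choices of $i \in [n]$ is less than $L$; list-decoding is the special case of
$\ell = 1$. In recent work by Resch, Yuan and Zhang (ICALP 2023) the zero-rate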
threshold for list-recovery was determined for all parameters: that is, the
work explicitly computes $p_* := p_*(q,\ell,L)$ with the property that for all
$\epsilon > 0$ (a) there exist infinite families of positive-rate
$(p_* - \epsilon,\ell,L)$-list-recoverable codes, and (b) any
$(p_* + \epsilon,\ell,L)$-list-recoverable code has rate $0$. In fact, in the
latter case the code has constant size, independent of $n$. However, the
constant size in their work is quite large in $1/\epsilon$, at least
$(1/\epsilon)^{O(q^L)}$.
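Our contribution in this work is to show that for all choices of $q$, $\ell$ and
$L$ with $q \geq 3$, any $(p_* + \epsilon,\ell,L)$-list-recoverable code must
have size $O_{q,\ell,L}(1/\epsilon)$, and furthermore this upper bound is
complemented by a matching lower bound $\Omega_{q,\ell,L}(1/\epsilon)$. This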
greatly generalizes work by Alon, Bukh and Polyanskiy (IEEE Trans. Inf.
Theory 2018) which focused only on the case of binary alphabet (and thus
necessarily only list-decoding). We remark that we can in fact recover the same
result for $q = 2$ and even $L$, as obtained by Alon, Bukh and Polyanskiy: we
thus strictly generalize their work.
Comment: Abstract shortened to meet the arXiv requiremen
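For very small codes the list-recoverability condition can be checked by brute force. The sketch below assumes the standard definition: a code over the alphabet {0, ..., q-1} is list-recoverable with list size parameter L if, for every choice of one size-`ell` input list per coordinate, fewer than L codewords disagree with the lists in at most `radius` coordinates. The integer `radius` stands in for the product of the error fraction and the block length. The function name and the example code are hypothetical, and the search is exponential in the block length, so it only suits toy examples.

```python
from itertools import combinations, product


def is_list_recoverable(code, q, radius, ell, L):
    """Brute-force check of list-recoverability for a small code over {0, ..., q-1}."""
    n = len(code[0])
    input_lists = [frozenset(s) for s in combinations(range(q), ell)]
    for lists in product(input_lists, repeat=n):
        # Count codewords that miss their input list in at most `radius` coordinates.
        close = sum(
            1 for c in code
            if sum(c[i] not in lists[i] for i in range(n)) <= radius
        )
        if close >= L:
            return False
    return True


# Toy example: the length-3 repetition code over a ternary alphabet.
repetition = [(0, 0, 0), (1, 1, 1), (2, 2, 2)]
print(is_list_recoverable(repetition, q=3, radius=1, ell=1, L=2))
print(is_list_recoverable(repetition, q=3, radius=2, ell=1, L=2))
```

Setting `ell=1` gives list-decoding, in line with the abstract's remark that list-decoding is a special case of list-recovery.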
On the Measurement of Privacy as an Attacker's Estimation Error
A wide variety of privacy metrics have been proposed in the literature to
evaluate the level of protection offered by privacy-enhancing technologies.
Most of these metrics are specific to concrete systems and adversarial models,
and are difficult to generalize or translate to other contexts. Furthermore, a
better understanding of the relationships between the different privacy metrics
is needed to enable a more grounded and systematic approach to measuring privacy,
as well as to assist systems designers in selecting the most appropriate metric
for a given application.
In this work we propose a theoretical framework for privacy-preserving
systems, endowed with a general definition of privacy in terms of the
estimation error incurred by an attacker who aims to disclose the private
information that the system is designed to conceal. We show that our framework
permits interpreting and comparing a number of well-known metrics under a
common perspective. The arguments behind these interpretations are based on
fundamental results related to the theories of information, probability and
Bayes decision.
Comment: This paper has 18 pages and 17 figures
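One concrete instance of privacy as an attacker's estimation error is the smallest probability of error achievable when guessing a discrete secret from an observation. The sketch below assumes a finite secret, a finite observation, and 0-1 loss, so the best attacker is the maximum a posteriori guesser. The function name `bayes_error` and the example numbers are hypothetical; the framework described in the abstract is more general than this special case.

```python
import numpy as np


def bayes_error(prior: np.ndarray, channel: np.ndarray) -> float:
    """Smallest error probability of an attacker guessing X from Y.

    prior[x] is P(X = x); channel[x, y] is P(Y = y | X = x).
    """
    joint = prior[:, None] * channel  # joint[x, y] = P(X = x, Y = y)
    # For each observation y the best guess is the x maximising the joint probability.
    return float(1.0 - joint.max(axis=0).sum())


uniform_prior = np.array([0.5, 0.5])
full_disclosure = np.eye(2)          # the observation reveals the secret exactly
no_disclosure = np.full((2, 2), 0.5)  # the observation is independent of the secret

print(bayes_error(uniform_prior, full_disclosure))
print(bayes_error(uniform_prior, no_disclosure))
```

Under this measure a larger error means more privacy: the attacker's error is zero under full disclosure and equals the error of blind guessing from the prior when the observation carries no information.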
Quickest Sequence Phase Detection
A phase detection sequence is a length-$n$ cyclic sequence, such that the
location of any length-$k$ contiguous subsequence can be determined from a
noisy observation of that subsequence. In this paper, we derive bounds on the
minimal possible $k$ in the limit of $n \to \infty$, and describe some sequence
constructions. We further consider multiple phase detection sequences, where
the location of any length-$k$ contiguous subsequence of each sequence can be
determined simultaneously from a noisy mixture of those subsequences. We study
the optimal trade-offs between the lengths of the sequences, and describe some
sequence constructions. We compare these phase detection problems to their
natural channel coding counterparts, and show a strict separation between the
fundamental limits in the multiple sequence case. Both adversarial and
probabilistic noise models are addressed.
Comment: To appear in the IEEE Transactions on Information Theory
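In the noiseless special case, the phase detection property reduces to requiring that every cyclic window of the given length occurs at exactly one position. The sketch below checks that property and builds the lookup table from window to position. The function name `locate_windows` and the binary example sequence (a de Bruijn sequence of order 3) are illustrative assumptions; the noisy setting studied in the paper needs windows that stay distinguishable after corruption, which this check does not cover.

```python
def locate_windows(seq, k):
    """Map each length-k cyclic window of seq to its start position.

    Returns None if some window occurs at more than one position.
    """
    n = len(seq)
    positions = {}
    for start in range(n):
        window = tuple(seq[(start + j) % n] for j in range(k))
        if window in positions:
            return None  # the location of this window would be ambiguous
        positions[window] = start
    return positions


# Binary de Bruijn sequence of order 3: all 8 cyclic windows of length 3 are distinct.
seq = [0, 0, 0, 1, 0, 1, 1, 1]
table = locate_windows(seq, 3)
print(table[(1, 1, 0)])
print(locate_windows(seq, 2))
```

With window length 2 the same sequence fails the check, because the window (0, 0) starts at two different positions.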
Secure Multiterminal Source Coding with Side Information at the Eavesdropper
The problem of secure multiterminal source coding with side information at
the eavesdropper is investigated. This scenario consists of a main encoder
(referred to as Alice) that wishes to compress a single source while
simultaneously satisfying the desired requirements on the distortion level at a
legitimate receiver (referred to as Bob) and the equivocation rate (average
uncertainty) at an eavesdropper (referred to as Eve). A (public) rate-limited
link between Alice and Bob is further assumed to be present. In this
setting, Eve perfectly observes the information bits sent by Alice to Bob and
also has access to a correlated source which can be used as side information. A
second encoder (referred to as Charlie) helps Bob in estimating Alice's source
by sending a compressed version of its own correlated observation via a
(private) rate-limited link, which is only observed by Bob. For instance, the
problem at hand can be seen as the unification of the Berger-Tung and the
secure source coding setups. Inner and outer bounds on the so-called
rates-distortion-equivocation region are derived. The inner region turns out to be
tight for two cases: (i) uncoded side information at Bob and (ii) lossless
reconstruction of both sources at Bob (secure distributed lossless
compression). Application examples to secure lossy source coding of Gaussian and
binary sources in the presence of Gaussian and binary/ternary (resp.) side
information are also considered. Optimal coding schemes are characterized for
some cases of interest where the statistical differences between the side
information at the decoders and the presence of a non-zero distortion at Bob
can be fully exploited to guarantee secrecy.
Comment: 26 pages, 16 figures, 2 tables