On the Saddle-point Solution and the Large-Coalition Asymptotics of Fingerprinting Games
We study a fingerprinting game in which the number of colluders and the
collusion channel are unknown. The encoder embeds fingerprints into a host
sequence and provides the decoder with the capability to trace back pirated
copies to the colluders.
Fingerprinting capacity has recently been derived as the limit value of a
sequence of maximin games with mutual information as their payoff functions.
However, these games generally do not admit saddle-point solutions and are very
hard to solve numerically. Here under the so-called Boneh-Shaw marking
assumption, we reformulate the capacity as the value of a single two-person
zero-sum game, and show that it is achieved by a saddle-point solution.
If the maximal coalition size is k and the fingerprinting alphabet is binary,
we show that capacity decays quadratically with k. Furthermore, we prove
rigorously that the asymptotic capacity is 1/(2 k^2 ln 2), and we confirm our
earlier conjecture that Tardos' choice of the arcsine distribution
asymptotically maximizes the mutual information payoff function while the
interleaving attack minimizes it. Along with the asymptotic behavior, numerical
solutions to the game for small k are also presented.
Comment: submitted to IEEE Trans. on Information Forensics and Security
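The closed-form asymptotics quoted above can be sketched numerically; a minimal illustration (function names are ours) of the capacity formula 1/(2 k^2 ln 2) and of Tardos' arcsine density:

```python
import math

def asymptotic_capacity(k):
    # Asymptotic binary fingerprinting capacity (bits per symbol) for
    # coalitions of size k, per the abstract: 1 / (2 k^2 ln 2).
    return 1.0 / (2.0 * k * k * math.log(2))

def arcsine_pdf(p):
    # Tardos' arcsine distribution on (0, 1): f(p) = 1 / (pi sqrt(p(1-p))).
    # The abstract states this choice asymptotically maximizes the
    # mutual-information payoff.
    return 1.0 / (math.pi * math.sqrt(p * (1.0 - p)))
```

Doubling the coalition size k quarters the capacity, which is the quadratic decay stated in the abstract: `asymptotic_capacity(2*k) / asymptotic_capacity(k) == 0.25`.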
Gossip Codes for Fingerprinting: Construction, Erasure Analysis and Pirate Tracing
This work presents two new construction techniques for q-ary Gossip codes
from t-designs and Traceability schemes. These Gossip codes achieve the shortest
code length specified in terms of code parameters and can withstand erasures in
digital fingerprinting applications. This work presents the construction of
embedded Gossip codes for extending an existing Gossip code into a bigger code.
It discusses the construction of concatenated codes and the realisation of the
erasure model through concatenated codes.
Comment: 28 pages
Enhanced blind decoding of Tardos codes with new map-based functions
This paper presents a new decoder for probabilistic binary traitor tracing
codes under the marking assumption. It is based on a binary hypothesis testing
rule which integrates a collusion channel relaxation so as to obtain numerical
and simple accusation functions. This decoder is blind as no estimation of the
collusion channel prior to the accusation is required. Experiments show
that the proposed decoder gives better performance than the well-known
symmetric version of the Tardos decoder for common attack channels.
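The symmetric Tardos decoder that serves as the baseline above assigns each user a correlation score between their fingerprint and the pirated copy; a minimal sketch of that standard scoring rule (the function name is ours):

```python
import math

def symmetric_tardos_score(fingerprint, pirate_copy, biases):
    # Symmetric Tardos accusation score for one user over a binary code.
    # Per position with bias p: add sqrt((1-p)/p) if the user's symbol
    # matches the pirated symbol, subtract sqrt(p/(1-p)) otherwise.
    score = 0.0
    for x, y, p in zip(fingerprint, pirate_copy, biases):
        if x == y:
            score += math.sqrt((1.0 - p) / p)
        else:
            score -= math.sqrt(p / (1.0 - p))
    return score
```

A user is accused when their score exceeds a threshold chosen to bound the false-positive probability; the paper's blind decoder replaces this fixed rule with a hypothesis test over a relaxed collusion channel.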
Asymptotics of Fingerprinting and Group Testing: Tight Bounds from Channel Capacities
In this work we consider the large-coalition asymptotics of various
fingerprinting and group testing games, and derive explicit expressions for the
capacities for each of these models. We do this both for simple decoders (fast
but suboptimal) and for joint decoders (slow but optimal).
For fingerprinting, we show that if the pirate strategy is known, the
capacity often decreases linearly with the number of colluders, instead of
quadratically as in the uninformed fingerprinting game. For many attacks the
joint capacity is further shown to be strictly higher than the simple capacity.
For group testing, we improve upon known results about the joint capacities,
and derive new explicit asymptotics for the simple capacities. These show that
existing simple group testing algorithms are suboptimal, and that simple
decoders cannot asymptotically be as efficient as joint decoders. For the
traditional group testing model, we show that the gap between the simple and
joint capacities is a factor 1.44 for large numbers of defectives.
Comment: 14 pages, 6 figures
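The traditional group testing model referenced above is easy to make concrete: a pooled test is positive exactly when the pool contains at least one defective. A minimal simulation sketch (function names and the Bernoulli pooling design are ours, for illustration):

```python
import random

def pooled_test(pool, defectives):
    # Traditional (noiseless) group testing: a test on a pool of items
    # is positive iff the pool contains at least one defective.
    return any(item in defectives for item in pool)

def random_bernoulli_design(n_items, n_tests, q, rng):
    # A common random design: each item joins each pool
    # independently with probability q.
    return [[i for i in range(n_items) if rng.random() < q]
            for _ in range(n_tests)]
```

Simple decoders score each item from its own test outcomes in isolation, while joint decoders consider sets of items together; the abstract's factor-1.44 gap quantifies how much the joint approach gains asymptotically in this model.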
Preventing False Discovery in Interactive Data Analysis is Hard
We show that, under a standard hardness assumption, there is no
computationally efficient algorithm that given samples from an unknown
distribution can give valid answers to adaptively chosen
statistical queries. A statistical query asks for the expectation of a
predicate over the underlying distribution, and an answer to a statistical
query is valid if it is "close" to the correct expectation over the
distribution.
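The definitions in the paragraph above can be made concrete with a minimal sketch (function names are ours): an empirical answer to a statistical query, and the validity criterion comparing it to the true expectation.

```python
def answer_statistical_query(sample, predicate):
    # Empirical answer to a statistical query: the mean of a 0/1 predicate
    # over the sample, used as an estimate of its expectation over the
    # underlying distribution.
    return sum(predicate(x) for x in sample) / len(sample)

def is_valid(answer, true_expectation, tau):
    # An answer is "valid" if it is within tolerance tau of the
    # predicate's true expectation over the distribution.
    return abs(answer - true_expectation) <= tau
```

The hardness result concerns the adaptive setting, where each predicate may depend on the answers already returned, so the empirical means above can be steered into overfitting the fixed sample.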
Our result stands in stark contrast to the well known fact that exponentially
many statistical queries can be answered validly and efficiently if the queries
are chosen non-adaptively (no query may depend on the answers to previous
queries). Moreover, a recent work by Dwork et al. shows how to accurately
answer exponentially many adaptively chosen statistical queries via a
computationally inefficient algorithm; and how to answer a quadratic number of
adaptive queries via a computationally efficient algorithm. The latter result
implies that our result is tight up to a linear factor in the number of samples.
Conceptually, our result demonstrates that achieving statistical validity
alone can be a source of computational intractability in adaptive settings. For
example, in the modern large collaborative research environment, data analysts
typically choose a particular approach based on previous findings. False
discovery occurs if a research finding is supported by the data but not by the
underlying distribution. While the study of preventing false discovery in
Statistics is decades old, to the best of our knowledge our result is the first
to demonstrate a computational barrier. In particular, our result suggests that
the perceived difficulty of preventing false discovery in today's collaborative
research environment may be inherent.