Gossip Codes for Fingerprinting: Construction, Erasure Analysis and Pirate Tracing
This work presents two new construction techniques for q-ary Gossip codes
from t-designs and traceability schemes. These Gossip codes achieve the shortest
code length specified in terms of code parameters and can withstand erasures in
digital fingerprinting applications. This work presents the construction of
embedded Gossip codes for extending an existing Gossip code into a bigger code.
It also discusses the construction of concatenated codes and the realisation of
an erasure model through concatenated codes.
Comment: 28 pages
Fingerprinting with Minimum Distance Decoding
This work adopts an information theoretic framework for the design of
collusion-resistant coding/decoding schemes for digital fingerprinting. More
specifically, the minimum distance decision rule is used to identify 1 out of t
pirates. Achievable rates, under this detection rule, are characterized in two
distinct scenarios. First, we consider the averaging attack where a random
coding argument is used to show that the rate 1/2 is achievable with t=2
pirates. Our study is then extended to the general case of arbitrary t,
highlighting the underlying complexity-performance tradeoff. Overall, these
results establish the significant performance gains offered by minimum distance
decoding as compared to other approaches based on orthogonal codes and
correlation detectors. In the second scenario, we characterize the achievable
rates, with minimum distance decoding, under any collusion attack that
satisfies the marking assumption. For t=2 pirates, we show that the rate
is achievable using an ensemble of random linear
codes. For , the existence of a non-resolvable collusion attack, with
minimum distance decoding, for any non-zero rate is established. Inspired by
our theoretical analysis, we then construct coding/decoding schemes for
fingerprinting based on the celebrated Belief-Propagation framework. Using an
explicit repeat-accumulate code, we obtain a vanishingly small probability of
misidentification at rate 1/3 under averaging attack with t=2. For collusion
attacks which satisfy the marking assumption, we use a more sophisticated
accumulate repeat accumulate code to obtain a vanishingly small
misidentification probability at rate 1/9 with t=2. These results represent a
marked improvement over the best available designs in the literature.
Comment: 26 pages, 6 figures, submitted to IEEE Transactions on Information
Forensics and Security
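The minimum distance decision rule described in this abstract can be illustrated with a toy simulation. The following is a minimal sketch under stated assumptions, not the authors' scheme: the codebook is a plain random binary code (not the random linear or repeat-accumulate codes of the paper), the function names are hypothetical, and the code rate is far below the rates the abstract reports. Two pirates build a forgery that respects the marking assumption, and the decoder accuses the user whose codeword is closest in Hamming distance.

```python
import random


def hamming(u, v):
    """Number of positions where two equal-length words differ."""
    return sum(a != b for a, b in zip(u, v))


def make_codebook(num_users, length, rng):
    """One uniformly random binary fingerprint per user (illustrative choice)."""
    return [[rng.randrange(2) for _ in range(length)] for _ in range(num_users)]


def collude(word_a, word_b, rng):
    """Forgery obeying the marking assumption for two pirates:
    positions where both copies agree are left unchanged, and in every
    other position the pirates output one of their two symbols."""
    return [a if a == b else rng.choice((a, b)) for a, b in zip(word_a, word_b)]


def min_distance_decode(codebook, forgery):
    """Accuse the user whose codeword is closest to the forgery."""
    return min(range(len(codebook)), key=lambda u: hamming(codebook[u], forgery))


rng = random.Random(0)
codebook = make_codebook(num_users=32, length=512, rng=rng)
pirates = (3, 17)
forgery = collude(codebook[pirates[0]], codebook[pirates[1]], rng)
accused = min_distance_decode(codebook, forgery)
print("accused user:", accused, "is a pirate:", accused in pirates)
```

For two pirates, every position of such a forgery matches at least one of them, so the two pirate distances sum to the distance between their codewords. At least one pirate is therefore within half that distance of the forgery, while an innocent random codeword differs from it in about half of all positions.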
Dynamic Traitor Tracing for Arbitrary Alphabets: Divide and Conquer
We give a generic divide-and-conquer approach for constructing
collusion-resistant probabilistic dynamic traitor tracing schemes with larger
alphabets from schemes with smaller alphabets. This construction offers a
linear tradeoff between the alphabet size and the codelength. In particular, we
show that applying our results to the binary dynamic Tardos scheme of Laarhoven
et al. leads to schemes that are shorter by a factor equal to half the alphabet
size. Asymptotically, these codelengths correspond, up to a constant factor, to
the fingerprinting capacity for static probabilistic schemes. This gives a
hierarchy of probabilistic dynamic traitor tracing schemes, and bridges the gap
between the low bandwidth, high codelength scheme of Laarhoven et al. and the
high bandwidth, low codelength scheme of Fiat and Tassa.
Comment: 6 pages, 1 figure
Asymptotically false-positive-maximizing attack on non-binary Tardos codes
We use a method recently introduced by Simone and Skoric to study accusation
probabilities for non-binary Tardos fingerprinting codes. We generalize the
pre-computation steps in this approach to include a broad class of collusion
attack strategies. We analytically derive properties of a special attack that
asymptotically maximizes false accusation probabilities. We present numerical
results on sufficient code lengths for this attack, and explain the abrupt
transitions that occur in these results.
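As background for the accusation probabilities discussed in this abstract, the sketch below simulates the binary special case of a Tardos code with the commonly used symmetric accusation score. It is an illustration under assumptions, not the non-binary analysis or the false-positive-maximizing attack of the paper: the cutoff value, code length, and the interleaving attack are illustrative choices, and all function names are hypothetical.

```python
import math
import random


def draw_biases(length, cutoff, rng):
    """Per-position bias p = sin^2(r), r uniform, so that p follows the
    arcsine density restricted to [cutoff, 1 - cutoff]."""
    lo = math.asin(math.sqrt(cutoff))
    return [math.sin(rng.uniform(lo, math.pi / 2 - lo)) ** 2 for _ in range(length)]


def draw_codeword(biases, rng):
    """Each bit is 1 with probability equal to the bias of its position."""
    return [1 if rng.random() < p else 0 for p in biases]


def symmetric_score(codeword, forgery, biases):
    """Symmetric accusation score: reward agreement on rare symbols,
    penalise disagreement; an innocent user has mean score zero."""
    total = 0.0
    for x, y, p in zip(codeword, forgery, biases):
        p_y = p if y == 1 else 1.0 - p  # probability of the observed symbol
        if x == y:
            total += math.sqrt((1.0 - p_y) / p_y)
        else:
            total -= math.sqrt(p_y / (1.0 - p_y))
    return total


def interleaving_attack(coalition_words, rng):
    """Each forged position copies the bit of a randomly chosen colluder."""
    return [rng.choice(column) for column in zip(*coalition_words)]


rng = random.Random(7)
biases = draw_biases(length=2000, cutoff=0.01, rng=rng)
words = [draw_codeword(biases, rng) for _ in range(20)]
coalition = [0, 1, 2]
forgery = interleaving_attack([words[j] for j in coalition], rng)
scores = [symmetric_score(w, forgery, biases) for w in words]
ranked = sorted(range(len(words)), key=lambda j: scores[j], reverse=True)
print("highest-scoring users:", ranked[:3])
```

In this setting an innocent user's score has mean zero and unit variance per position, while each colluder's score grows linearly with the code length, so a threshold between the two separates them once the code is long enough. The sufficient code length for a given false accusation probability is the quantity the abstract studies.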
Contribution to the construction of fingerprinting and watermarking schemes to protect mobile agents and multimedia content
The main characteristic of fingerprinting codes is the need for high error-correction capacity, because they are designed to withstand collusion attacks that damage many symbols of the codewords. Moreover, the use of fingerprinting schemes depends on the watermarking system used to embed the codeword into the content and on how it honors the marking assumption. In this sense, even though fingerprinting codes have mainly been used to protect multimedia content, using them in software protection systems is an option worth considering.
This thesis studies how to use codes that have iterative decoding algorithms, mainly turbo-codes, to solve the fingerprinting problem. Initially, it studies the effectiveness of current approaches based on concatenating traditional fingerprinting schemes with convolutional codes and turbo-codes. It is shown that this kind of construction ends up generating a high number of false positives. Even though this thesis contains some proposals to improve these schemes, the direct use of turbo-codes, without concatenation with a fingerprinting code as inner code, has also been considered. It is shown that turbo-codes with appropriate constituent codes are a valid alternative for environments with hundreds of users and 2 or 3 traitors. As constituent codes, we have chosen low-rate convolutional codes with maximum free distance.
As for how to use fingerprinting codes with watermarking schemes, we have studied the option of using watermarking systems based on informed coding and informed embedding. It has been found that, because several encodings are available for the same symbol, their applicability to embedding fingerprints is very limited. In this sense, some modifications to these systems have been proposed in order to adapt them properly to fingerprinting applications. Moreover, the behavior of the YouTube service on a video produced by a collusion of 2 users, and its impact on that video, have been studied. We have also studied the optimal parameters for viable tracing of users who have colluded and used YouTube to redistribute the copy generated by their collusion attack.
Finally, we have studied how to apply fingerprinting schemes and software watermarking to address the problem of malicious hosts on mobile agent platforms. In this regard, four different alternatives have been proposed to protect the agent, depending on whether the goal is only to detect the attack or to prevent it in real time. Two of these proposals focus on the protection of intrusion detection systems based on mobile agents. Moreover, each of these solutions has several implications in terms of infrastructure and complexity.
Postprint (published version)
Preventing False Discovery in Interactive Data Analysis is Hard
We show that, under a standard hardness assumption, there is no
computationally efficient algorithm that given samples from an unknown
distribution can give valid answers to adaptively chosen
statistical queries. A statistical query asks for the expectation of a
predicate over the underlying distribution, and an answer to a statistical
query is valid if it is "close" to the correct expectation over the
distribution.
Our result stands in stark contrast to the well known fact that exponentially
many statistical queries can be answered validly and efficiently if the queries
are chosen non-adaptively (no query may depend on the answers to previous
queries). Moreover, a recent work by Dwork et al. shows how to accurately
answer exponentially many adaptively chosen statistical queries via a
computationally inefficient algorithm; and how to answer a quadratic number of
adaptive queries via a computationally efficient algorithm. The latter result
implies that our result is tight up to a linear factor in
Conceptually, our result demonstrates that achieving statistical validity
alone can be a source of computational intractability in adaptive settings. For
example, in the modern large collaborative research environment, data analysts
typically choose a particular approach based on previous findings. False
discovery occurs if a research finding is supported by the data but not by the
underlying distribution. While the study of preventing false discovery in
Statistics is decades old, to the best of our knowledge our result is the first
to demonstrate a computational barrier. In particular, our result suggests that
the perceived difficulty of preventing false discovery in today's collaborative
research environment may be inherent.
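The notion of false discovery used in this abstract (a finding supported by the data but not by the underlying distribution) can be made concrete with a small simulation. The sketch below is an illustration under assumptions and not the hardness construction of the paper: the distribution, the sample sizes, and the names `draw_sample`, `answer`, and `final_query` are hypothetical. Every query is answered with its plain empirical mean, and one final query is chosen adaptively from the earlier answers.

```python
import random


def draw_sample(n, d, rng):
    """n points (x, y): d uniform +/-1 features and an independent +/-1 label."""
    return [([rng.choice((-1, 1)) for _ in range(d)], rng.choice((-1, 1)))
            for _ in range(n)]


def answer(query, sample):
    """Empirical answer to a statistical query: the mean of a 0/1 predicate."""
    return sum(query(x, y) for x, y in sample) / len(sample)


rng = random.Random(1)
n, d = 200, 400
sample = draw_sample(n, d, rng)

# Round 1: d non-adaptive queries, "does feature j agree with the label?".
# The true expectation of each one is exactly 0.5.
first_answers = [answer(lambda x, y, j=j: x[j] == y, sample) for j in range(d)]

# Round 2: one adaptive query built from the round-1 answers. Each feature
# votes with the sign that looked correlated with the label on the sample.
signs = [1 if a >= 0.5 else -1 for a in first_answers]


def final_query(x, y):
    vote = sum(s * xj for s, xj in zip(signs, x))
    return (1 if vote >= 0 else -1) == y


empirical = answer(final_query, sample)

# Labels are independent of the features, so the true expectation is 0.5.
# A large fresh sample approximates it.
fresh = draw_sample(2000, d, rng)
approx_truth = answer(final_query, fresh)
print("answer on the analysed sample:", empirical)
print("answer on fresh data:", approx_truth)
```

The adaptively chosen query looks strongly supported by the analysed sample, although its true expectation is one half, so the empirical answer is not valid in the abstract's sense. The result stated in the abstract concerns how many such adaptive queries any computationally efficient mechanism can answer validly, which this toy example does not address.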