13 research outputs found

    On the Reliability Function of Distributed Hypothesis Testing Under Optimal Detection

    The distributed hypothesis testing problem with full side-information is studied. The trade-off (reliability function) between the two types of error exponents under a rate constraint is characterized in two steps. First, the problem is reduced to that of determining the reliability function of channel codes designed for detection (in analogy to a similar result connecting the reliability function of distributed lossless compression to that of ordinary channel codes). Second, a single-letter random-coding bound based on a hierarchical ensemble, as well as a single-letter expurgated bound, are derived for the reliability of channel-detection codes. Both bounds are derived for a system that employs the optimal detection rule. We conjecture that the resulting random-coding bound is ensemble-tight and, consequently, optimal within the class of quantization-and-binning schemes.
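
    A brief notational aside (a common convention, not taken from the abstract): with hypotheses $H_0$ and $H_1$ and a decision $\hat{H}$ formed from the rate-$R$ message and the side-information, the two error probabilities and their exponents are typically defined as
    \[
    \alpha_n = \Pr\{\hat{H} = H_1 \mid H_0\}, \qquad \beta_n = \Pr\{\hat{H} = H_0 \mid H_1\},
    \]
    \[
    E_0 = \liminf_{n \to \infty} -\tfrac{1}{n} \log \alpha_n, \qquad E_1 = \liminf_{n \to \infty} -\tfrac{1}{n} \log \beta_n,
    \]
    and the reliability function is the optimal trade-off $E_1(E_0, R)$, i.e., the largest achievable $E_1$ for a given $E_0$ at rate $R$.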

    A Relationship between Quantization and Distribution Rates of Digitally Fingerprinted Data

    This paper considers a fingerprinting system where $2^{nR_W}$ distinct Gaussian fingerprints are embedded in respective copies of an $n$-dimensional i.i.d. Gaussian image. Copies are distributed to customers in digital form, using $R_Q$ bits per image dimension. By means of a coding theorem, a rate region for the pair $(R_Q, R_W)$ is established such that (i) the average quadratic distortion between the original image and each distributed copy does not exceed a specified level; and (ii) the error probability in decoding the embedded fingerprint in the distributed copy approaches zero asymptotically in $n$.
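
    For orientation (our notation, not the paper's): writing $X^n$ for the original image and $\hat{X}^n(w)$ for the distributed copy carrying fingerprint $w \in \{1, \dots, 2^{nR_W}\}$, the two requirements read
    \[
    \frac{1}{n}\,\mathbb{E}\big\|X^n - \hat{X}^n(w)\big\|^2 \le D \quad \text{for every } w, \qquad \Pr\{\hat{W} \neq W\} \to 0 \ \text{ as } n \to \infty,
    \]
    where $D$ is the specified distortion level and $\hat{W}$ is the fingerprint decoded from the (possibly degraded) distributed copy.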

    A General Formula for the Mismatch Capacity

    The fundamental limits of channels with mismatched decoding are addressed. A general formula is established for the mismatch capacity of a general channel, defined as a sequence of conditional distributions with a general sequence of decoding metrics. We deduce an identity between the Verd\'{u}-Han general channel capacity formula and the mismatch capacity formula applied to the maximum-likelihood decoding metric. Further, several upper bounds on the capacity are provided, and a simpler expression for a lower bound is derived for the case of a non-negative decoding metric. The general formula is specialized to the case of finite input and output alphabet channels with a type-dependent metric. The closely related problem of threshold mismatched decoding is also studied, and a general expression for the threshold mismatch capacity is obtained. As an example, we state a general expression for the erasures-only capacity of the finite input and output alphabet channel. We observe that for every channel there exists a (matched) threshold decoder which is capacity achieving. Additionally, necessary and sufficient conditions are stated for a channel to have a strong converse. Csisz\'{a}r and Narayan's conjecture is proved for bounded metrics, providing a positive answer to the open problem introduced in [1], i.e., that the "product-space" improvement of the lower random-coding bound, $C_q^{(\infty)}(W)$, is indeed the mismatch capacity of the discrete memoryless channel $W$. We conclude by presenting an identity between the threshold capacity and $C_q^{(\infty)}(W)$ in the DMC case.
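
    For reference, the Verd\'{u}-Han formula mentioned above (a standard result, stated here in our notation): the capacity of a general channel $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ under maximum-likelihood decoding is
    \[
    C = \sup_{\mathbf{X}} \underline{I}(\mathbf{X}; \mathbf{Y}), \qquad \underline{I}(\mathbf{X}; \mathbf{Y}) = \sup\left\{ R : \lim_{n \to \infty} \Pr\left[ \frac{1}{n} \log \frac{W^n(Y^n \mid X^n)}{P_{Y^n}(Y^n)} < R \right] = 0 \right\},
    \]
    the supremum over input processes of the liminf in probability of the normalized information density; the identity in the abstract equates this quantity with the mismatch capacity formula evaluated at the maximum-likelihood metric.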

    Digital Watermarking, Fingerprinting and Compression: An Information-Theoretic Perspective

    The ease with which digital data can be duplicated and distributed over the media and the Internet has raised many concerns about copyright infringement. In many situations, multimedia data (e.g., images, music, movies) are illegally circulated, thus violating intellectual property rights. In an attempt to overcome this problem, watermarking has been suggested in the literature as the most effective means for copyright protection and authentication. Watermarking is the procedure whereby information (pertaining to owner and/or copyright) is embedded into host data, such that it is: (i) hidden, i.e., not perceptually visible; and (ii) recoverable, even after a (possibly malicious) degradation of the protected work. In this thesis, we prove some theoretical results that establish the fundamental limits of a general class of watermarking schemes. The main focus of this thesis is the problem of joint watermarking and compression of images, which can be briefly described as follows: due to bandwidth or storage constraints, a watermarked image is distributed in quantized form, using $R_Q$ bits per image dimension, and is subject to some additional degradation (possibly due to malicious attacks). The hidden message carries $R_W$ bits per image dimension. Our main result is the determination of the region of allowable rates $(R_Q, R_W)$ such that: (i) an average distortion constraint between the original and the watermarked/compressed image is satisfied, and (ii) the hidden message is detected from the degraded image with very high probability. Using notions from information theory, we prove coding theorems that establish the rate region in the following cases: (a) general i.i.d. image distributions, distortion constraints, and memoryless attacks; (b) memoryless attacks combined with collusion (for fingerprinting applications); and (c) general (not necessarily stationary or ergodic) Gaussian image distributions and attacks, with average quadratic distortion constraints. Moreover, we prove a multi-user version of a result by Costa on the capacity of a Gaussian channel with known interference at the encoder.
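
    For context, the single-user result by Costa referenced above ("writing on dirty paper", stated here in standard form, not the thesis's multi-user extension): for the channel $Y = X + S + Z$ with input power constraint $\frac{1}{n}\,\mathbb{E}\|X^n\|^2 \le P$, i.i.d. Gaussian interference $S \sim \mathcal{N}(0, Q)$ known non-causally to the encoder only, and i.i.d. Gaussian noise $Z \sim \mathcal{N}(0, N)$, the capacity is
    \[
    C = \frac{1}{2} \log_2\!\left(1 + \frac{P}{N}\right)
    \]
    bits per dimension, exactly as if the interference $S$ were absent.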

    Making Machines Learn. Applications of Cultural Analytics to the Humanities

    The digitization of several million books by Google in 2011 meant the popularization of a new kind of humanities research powered by the treatment of cultural objects as data. Culturomics, as it is called, was born, and other initiatives resonated with this methodological approach, as is the case with the recently formed Digital Humanities or Cultural Analytics. Intrinsically, these new quantitative approaches to culture all borrow techniques and methods developed under the wing of the exact sciences, such as computer science, machine learning, and statistics. There are numerous examples of studies that take advantage of what treating objects as data has to offer for the understanding of the human. This new data science, now applied to current trends in culture, can also be replicated to study the more traditional humanities. Led by proper intellectual inquiry, an adequate use of technology may bring answers to questions intractable by other means, or add evidence to long-held assumptions based on a canon built from few examples. This dissertation argues in favor of such an approach. Three case studies are considered. First, in the spirit of big and smart data, we collected and analyzed more than 120,000 pictures of paintings from all periods of art history, to gain clear insight into how the beauty of depicted faces, in the framework of neuroscience and evolutionary theory, has changed over time. A second study covers the nuanced modes of emotion employed by the Spanish Golden Age playwright Calderón de la Barca to empathize with his audience. By means of sentiment analysis, a technique strongly supported by machine learning, we shed some light on the different fictional characters, and on how they interact and convey messages otherwise invisible to the public. The last case is a study of non-traditional authorship attribution techniques applied to the forefather of the modern novel, the Lazarillo de Tormes. In the end, we conclude that the successful application of cultural analytics and computer science techniques to traditional humanistic endeavours has been enriching and validating.
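
    As a purely illustrative aside (the abstract does not specify the dissertation's methods): a minimal Python sketch of one widely used non-traditional authorship attribution technique, Burrows' Delta over most-frequent-word frequencies. All texts and the tiny word list below are hypothetical placeholders.

        from collections import Counter

        def relative_freqs(text, vocab):
            """Relative frequency of each vocabulary word in a whitespace-tokenized text."""
            tokens = text.lower().split()
            counts = Counter(tokens)
            total = max(len(tokens), 1)
            return [counts[w] / total for w in vocab]

        def burrows_delta(vocab, candidates, disputed):
            """Mean absolute z-score difference between the disputed text and each candidate."""
            profiles = {a: relative_freqs(t, vocab) for a, t in candidates.items()}
            target = relative_freqs(disputed, vocab)
            n = len(vocab)
            means = [sum(p[i] for p in profiles.values()) / len(profiles) for i in range(n)]
            stds = [(sum((p[i] - means[i]) ** 2 for p in profiles.values()) / len(profiles)) ** 0.5 or 1.0
                    for i in range(n)]  # fall back to 1.0 when a feature has zero variance
            z = lambda vec: [(vec[i] - means[i]) / stds[i] for i in range(n)]
            zt = z(target)
            return {a: sum(abs(zi - ti) for zi, ti in zip(z(p), zt)) / n
                    for a, p in profiles.items()}

        # Hypothetical placeholder corpora; real use needs the full candidate texts.
        candidates = {"author_A": "la vida de un hidalgo ...", "author_B": "el ingenio de la corte ..."}
        disputed = "la vida del lazarillo ..."
        vocab = ["de", "la", "que", "y", "el"]  # in practice: a few hundred most-frequent words
        scores = burrows_delta(vocab, candidates, disputed)
        print(min(scores, key=scores.get))  # lowest Delta = closest stylistic match

    The attribution decision is the candidate with the smallest Delta; with toy strings this only shows the shape of the method, not a result about the Lazarillo.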