38 research outputs found

    GAIT: A Geometric Approach to Information Theory

    Full text link
    We advocate the use of a notion of entropy that reflects the relative abundances of the symbols in an alphabet, as well as the similarities between them. This concept was originally introduced in theoretical ecology to study the diversity of ecosystems. Based on this notion of entropy, we introduce geometry-aware counterparts for several concepts and theorems in information theory. Notably, our proposed divergence exhibits performance on par with state-of-the-art methods based on the Wasserstein distance, but enjoys a closed-form expression that can be computed efficiently. We demonstrate the versatility of our method via experiments on a broad range of domains: training generative models, computing image barycenters, approximating empirical measures and counting modes. Comment: Replaces the previous version named "GEAR: Geometry-Aware Rényi Information
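
    A minimal sketch of the kind of similarity-sensitive entropy described above, in the spirit of the Leinster–Cobbold ecological diversity measure of order 1; the NumPy implementation, function name, and example values are illustrative assumptions, not the authors' code.

    import numpy as np

    def similarity_sensitive_entropy(p, Z):
        """Order-1 entropy of distribution p that accounts for symbol similarities.

        p : probability vector over the alphabet (sums to 1)
        Z : similarity matrix with entries in [0, 1] and ones on the diagonal;
            Z = identity recovers ordinary Shannon entropy.
        """
        p = np.asarray(p, dtype=float)
        Zp = Z @ p                      # "ordinariness" of each symbol under Z
        support = p > 0
        return -np.sum(p[support] * np.log(Zp[support]))

    # Example: two very similar symbols plus one dissimilar symbol.
    p = np.array([0.4, 0.4, 0.2])
    Z = np.array([[1.0, 0.9, 0.0],
                  [0.9, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
    print(similarity_sensitive_entropy(p, np.eye(3)))  # plain Shannon entropy
    print(similarity_sensitive_entropy(p, Z))          # lower: similar symbols add less diversity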

    Towards private and robust machine learning for information security

    Get PDF
    Many problems in information security are pattern recognition problems. For example, determining if a digital communication can be trusted amounts to certifying that the communication does not carry malicious or secret content, which can be distilled into the problem of recognising the difference between benign and malicious content. At a high level, machine learning is the study of how patterns are formed within data, and how learning these patterns generalises beyond the potentially limited data pool at a practitioner’s disposal, and so has become a powerful tool in information security. In this work, we study the benefits machine learning can bring to two problems in information security. Firstly, we show that machine learning can be used to detect which websites are visited by an internet user over an encrypted connection. By analysing timing and packet size information of encrypted network traffic, we train a machine learning model that predicts the target website given a stream of encrypted network traffic, even if browsing is performed over an anonymous communication network. Secondly, in addition to studying how machine learning can be used to design attacks, we study how it can be used to solve the problem of hiding information within a cover medium, such as an image or an audio recording, which is commonly referred to as steganography. How well an algorithm can hide information within a cover medium amounts to how well the algorithm models and exploits areas of redundancy. This can again be reduced to a pattern recognition problem, and so we apply machine learning to design a steganographic algorithm that efficiently hides a secret message within an image. Following this, we proceed with discussions surrounding why machine learning is not a panacea for information security, and can be an attack vector in and of itself. We show that machine learning can leak private and sensitive information about the data it was trained on, and how malicious actors can exploit vulnerabilities in these learning algorithms to compel them to exhibit adversarial behaviours. Finally, we examine the disconnect between image recognition as performed by humans and by machine learning models. While human classification of an image is relatively robust to noise, machine learning models do not possess this property. We show how an attacker can cause targeted misclassifications against an entire data distribution by exploiting this property, and go on to introduce a mitigation that ameliorates this undesirable trait of machine learning.
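
    As an illustration of the website-fingerprinting setting described above, the sketch below trains a classifier on simple packet-size and timing summary statistics extracted from encrypted traces; the feature set, the random-forest model, and all names are illustrative assumptions, not the thesis's actual pipeline.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    def traffic_features(packet_sizes, inter_arrival_times):
        """Summarise one encrypted traffic trace as a fixed-length feature vector."""
        sizes = np.asarray(packet_sizes, dtype=float)
        gaps = np.asarray(inter_arrival_times, dtype=float)
        return np.array([
            sizes.sum(), sizes.mean(), sizes.std(), len(sizes),  # volume statistics
            gaps.mean(), gaps.std(), gaps.max(),                  # timing statistics
        ])

    # Synthetic stand-ins for captured traces: in practice each trace would be
    # featurised with traffic_features() and labelled with the visited website.
    rng = np.random.default_rng(0)
    X = np.stack([traffic_features(rng.integers(60, 1500, size=100),
                                   rng.exponential(0.05, size=100))
                  for _ in range(500)])
    y = rng.integers(0, 10, size=500)   # 10 candidate websites (closed world)

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
    print("closed-world accuracy:", clf.score(X_test, y_test))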

    Generalized Entropies and Metric-Invariant Optimal Countermeasures for Information Leakage Under Symmetric Constraints

    Get PDF
    We introduce a novel generalization of entropy and conditional entropy from which most definitions from the literature can be derived as particular cases. Within this general framework, we investigate the problem of designing countermeasures for information leakage. In particular, we seek metric-invariant solutions, i.e., solutions that are robust to the choice of entropy used to quantify the leakage. The problem can be modelled as an information channel from the system to an adversary, and the countermeasures can be seen as modifying this channel in order to minimise the amount of information that the outputs reveal about the inputs. Our main result is to fully solve the problem under the highly symmetrical design constraint that the number of inputs that can produce the same output is capped. Our proof is constructive, and the optimal channels and the minimum leakage are derived in closed form. Comment: Accepted to IEEE Transactions on Information Theory, in November 201
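
    For concreteness, the sketch below computes one common instance of the leakage notions this framework generalises: multiplicative Bayes (min-entropy) leakage of a channel matrix under a uniform prior. The example channel and function name are illustrative assumptions, not the paper's construction.

    import numpy as np

    def min_entropy_leakage(C):
        """Min-entropy leakage (in bits) of channel C under a uniform prior.

        C[x, y] = P(output y | secret input x); each row sums to 1.
        Leakage = log2( sum_y max_x C[x, y] ), i.e. how much observing the
        output boosts the adversary's chance of guessing the input in one try.
        """
        C = np.asarray(C, dtype=float)
        return np.log2(C.max(axis=0).sum())

    # A noisy 3-input channel: outputs reveal some, but not all, information.
    C = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.1, 0.1, 0.8]])
    print(min_entropy_leakage(C))                      # strictly between 0 and log2(3)
    print(min_entropy_leakage(np.eye(3)))              # perfect leak: log2(3) bits
    print(min_entropy_leakage(np.full((3, 3), 1/3)))   # no leakage: 0 bits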

    Information Theory and Machine Learning

    Get PDF
    The recent successes of machine learning, especially regarding systems based on deep neural networks, have encouraged further research activities and raised a new set of challenges in understanding and designing complex machine learning algorithms. New applications require learning algorithms to be distributed, have transferable learning results, use computation resources efficiently, converge quickly in online settings, have performance guarantees, satisfy fairness or privacy constraints, incorporate domain knowledge on model structures, etc. A new wave of developments in statistical learning theory and information theory has set out to address these challenges. This Special Issue, "Machine Learning and Information Theory", aims to collect recent results in this direction reflecting a diverse spectrum of visions and efforts to extend conventional theories and develop analysis tools for these complex machine learning systems.