GAIT: A Geometric Approach to Information Theory
We advocate the use of a notion of entropy that reflects the relative
abundances of the symbols in an alphabet, as well as the similarities between
them. This concept was originally introduced in theoretical ecology to study
the diversity of ecosystems. Based on this notion of entropy, we introduce
geometry-aware counterparts for several concepts and theorems in information
theory. Notably, our proposed divergence exhibits performance on par with
state-of-the-art methods based on the Wasserstein distance, but enjoys a
closed-form expression that can be computed efficiently. We demonstrate the
versatility of our method via experiments on a broad range of domains: training
generative models, computing image barycenters, approximating empirical
measures and counting modes. Comment: Replaces the previous version named "GEAR: Geometry-Aware Rényi Information"
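The ecology-inspired entropy the abstract refers to can be made concrete with a small sketch. Assuming it follows the similarity-sensitive entropy of Leinster and Cobbold from theoretical ecology (an assumption on my part; the kernel `K` and function name below are illustrative, not the paper's API), a minimal NumPy version is:

```python
import numpy as np

def similarity_entropy(p, K):
    """Similarity-sensitive entropy of order 1 (Leinster-Cobbold style).

    p : probability vector over an alphabet of n symbols.
    K : n x n similarity matrix with K[i, i] = 1 and 0 <= K[i, j] <= 1.
    With K = identity this reduces to ordinary Shannon entropy.
    """
    p = np.asarray(p, dtype=float)
    Kp = K @ p            # expected similarity of each symbol to the population
    mask = p > 0          # ignore zero-probability symbols
    return -np.sum(p[mask] * np.log(Kp[mask]))

# Uniform distribution over 3 symbols, no similarity: Shannon entropy ln 3.
p = np.array([1/3, 1/3, 1/3])
print(similarity_entropy(p, np.eye(3)))   # ~1.0986 = ln 3

# Making two symbols highly similar lowers the effective diversity,
# which is exactly what a geometry-aware entropy should capture.
K = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
print(similarity_entropy(p, K))           # strictly less than ln 3
```

The second call illustrates the abstract's point: the entropy reflects not only relative abundances but also similarities between symbols.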
Once again, tuition has risen at the College, and students believe the increase exceeds the recent nationwide jump. Both the students and staff of the College are currently dissatisfied with the library; they believe its numbering system should be switched over to something more modern. The administration is examining the funding of campus groups. William Darr, from Earlham College, will appear at Wooster to display his Japanese prints. Wooster recently beat Wesleyan in basketball, and hopes to go on to a championship. https://openworks.wooster.edu/voice1961-1970/1100/thumbnail.jp
Towards private and robust machine learning for information security
Many problems in information security are pattern recognition problems. For example, determining whether a digital communication can be trusted amounts to certifying that it does not carry malicious or secret content, which can be distilled into the problem of recognising the difference between benign and malicious content. At a high level, machine learning is the study of how patterns form within data and how learning these patterns generalises beyond the potentially limited data pool at a practitioner’s disposal, and so it has become a powerful tool in information security. In this work, we study the benefits machine learning can bring to two problems in information security. Firstly, we show that machine learning can be used to detect which websites are visited by an internet user over an encrypted connection. By analysing timing and packet-size information of encrypted network traffic, we train a machine learning model that predicts the target website given a stream of encrypted network traffic, even if browsing is performed over an anonymous communication network. Secondly, in addition to studying how machine learning can be used to design attacks, we study how it can be used to solve the problem of hiding information within a cover medium, such as an image or an audio recording, commonly referred to as steganography. How well an algorithm can hide information within a cover medium amounts to how well it models and exploits areas of redundancy. This can again be reduced to a pattern recognition problem, and so we apply machine learning to design a steganographic algorithm that efficiently hides a secret message within an image. Following this, we discuss why machine learning is not a panacea for information security and can be an attack vector in and of itself.
We show that machine learning can leak private and sensitive information about the data it was trained on, and how malicious actors can exploit vulnerabilities in these learning algorithms to compel them to exhibit adversarial behaviours. Finally, we examine the disconnect between image recognition as performed by humans and by machine learning models. While human classification of an image is relatively robust to noise, machine learning models do not share this property. We show how an attacker can cause targeted misclassifications against an entire data distribution by exploiting this property, and go on to introduce a mitigation that ameliorates this undesirable trait of machine learning.
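The website-fingerprinting attack described above can be illustrated with a deliberately simplified toy. Everything below is a hypothetical stand-in (the features, the nearest-centroid classifier, and the synthetic traces are mine, not the thesis's pipeline); real attacks use far richer traffic features and stronger models:

```python
import numpy as np

def featurize(trace):
    """Turn a trace of signed packet sizes (+ = outgoing, - = incoming)
    into a small fixed-length feature vector."""
    t = np.asarray(trace, dtype=float)
    out_pkts = t[t > 0]
    in_pkts = t[t < 0]
    return np.array([
        len(t),                           # total packets
        out_pkts.sum(),                   # bytes sent
        -in_pkts.sum(),                   # bytes received
        len(out_pkts) / max(len(t), 1),   # fraction of outgoing packets
    ])

class NearestCentroid:
    """Minimal classifier: assign a trace to the site whose mean
    feature vector is closest in Euclidean distance."""
    def fit(self, X, y):
        self.labels_ = sorted(set(y))
        self.centroids_ = np.array(
            [np.mean([x for x, lab in zip(X, y) if lab == label], axis=0)
             for label in self.labels_])
        return self

    def predict(self, X):
        d = np.linalg.norm(np.asarray(X)[:, None, :] - self.centroids_, axis=2)
        return [self.labels_[i] for i in d.argmin(axis=1)]

# Toy traces for two hypothetical sites: site "A" is download-heavy,
# site "B" is chatty with many small outgoing packets.
traces_a = [[100, -1500, -1500, -1400], [120, -1500, -1300]]
traces_b = [[200, 180, -300, 150, 170], [210, 190, -280, 160]]
X = [featurize(t) for t in traces_a + traces_b]
y = ["A", "A", "B", "B"]

clf = NearestCentroid().fit(X, y)
print(clf.predict([featurize([110, -1500, -1450])]))  # ['A']
```

The point of the sketch is only that coarse size and direction statistics of encrypted traffic already separate the two toy sites, which is why encryption alone does not hide the destination.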
Generalized Entropies and Metric-Invariant Optimal Countermeasures for Information Leakage Under Symmetric Constraints
We introduce a novel generalization of entropy and conditional entropy from
which most definitions from the literature can be derived as particular cases.
Within this general framework, we investigate the problem of designing
countermeasures for information leakage. In particular, we seek
metric-invariant solutions, i.e., solutions that are robust to the choice of
entropy used to quantify the leakage. The problem can be modelled as an information
channel from the system to an adversary, and the countermeasures can be seen as
modifying this channel in order to minimise the amount of information that the
outputs reveal about the inputs. Our main result is to fully solve the problem
under the highly symmetrical design constraint that the number of inputs that
can produce the same output is capped. Our proof is constructive and the
optimal channels and the minimum leakage are derived in closed form. Comment: Accepted to IEEE Transactions on Information Theory in November 201
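The channel-based view of leakage in this abstract can be illustrated with one classical instance of the many entropy-based measures such a framework generalizes: min-entropy leakage. The function name and examples below are my own illustrative sketch, not the paper's construction:

```python
import numpy as np

def min_entropy_leakage(prior, C):
    """Min-entropy leakage (in bits) of a channel C, with rows indexed by
    secret inputs and columns by adversary-observable outputs; C[x][y] is
    the probability of output y given input x."""
    prior = np.asarray(prior, dtype=float)
    C = np.asarray(C, dtype=float)
    v_prior = prior.max()               # adversary's best-guess probability a priori
    joint = prior[:, None] * C          # p(x) * C[y|x]
    v_post = joint.max(axis=0).sum()    # best guess after seeing each output
    return np.log2(v_post / v_prior)

uniform = [0.25] * 4

# A noiseless (identity) channel leaks everything: log2(4) = 2 bits.
print(min_entropy_leakage(uniform, np.eye(4)))   # 2.0

# A channel that maps every pair of inputs to the same output leaks 1 bit.
# Capping how many inputs can produce the same output is the kind of
# symmetric design constraint under which the paper solves the problem.
C = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
print(min_entropy_leakage(uniform, C))           # 1.0
```

A countermeasure in this setting modifies `C` so that outputs reveal as little as possible about inputs, and a metric-invariant countermeasure would remain optimal if `min_entropy_leakage` were swapped for another entropy-based measure.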
Information Theory and Machine Learning
The recent successes of machine learning, especially regarding systems based on deep neural networks, have encouraged further research activities and raised a new set of challenges in understanding and designing complex machine learning algorithms. New applications require learning algorithms to be distributed, have transferable learning results, use computational resources efficiently, converge quickly in online settings, have performance guarantees, satisfy fairness or privacy constraints, incorporate domain knowledge on model structures, etc. A new wave of developments in statistical learning theory and information theory has set out to address these challenges. This Special Issue, "Machine Learning and Information Theory", aims to collect recent results in this direction, reflecting a diverse spectrum of visions and efforts to extend conventional theories and develop analysis tools for these complex machine learning systems.