3 research outputs found

    The benefits of adversarial defense in generalization

    Recent research has shown that models induced by machine learning, and in particular by deep learning, can easily be fooled by an adversary who carefully crafts modifications of the input data that are imperceptible, at least to a human, or physically plausible. This discovery gave birth to a new field of research, adversarial machine learning, in which new methods of attack and defense are developed continuously, mirroring what has long been happening in cybersecurity. In this paper we show that the drawbacks of inducing models that are less prone to being misled can actually come with some benefits when it comes to assessing their generalization abilities. We show these benefits both from a theoretical perspective, using state-of-the-art statistical learning theory, and through practical examples.
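
    The abstract does not describe a specific attack, but a minimal sketch of the classic Fast Gradient Sign Method (Goodfellow et al., 2015) illustrates how such imperceptible modifications are typically crafted; the model, loss function, and epsilon below are illustrative placeholders, not this paper's setup.

    # A minimal FGSM sketch in PyTorch: perturb the input in the direction
    # that increases the loss, by at most epsilon per coordinate.
    import torch

    def fgsm_attack(model, loss_fn, x, y, epsilon=0.03):
        """Return an adversarially perturbed copy of input x."""
        x_adv = x.clone().detach().requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        loss.backward()
        # Each coordinate moves by +/- epsilon along the gradient sign:
        # a small (often imperceptible) but adversarial perturbation.
        return (x_adv + epsilon * x_adv.grad.sign()).detach()

    # Toy usage with a linear "model" on a random input (placeholders).
    model = torch.nn.Linear(4, 2)
    x, y = torch.randn(1, 4), torch.tensor([1])
    x_adv = fgsm_attack(model, torch.nn.CrossEntropyLoss(), x, y)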

    Definition and learning of logic-based kernels for categorical data, and application to collaborative filtering

    The continuous pursuit of better prediction quality has gradually led to the development of increasingly complex machine learning models, e.g., deep neural networks. Despite their great success in many domains, the black-box nature of these models makes them unsuitable for applications in which understanding the model is at least as important as prediction accuracy, such as medical applications. On the other hand, more interpretable models, such as decision trees, are in general much less accurate. In this thesis we try to merge the positive aspects of these two realities by injecting interpretable elements into complex methods. We focus on kernel methods, which have an elegant framework that decouples learning algorithms from data representations. The first main contribution of this thesis is the proposal of a new family of Boolean kernels, i.e., kernels defined on binary data, with the aim of creating interpretable feature spaces. Assuming binary input vectors, the core idea is to build embedding spaces in which the dimensions represent logical formulas (of a specific form) over the input variables. As a result, the solution of a kernel machine can be represented as a weighted sum of logical propositions, from which human-readable rules can be extracted. Our framework provides a constructive and efficient way to compute Boolean kernels of different forms (e.g., disjunctive, conjunctive, DNF, CNF). We show that the proposed kernels achieve state-of-the-art performance on binary classification tasks over categorical datasets, and we provide some theoretical properties about their expressiveness. The second main contribution is the development of a new multiple kernel learning (MKL) algorithm that automatically learns the best representation, avoiding validation. We start from a theoretical result stating that, under mild conditions, any dot-product kernel can be seen as a non-negative linear combination of Boolean conjunctive kernels. Building on this result, our MKL algorithm learns, non-parametrically, the best combination of conjunctive kernels. The algorithm is designed to optimize the radius-margin ratio of the combined kernel, which has been shown to be an upper bound on the leave-one-out error. An extensive empirical evaluation on several binary classification tasks shows that our MKL technique outperforms state-of-the-art MKL approaches. The third contribution is the proposal of another kernel family for binary input data that aims to overcome the limitations of the Boolean kernels; here the focus is not exclusively on interpretability but also on expressivity. With this new framework, which we dubbed the propositional kernel framework, it is possible to build kernel functions that create feature spaces containing almost any kind of logical proposition. Finally, the last contribution is the application of the Boolean kernels to recommender systems, specifically to top-N recommendation tasks. We first propose a novel kernel-based collaborative filtering method, and then apply our Boolean kernels on top of it. Empirical results on several collaborative filtering datasets show how less expressive kernels can alleviate the sparsity issue, which is characteristic of this kind of application.
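
    For concreteness, here is a minimal sketch of one member of the Boolean kernel family described above, the degree-d monotone conjunctive kernel: on 0/1 vectors, the number of conjunctions of d distinct positive variables satisfied by both inputs is C(<x, z>, d), since any d features active in both vectors form such a conjunction. The degree, data, and classifier below are illustrative, not taken from the thesis.

    # Degree-d monotone conjunctive Boolean kernel on 0/1 data, plugged into
    # an off-the-shelf kernel machine via a precomputed Gram matrix.
    import numpy as np
    from scipy.special import comb
    from sklearn.svm import SVC

    def conjunctive_kernel(X, Z, d=2):
        """Gram matrix K[i, j] = C(<x_i, z_j>, d) on binary inputs."""
        return comb(X @ Z.T, d, exact=False)

    # Toy usage on random binary data (illustrative only).
    X = np.random.randint(0, 2, size=(20, 10)).astype(float)
    y = np.random.randint(0, 2, size=20)
    clf = SVC(kernel="precomputed").fit(conjunctive_kernel(X, X), y)
    # At test time, pass K(X_test, X_train) to clf.predict.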

    A local Vapnik-Chervonenkis complexity

    In this work we define a new localized version of a Vapnik-Chervonenkis (VC) complexity, namely the Local VC-Entropy, and, building on this new complexity, we derive a new generalization bound for binary classifiers. The Local VC-Entropy-based bound improves on Vapnik's original results because it is able to discard those functions that, most likely, will not be selected during the learning phase. The result is achieved by applying the localization principle to the original global complexity measure, in the same spirit as the Local Rademacher Complexity. By exploiting and improving a recently developed geometrical framework, we show that it is also possible to relate the Local VC-Entropy to the Local Rademacher Complexity by finding an admissible range for one given the other. In addition, the Local VC-Entropy allows one to reduce the computational requirements that arise when dealing with the Local Rademacher Complexity in binary classification problems.
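
    For context, these are the classical global quantities that such a result localizes; the abstract does not state the paper's own bound, so the bound shown is one textbook global version (constants vary across references), not the paper's result.

    % VC-entropy, annealed VC-entropy, and growth function (Jensen gives the chain):
    H_F(n) = \mathbb{E}\big[\ln N_F(x_1,\dots,x_n)\big]
           \le H_F^{\mathrm{ann}}(n) = \ln \mathbb{E}\big[N_F(x_1,\dots,x_n)\big]
           \le \ln S_F(n),
    % where N_F counts the dichotomies F realizes on the sample and S_F is the
    % growth function. One standard global bound these yield: with probability
    % at least 1 - \delta, for every f in F,
    R(f) \le \widehat{R}_n(f)
           + \sqrt{\frac{8\big(\ln S_F(2n) + \ln(4/\delta)\big)}{n}}.

    Localization, as in the paper, replaces the global quantity with one computed only over functions close to the empirical minimizer, discarding hypotheses the learner would never select.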