14 research outputs found

    A Novel Sequential Coreset Method for Gradient Descent Algorithms

    Full text link
    A wide range of optimization problems arising in machine learning can be solved by gradient descent algorithms, and a central question in this area is how to efficiently compress a large-scale dataset so as to reduce the computational complexity. {\em Coreset} is a popular data compression technique that has been extensively studied before. However, most of existing coreset methods are problem-dependent and cannot be used as a general tool for a broader range of applications. A key obstacle is that they often rely on the pseudo-dimension and total sensitivity bound that can be very high or hard to obtain. In this paper, based on the ''locality'' property of gradient descent algorithms, we propose a new framework, termed ''sequential coreset'', which effectively avoids these obstacles. Moreover, our method is particularly suitable for sparse optimization whence the coreset size can be further reduced to be only poly-logarithmically dependent on the dimension. In practice, the experimental results suggest that our method can save a large amount of running time compared with the baseline algorithms

    Noise Models in Classification: Unified Nomenclature, Extended Taxonomy and Pragmatic Categorization

    Get PDF
    This paper presents the first review of noise models in classification covering both label and attribute noise. Their study reveals the lack of a unified nomenclature in this field. In order to address this problem, a tripartite nomenclature based on the structural analysis of existing noise models is proposed. Additionally, a revision of their current taxonomies is carried out, which are combined and updated to better reflect the nature of any model. Finally, a categorization of noise models is proposed from a practical point of view depending on the characteristics of noise and the study purpose. These contributions provide a variety of models to introduce noise, their characteristics according to the proposed taxonomy and a unified way of naming them, which will facilitate their identification and study, as well as the reproducibility of future research

    Deep Learning for Inverse Problems: Performance Characterizations, Learning Algorithms, and Applications

    Get PDF
    Deep learning models have witnessed immense empirical success over the last decade. However, in spite of their widespread adoption, a profound understanding of the generalization behaviour of these over-parameterized architectures is still missing. In this thesis, we provide one such way via a data-dependent characterizations of the generalization capability of deep neural networks based data representations. In particular, by building on the algorithmic robustness framework, we offer a generalisation error bound that encapsulates key ingredients associated with the learning problem such as the complexity of the data space, the cardinality of the training set, and the Lipschitz properties of a deep neural network. We then specialize our analysis to a specific class of model based regression problems, namely the inverse problems. These problems often come with well defined forward operators that map variables of interest to the observations. It is therefore natural to ask whether such knowledge of the forward operator can be exploited in deep learning approaches increasingly used to solve inverse problems. We offer a generalisation error bound that -- apart from the other factors -- depends on the Jacobian of the composition of the forward operator with the neural network. Motivated by our analysis, we then propose a `plug-and-play' regulariser that leverages the knowledge of the forward map to improve the generalization of the network. We likewise also provide a method allowing us to tightly upper bound the norms of the Jacobians of the relevant operators that is much more {computationally} efficient than existing ones. We demonstrate the efficacy of our model-aware regularised deep learning algorithms against other state-of-the-art approaches on inverse problems involving various sub-sampling operators such as those used in classical compressed sensing setup and inverse problems that are of interest in the biomedical imaging setup

    Learning with Structured Sparsity: From Discrete to Convex and Back.

    Get PDF
    In modern-data analysis applications, the abundance of data makes extracting meaningful information from it challenging, in terms of computation, storage, and interpretability. In this setting, exploiting sparsity in data has been essential to the development of scalable methods to problems in machine learning, statistics and signal processing. However, in various applications, the input variables exhibit structure beyond simple sparsity. This motivated the introduction of structured sparsity models, which capture such sophisticated structures, leading to a significant performance gains and better interpretability. Structured sparse approaches have been successfully applied in a variety of domains including computer vision, text processing, medical imaging, and bioinformatics. The goal of this thesis is to improve on these methods and expand their success to a wider range of applications. We thus develop novel methods to incorporate general structure a priori in learning problems, which balance computational and statistical efficiency trade-offs. To achieve this, our results bring together tools from the rich areas of discrete and convex optimization. Applying structured sparsity approaches in general is challenging because structures encountered in practice are naturally combinatorial. An effective approach to circumvent this computational challenge is to employ continuous convex relaxations. We thus start by introducing a new class of structured sparsity models, able to capture a large range of structures, which admit tight convex relaxations amenable to efficient optimization. We then present an in-depth study of the geometric and statistical properties of convex relaxations of general combinatorial structures. In particular, we characterize which structure is lost by imposing convexity and which is preserved. We then focus on the optimization of the convex composite problems that result from the convex relaxations of structured sparsity models. We develop efficient algorithmic tools to solve these problems in a non-Euclidean setting, leading to faster convergence in some cases. Finally, to handle structures that do not admit meaningful convex relaxations, we propose to use, as a heuristic, a non-convex proximal gradient method, efficient for several classes of structured sparsity models. We further extend this method to address a probabilistic structured sparsity model, we introduce to model approximately sparse signals

    LIPIcs, Volume 244, ESA 2022, Complete Volume

    Get PDF
    LIPIcs, Volume 244, ESA 2022, Complete Volum

    Visual representation learning with deep neural networks under label and budget constraints

    Get PDF
    This thesis presents the work done in the area of semi-supervised learning, label noise, and budgeted training for deep learning approaches to computer vision. The improvements seen in computer vision since the successful introduction of deep learning rely on the availability of large amounts of labeled data and long lasting training processes. First, this research studies the three main alternatives to fully supervised deep learning categorized in three different levels of supervision: unsupervised learning (no label involved), semi-supervised learning (a small set of labeled data is available), and label noise (all the samples are labeled but some of them are incorrect). These alternatives aim at reducing the cost of building fully annotated and finely curated datasets, which in most cases is time consuming and requires expert annotators. State-of-the-art performance has been achieved in several semi-supervised, unsupervised, and label noise benchmarks including CIFAR10, CIFAR100, and STL-10. Additionally, the solutions proposed for learning in the presence of label noise have been validated in realistic benchmarks built with datasets annotated from web information: WebVision and Clothing1M. Second, this research explores alternatives to reduce the computational cost of the training of deep learning systems that currently require hours or days to reach state-of-the-art performance. Particularly, this research studied budgeted training, i.e.~when the training process is limited to a fixed number of iterations. Experiments in this setup showed that for better model convergence, variety in the data is preferable than the importance of the samples used during training. As a result of this research, three main author publications have been generated, one more has been recently submitted to review for a conference, and several other secondary author publications have been produced in close collaboration with other researchers in the centre

    Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference

    Get PDF
    corecore