A Novel Sequential Coreset Method for Gradient Descent Algorithms
A wide range of optimization problems arising in machine learning can be solved by gradient descent algorithms, and a central question in this area is how to efficiently compress a large-scale dataset so as to reduce the computational complexity. The coreset is a popular data-compression technique that has been studied extensively. However, most existing coreset methods are problem-dependent and cannot be used as a general tool for a broader range of applications. A key obstacle is that they often rely on the pseudo-dimension and total sensitivity bound, which can be very high or hard to obtain. In this paper, based on the "locality" property of gradient descent algorithms, we propose a new framework, termed "sequential coreset", which effectively avoids these obstacles. Moreover, our method is particularly suitable for sparse optimization, where the coreset size can be further reduced to be only poly-logarithmically dependent on the dimension. In practice, the experimental results suggest that our method can save a large amount of running time compared with the baseline algorithms.
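
For intuition, here is a minimal sketch of the generic coreset-for-gradient-descent recipe that this paper improves upon: importance-sample a weighted subset, then run gradient descent on the weighted loss. The gradient-norm proxy used as a sensitivity score, and all names and constants, are illustrative assumptions, not the paper's sequential construction.

    import numpy as np

    def build_coreset(X, y, m, rng):
        """Importance-sample a weighted coreset of size m. The score below
        (a gradient-norm proxy at a reference point) is an illustrative
        stand-in for a true sensitivity bound."""
        n = X.shape[0]
        residual = X @ np.zeros(X.shape[1]) - y        # residuals at w = 0
        scores = np.abs(residual) * np.linalg.norm(X, axis=1) + 1e-12
        p = scores / scores.sum()                      # sampling distribution
        idx = rng.choice(n, size=m, replace=True, p=p)
        return X[idx], y[idx], 1.0 / (m * p[idx])      # unbiased weights

    def gd_on_coreset(Xc, yc, wc, lr=0.1, steps=200):
        """Plain gradient descent on the weighted least-squares loss."""
        w = np.zeros(Xc.shape[1])
        for _ in range(steps):
            w -= lr * Xc.T @ (wc * (Xc @ w - yc)) / wc.sum()
        return w

    rng = np.random.default_rng(0)
    X = rng.normal(size=(10_000, 20))
    y = X @ rng.normal(size=20) + 0.1 * rng.normal(size=10_000)
    Xc, yc, wc = build_coreset(X, y, m=500, rng=rng)
    w_hat = gd_on_coreset(Xc, yc, wc)                  # fit on 5% of the data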
Noise Models in Classification: Unified Nomenclature, Extended Taxonomy and Pragmatic Categorization
This paper presents the first review of noise models in classification covering both label and attribute noise. The review reveals the lack of a unified nomenclature in this field. In order to address this problem, a tripartite nomenclature based on the structural analysis of existing noise models is proposed. Additionally, the current taxonomies are revised, combined, and updated to better reflect the nature of each model. Finally, a categorization of noise models is proposed from a practical point of view, depending on the characteristics of the noise and the purpose of the study. These contributions provide a variety of models to introduce noise, their characteristics according to the proposed taxonomy, and a unified way of naming them, which will facilitate their identification and study, as well as the reproducibility of future research.
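
As a concrete illustration of two of the most common model families such a taxonomy covers, the sketch below injects symmetric (uniform) label noise and per-entry Gaussian attribute noise; the function names and parameters are mine, not the paper's.

    import numpy as np

    def symmetric_label_noise(y, n_classes, rate, rng):
        """Uniform (symmetric) label noise: each label flips with
        probability `rate` to a different class chosen uniformly."""
        y = y.copy()
        flip = rng.random(len(y)) < rate
        shift = rng.integers(1, n_classes, size=flip.sum())  # never the same class
        y[flip] = (y[flip] + shift) % n_classes
        return y

    def gaussian_attribute_noise(X, rate, sigma, rng):
        """Attribute noise: each entry is perturbed with probability `rate`
        by Gaussian noise of scale `sigma` relative to the feature's std."""
        X = X.copy()
        mask = rng.random(X.shape) < rate
        noise = sigma * (X.std(axis=0, keepdims=True) + 1e-12) \
                * rng.standard_normal(X.shape)
        X[mask] += noise[mask]
        return X

    rng = np.random.default_rng(0)
    y_noisy = symmetric_label_noise(rng.integers(0, 10, size=1000), 10, 0.2, rng)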
Deep Learning for Inverse Problems: Performance Characterizations, Learning Algorithms, and Applications
Deep learning models have witnessed immense empirical success over the last decade. However, in spite of their widespread adoption, a profound understanding of the generalisation behaviour of these over-parameterized architectures is still missing. In this thesis, we provide one such understanding via data-dependent characterizations of the generalisation capability of deep neural networks based on data representations. In particular, by building on the algorithmic robustness framework, we offer a generalisation error bound that encapsulates key ingredients associated with the learning problem, such as the complexity of the data space, the cardinality of the training set, and the Lipschitz properties of a deep neural network.
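
For orientation, the classical algorithmic-robustness generalisation bound (Xu and Mannor, 2012) that this line of analysis builds on has roughly the following form; the notation below is the standard one from that literature, not taken from the thesis.

    % If the algorithm A is (K, \epsilon(\cdot))-robust (the sample space can be
    % partitioned into K sets on which the loss varies by at most \epsilon(s))
    % and the loss is bounded by M, then with probability at least 1 - \delta
    % over an i.i.d. training sample s of size m:
    \left| L(A_s) - L_{\mathrm{emp}}(A_s) \right|
      \;\le\; \epsilon(s) + M \sqrt{\frac{2K \ln 2 + 2 \ln(1/\delta)}{m}}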
We then specialize our analysis to a specific class of model-based regression problems, namely inverse problems. These problems often come with well-defined forward operators that map the variables of interest to the observations. It is therefore natural to ask whether such knowledge of the forward operator can be exploited in the deep learning approaches increasingly used to solve inverse problems. We offer a generalisation error bound that, apart from the other factors, depends on the Jacobian of the composition of the forward operator with the neural network.
Motivated by our analysis, we then propose a "plug-and-play" regulariser that leverages the knowledge of the forward map to improve the generalisation of the network. We also provide a method that tightly upper bounds the norms of the Jacobians of the relevant operators and is much more computationally efficient than existing ones. We demonstrate the efficacy of our model-aware regularised deep learning algorithms against other state-of-the-art approaches on inverse problems involving various sub-sampling operators, such as those used in the classical compressed sensing setting, as well as inverse problems of interest in biomedical imaging.
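
A hedged sketch of one way such a Jacobian penalty can be implemented, using the naive randomized estimator rather than the thesis's tighter and more efficient bound: a Hutchinson-style probe of the squared Frobenius norm of the Jacobian of the forward operator composed with the network. PyTorch is assumed, and `net`, `A`, and `lam` are placeholder names.

    import torch

    def jacobian_penalty(net, A, x, n_probes=1):
        """Randomized estimate of ||J||_F^2 for J the Jacobian of A(net(.))
        at x: for v ~ N(0, I), E||v^T J||^2 equals the squared Frobenius norm."""
        x = x.clone().requires_grad_(True)
        y = A(net(x))
        pen = 0.0
        for _ in range(n_probes):
            v = torch.randn_like(y)
            # vector-Jacobian product v^T J via reverse-mode autograd
            (vjp,) = torch.autograd.grad(y, x, grad_outputs=v, create_graph=True)
            pen = pen + vjp.pow(2).sum()
        return pen / n_probes

    # Usage inside a training step (lam a regularisation weight):
    #   loss = data_fit(net(x), target) + lam * jacobian_penalty(net, A, x)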
Structured Tensor Recovery and Decomposition
Tensors, a.k.a. multi-dimensional arrays, arise naturally when modeling higher-order objects and relations. Across ubiquitous applications, including image processing, collaborative filtering, demand forecasting, and higher-order statistics, two themes recur: tensor recovery and tensor decomposition. The first aims to recover the underlying tensor from incomplete information; the second studies a variety of tensor decompositions that represent the array more concisely and, moreover, capture the salient characteristics of the underlying data. Both topics are addressed in this thesis.
Chapter 2 and Chapter 3 focus on low-rank tensor recovery (LRTR) from both theoretical and algorithmic perspectives. In Chapter 2, we first provide a negative result for the sum of nuclear norms (SNN) model, an existing convex model widely used for LRTR; we then propose a novel convex model and prove that the new model is better than the SNN model in terms of the number of measurements required to recover the underlying low-rank tensor. In Chapter 3, we first build up the connection between robust low-rank tensor recovery and compressive principal component pursuit (CPCP), a convex model for robust low-rank matrix recovery. We then focus on developing convergent and scalable optimization methods to solve the CPCP problem. Specifically, our convergent method, obtained by combining classical ideas from Frank-Wolfe and proximal methods, achieves scalability with linear per-iteration cost.
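
For reference, the SNN model discussed here is commonly written as follows, with X_{(i)} the mode-i unfolding of a K-way tensor and \mathcal{A} the linear measurement map; this is the standard formulation from the literature, not copied from the thesis.

    % Sum-of-nuclear-norms (SNN) model for low-rank tensor recovery:
    \min_{\mathcal{X}} \; \sum_{i=1}^{K} \lambda_i \left\| X_{(i)} \right\|_{*}
    \quad \text{subject to} \quad \mathcal{A}(\mathcal{X}) = b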
Chapter 4 generalizes the successive rank-one approximation (SROA) scheme for matrix eigen-decomposition to a special class of tensors called symmetric and orthogonally decomposable (SOD) tensors. We prove that the SROA scheme can robustly recover the symmetric canonical decomposition of the underlying SOD tensor even in the presence of noise. Perturbation bounds, which can be regarded as a higher-order generalization of the Davis-Kahan theorem, are provided in terms of the noise magnitude.
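
To make the SROA scheme concrete, here is a minimal numpy sketch for a symmetric 3-way tensor: tensor power iteration finds one rank-one component, which is then deflated, and the process repeats. This is the textbook version of the scheme under ideal (noiseless, exactly SOD) assumptions, not the robust analysis of the chapter.

    import numpy as np

    def sroa(T, rank, iters=100, rng=None):
        """Successive rank-one approximation of a symmetric, orthogonally
        decomposable 3-tensor via power iteration plus deflation."""
        if rng is None:
            rng = np.random.default_rng(0)
        T = T.copy()
        lams, us = [], []
        for _ in range(rank):
            u = rng.standard_normal(T.shape[0])
            u /= np.linalg.norm(u)
            for _ in range(iters):                     # tensor power iteration
                u = np.einsum('ijk,j,k->i', T, u, u)   # u <- T(I, u, u)
                u /= np.linalg.norm(u)
            lam = np.einsum('ijk,i,j,k->', T, u, u, u)
            lams.append(lam)
            us.append(u)
            T = T - lam * np.einsum('i,j,k->ijk', u, u, u)   # deflate
        return np.array(lams), np.array(us)

    # Build an SOD tensor from orthonormal vectors and recover its components.
    Q, _ = np.linalg.qr(np.random.default_rng(1).standard_normal((5, 3)))
    T = sum(w * np.einsum('i,j,k->ijk', q, q, q)
            for w, q in zip([3.0, 2.0, 1.0], Q.T))
    lams, us = sroa(T, rank=3)                         # lams ~ {3, 2, 1}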
Learning with Structured Sparsity: From Discrete to Convex and Back.
In modern data-analysis applications, the abundance of data makes extracting meaningful information from it challenging in terms of computation, storage, and interpretability. In this setting, exploiting sparsity in data has been essential to the development of scalable methods for problems in machine learning, statistics, and signal processing. However, in various applications the input variables exhibit structure beyond simple sparsity. This motivated the introduction of structured sparsity models, which capture such sophisticated structures, leading to significant performance gains and better interpretability. Structured sparse approaches have been successfully applied in a variety of domains including computer vision, text processing, medical imaging, and bioinformatics.

The goal of this thesis is to improve on these methods and expand their success to a wider range of applications. We thus develop novel methods to incorporate general structure a priori in learning problems, balancing computational and statistical efficiency trade-offs. To achieve this, our results bring together tools from the rich areas of discrete and convex optimization. Applying structured sparsity approaches in general is challenging because the structures encountered in practice are naturally combinatorial. An effective approach to circumvent this computational challenge is to employ continuous convex relaxations. We thus start by introducing a new class of structured sparsity models, able to capture a large range of structures, which admit tight convex relaxations amenable to efficient optimization. We then present an in-depth study of the geometric and statistical properties of convex relaxations of general combinatorial structures. In particular, we characterize which structure is lost by imposing convexity and which is preserved.

We then focus on the optimization of the convex composite problems that result from the convex relaxations of structured sparsity models. We develop efficient algorithmic tools to solve these problems in a non-Euclidean setting, leading to faster convergence in some cases. Finally, to handle structures that do not admit meaningful convex relaxations, we propose to use, as a heuristic, a non-convex proximal gradient method that is efficient for several classes of structured sparsity models. We further extend this method to address a probabilistic structured sparsity model, which we introduce to model approximately sparse signals.
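
As a concrete instance of the convex-composite setting discussed above, the sketch below runs proximal gradient descent on a least-squares loss with a group-lasso penalty, whose proximal operator is block soft-thresholding. It illustrates the generic Euclidean method, not the thesis's non-Euclidean algorithms, and all names are mine.

    import numpy as np

    def prox_group_l2(w, groups, tau):
        """Block soft-thresholding: prox of tau * sum_g ||w_g||_2,
        a basic structured-sparsity (group lasso) regulariser."""
        out = w.copy()
        for g in groups:
            nrm = np.linalg.norm(w[g])
            out[g] = 0.0 if nrm <= tau else (1.0 - tau / nrm) * w[g]
        return out

    def proximal_gradient(X, y, groups, lam=0.1, steps=300):
        """Proximal gradient on 0.5 * ||Xw - y||^2 + lam * group-lasso."""
        lr = 1.0 / np.linalg.norm(X, 2) ** 2           # 1/L, L = ||X||_2^2
        w = np.zeros(X.shape[1])
        for _ in range(steps):
            w = prox_group_l2(w - lr * X.T @ (X @ w - y), groups, lam * lr)
        return w

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 20))
    groups = [np.arange(g, g + 5) for g in range(0, 20, 5)]  # 4 blocks of 5
    w_true = np.zeros(20)
    w_true[:5] = rng.normal(size=5)                    # one active block
    y = X @ w_true + 0.01 * rng.normal(size=200)
    w_hat = proximal_gradient(X, y, groups, lam=1.0)   # inactive blocks -> 0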
LIPIcs, Volume 244, ESA 2022, Complete Volume
Visual representation learning with deep neural networks under label and budget constraints
This thesis presents work in the areas of semi-supervised learning, label noise, and budgeted training for deep learning approaches to computer vision. The improvements seen in computer vision since the successful introduction of deep learning rely on the availability of large amounts of labeled data and long-lasting training processes. First, this research studies the three main alternatives to fully supervised deep learning, categorized by level of supervision: unsupervised learning (no labels involved), semi-supervised learning (a small set of labeled data is available), and label noise (all the samples are labeled, but some labels are incorrect). These alternatives aim to reduce the cost of building fully annotated and finely curated datasets, which in most cases is time-consuming and requires expert annotators. State-of-the-art performance has been achieved on several semi-supervised, unsupervised, and label noise benchmarks, including CIFAR10, CIFAR100, and STL-10. Additionally, the solutions proposed for learning in the presence of label noise have been validated on realistic benchmarks built from datasets annotated from web information: WebVision and Clothing1M.

Second, this research explores alternatives to reduce the computational cost of training deep learning systems, which currently require hours or days to reach state-of-the-art performance. In particular, this research studies budgeted training, i.e., when the training process is limited to a fixed number of iterations. Experiments in this setup showed that, for better model convergence, variety in the data is preferable to the importance of the samples used during training. As a result of this research, three main-author publications have been generated, one more has recently been submitted for review at a conference, and several other secondary-author publications have been produced in close collaboration with other researchers in the centre.
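
A minimal sketch of the budgeted-training setup studied here: the number of optimisation steps is fixed in advance and the learning-rate schedule is tied to that budget. The model (logistic regression), the linear decay, and all hyperparameters are illustrative assumptions, not the thesis's configuration.

    import numpy as np

    def train_budgeted(X, y, budget, lr0=0.5, batch=64, seed=0):
        """Exactly `budget` SGD steps on a logistic loss, with the learning
        rate annealed linearly to zero over the budget."""
        rng = np.random.default_rng(seed)
        w = np.zeros(X.shape[1])
        for t in range(budget):
            lr = lr0 * (1.0 - t / budget)              # budget-aware decay
            idx = rng.choice(len(X), size=batch, replace=False)
            p = 1.0 / (1.0 + np.exp(-X[idx] @ w))      # sigmoid predictions
            w -= lr * X[idx].T @ (p - y[idx]) / batch  # logistic gradient
        return w

    rng = np.random.default_rng(1)
    X = rng.normal(size=(5000, 20))
    y = (X @ rng.normal(size=20) > 0).astype(float)
    w = train_budgeted(X, y, budget=1000)              # fixed iteration budget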