
30 research outputs found

Nonlinear stepsize control, trust regions and regularizations for unconstrained optimization

Author: Toint Philippe
Publication venue: 'Informa UK Limited'
Publication date: 01/02/2013

On an adaptive regularization for ill-posed nonlinear systems and its trust-region implementation

Authors: Bellavia Stefania, Morini Benedetta, Riccietti Elisa
Publication date: 01/01/2015

In this paper we address the stable numerical solution of nonlinear ill-posed systems by a trust-region method. We show that an appropriate choice of the trust-region radius gives rise to a procedure that has the potential to approach a solution of the unperturbed system. This regularizing property is shown theoretically and validated numerically. Comment: arXiv admin note: text overlap with arXiv:1410.278

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot

A Survey of Geometric Optimization for Deep Learning: From Euclidean Space to Riemannian Manifold

Authors: Chen Mingsong, Fei Yanhong, Li Zhengyu, Liu Yingjie, Wei Xian
Publication date: 16/02/2023

Although Deep Learning (DL) has achieved success in complex Artificial Intelligence (AI) tasks, it suffers from various notorious problems (e.g., feature redundancy, and vanishing or exploding gradients), since updating parameters in Euclidean space cannot fully exploit the geometric structure of the solution space. As a promising alternative solution, Riemannian-based DL uses geometric optimization to update parameters on Riemannian manifolds and can leverage the underlying geometric information. Accordingly, this article presents a comprehensive survey of applying geometric optimization in DL. First, this article introduces the basic procedure of geometric optimization, including various geometric optimizers and some concepts of Riemannian manifolds. Subsequently, this article investigates the application of geometric optimization in different DL networks in various AI tasks, e.g., convolutional neural networks, recurrent neural networks, transfer learning, and optimal transport. Additionally, typical public toolboxes that implement optimization on manifolds are also discussed. Finally, this article makes a performance comparison between different deep geometric optimization methods under image recognition scenarios. Comment: 41 page

arXiv.org e-Print Archive

DC Proximal Newton for Non-Convex Optimization Problems

Authors: Flamary Remi, Gasso Gilles, Rakotomamonjy Alain
Publication date: 01/01/2015

We introduce a novel algorithm for solving learning problems where both the loss function and the regularizer are non-convex but belong to the class of difference of convex (DC) functions. Our contribution is a new general-purpose proximal Newton algorithm that is able to deal with such a situation. The algorithm consists of obtaining a descent direction from an approximation of the loss function and then performing a line search to ensure sufficient descent. A theoretical analysis is provided showing that the iterates of the proposed algorithm admit as limit points stationary points of the DC objective function. Numerical experiments show that our approach is more efficient than the current state of the art for a problem with a convex loss function and a non-convex regularizer. We have also illustrated the benefit of our algorithm in a high-dimensional transductive learning problem where both the loss function and the regularizer are non-convex.
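
The "descent direction, then line search for sufficient descent" pattern mentioned in this abstract can be illustrated with a generic backtracking (Armijo) line search. This is a minimal sketch only, not the paper's DC proximal Newton method: the quadratic `objective`, the use of the negative gradient as a stand-in descent direction, and the parameters `t0`, `shrink`, and `c` are all assumptions chosen for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def backtracking_line_search(f, x, grad, direction,
                             t0=1.0, shrink=0.5, c=1e-4, max_steps=50):
    """Shrink the step t until the Armijo sufficient-decrease condition
    f(x + t*d) <= f(x) + c * t * <grad, d> holds."""
    fx = f(x)
    slope = dot(grad, direction)
    if slope >= 0:
        raise ValueError("direction is not a descent direction")
    t = t0
    for _ in range(max_steps):
        trial = [xi + t * di for xi, di in zip(x, direction)]
        if f(trial) <= fx + c * t * slope:
            return t, trial
        t *= shrink
    raise RuntimeError("no acceptable step found")


# Hypothetical smooth test problem (not taken from the paper).
def objective(x):
    return x[0] ** 2 + 10.0 * x[1] ** 2


def gradient(x):
    return [2.0 * x[0], 20.0 * x[1]]


x = [1.0, 1.0]
g = gradient(x)
d = [-gi for gi in g]  # steepest descent as a stand-in direction
step, x_new = backtracking_line_search(objective, x, g, d)
print(step, objective(x), objective(x_new))
```

In a proximal Newton scheme the direction would instead come from solving a regularized quadratic subproblem; only the line-search step is sketched here.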

arXiv.org e-Print Archive

HAL - Normandie Université

Learning with Single View Co-training and Marginalized Dropout

Author: Chen Minmin
Publication venue: Washington University Open Scholarship
Publication date: 19/03/2013

The generalization properties of most existing machine learning techniques are predicated on the assumptions that 1) a sufficiently large quantity of training data is available; 2) the training and testing data come from some common distribution. Although these assumptions are often met in practice, there are also many scenarios in which training data from the relevant distribution is insufficient. We focus on making use of additional data, which is readily available or can be obtained easily but comes from a different distribution than the testing data, to aid learning. We present five learning scenarios, depending on how the distribution we used to sample the additional training data differs from the testing distribution: 1) learning with weak supervision; 2) domain adaptation; 3) learning from multiple domains; 4) learning from corrupted data; 5) learning with partial supervision. We introduce two strategies and manifest them in five ways to cope with the difference between the training and testing distribution. The first strategy, which gives rise to Pseudo Multi-view Co-training (PMC) and Co-training for Domain Adaptation (CODA), is inspired by the co-training algorithm for multi-view data. PMC generalizes co-training to the more common single-view data and allows us to learn from weakly labeled data retrieved free from the web. CODA integrates PMC with another feature selection component to address the feature incompatibility between domains for domain adaptation. PMC and CODA are evaluated on a variety of real datasets, and both yield record performance. The second strategy, marginalized dropout, leads to marginalized Stacked Denoising Autoencoders (mSDA), Marginalized Corrupted Features (MCF) and FastTag. mSDA diminishes the difference between distributions associated with different domains by learning a new representation through marginalized corruption and reconstruction. MCF learns from a known distribution which is created by corrupting a small set of training data, and improves robustness of learned classifiers by training on "infinitely" many data sampled from the distribution. FastTag applies marginalized dropout to the output of partially labeled data to recover missing labels for multi-label tasks. These three algorithms not only achieve state-of-the-art performance in various tasks, but also deliver orders of magnitude speed-up at training and testing compared to competing algorithms.

Washington University St. Louis: Open Scholarship

A flexible framework for cubic regularization algorithms for non-convex optimization in function space

Author: Schiela Anton
Publication date: 03/11/2017

EPub Bayreuth

Efficient optimization methods for regularized learning: support vector machines and total-variation regularization

Author: Barbero Jiménez Álvaro
Publication date: 01/01/2011

Unpublished doctoral thesis. Universidad Autónoma de Madrid, Escuela Politécnica Superior, May 201

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo