56 research outputs found

    Convolutional Dictionary Learning: Acceleration and Convergence

    Convolutional dictionary learning (CDL or sparsifying CDL) has many applications in image processing and computer vision. There has been growing interest in developing efficient algorithms for CDL, mostly relying on the augmented Lagrangian (AL) method or its variant, the alternating direction method of multipliers (ADMM). When their parameters are properly tuned, AL methods have shown fast convergence in CDL. However, the parameter tuning process is nontrivial due to its data dependence and, in practice, the convergence of AL methods depends on the AL parameters for nonconvex CDL problems. To mitigate these problems, this paper proposes a new practically feasible and convergent Block Proximal Gradient method using a Majorizer (BPG-M) for CDL. The BPG-M-based CDL is investigated with different block updating schemes and majorization matrix designs, and is further accelerated by incorporating momentum coefficient formulas and restarting techniques. All of the investigated methods incorporate a boundary artifact removal (or, more generally, sampling) operator in the learning model. Numerical experiments show that, without any parameter tuning, the proposed BPG-M approach converges more stably to desirable solutions of lower objective value than the existing state-of-the-art ADMM algorithm and its memory-efficient variant do. Compared to the ADMM approaches, the BPG-M method with a multi-block updating scheme is particularly useful for single-threaded CDL on large datasets, due to its lower memory requirement and the absence of polynomial computational complexity. Image denoising experiments show that, for relatively strong additive white Gaussian noise, the filters learned by BPG-M-based CDL outperform those trained by the ADMM approach.
    Comment: 21 pages, 7 figures, submitted to IEEE Transactions on Image Processing
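
    The core of BPG-M is a proximal gradient step whose curvature comes from a majorization matrix rather than a hand-tuned step size. The sketch below is a minimal illustration of that idea (not the paper's code): it updates the sparse-code block for a dense dictionary D with a diagonal majorizer, whereas the paper works with convolutional operators and a boundary-handling/sampling operator. The names D, x, z, and lam are assumptions for illustration only.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def bpg_m_code_step(D, x, z, lam):
    """One majorized proximal gradient step on 0.5*||x - D z||^2 + lam*||z||_1.

    The diagonal majorizer M with entries sum_j |D^T D|_ij satisfies M >= D^T D
    (by diagonal dominance), so no step-size tuning is required."""
    M = np.abs(D.T @ D) @ np.ones(D.shape[1])     # diagonal majorizer entries
    grad = D.T @ (D @ z - x)                      # gradient of the data-fit term
    return soft_threshold(z - grad / M, lam / M)  # prox step scaled by the majorizer

# Tiny usage example with random data.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
x = rng.standard_normal(64)
z = np.zeros(128)
for _ in range(50):
    z = bpg_m_code_step(D, x, z, lam=0.1)
```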

    Apprentissage à grande échelle et applications (Large-scale learning and applications)

    This thesis presents my main research activities in statistical machine learning after my PhD, from my post-doc at UC Berkeley to my present research position at Inria Grenoble. The first chapter introduces the context, summarizes my scientific contributions, and emphasizes the importance of pluri-disciplinary research. For instance, mathematical optimization has become central in machine learning, and the interplay between signal processing, statistics, bioinformatics, and computer vision is stronger than ever. With many scientific and industrial fields producing massive amounts of data, the impact of machine learning is potentially huge and diverse. However, dealing with massive data also raises many challenges. In this context, the manuscript presents different contributions, organized in three main topics.

    Chapter 2 is devoted to large-scale optimization in machine learning with a focus on algorithmic methods. We start with majorization-minimization algorithms for structured problems, including block-coordinate, incremental, and stochastic variants. These algorithms are analyzed in terms of convergence rates for convex problems and in terms of convergence to stationary points for non-convex ones. We also introduce fast schemes for minimizing large sums of convex functions and principles to accelerate gradient-based approaches, based on Nesterov's acceleration and on quasi-Newton approaches.

    Chapter 3 presents the paradigm of deep kernel machines, an alliance between kernel methods and multilayer neural networks. In the context of visual recognition, we introduce a new invariant image model called convolutional kernel networks, a new type of convolutional neural network with a reproducing kernel interpretation. The network comes with simple and effective principles for unsupervised learning and is compatible with supervised learning via backpropagation rules.

    Chapter 4 is devoted to sparse estimation, that is, the automatic selection of model variables for explaining observed data; in particular, this chapter presents the results of pluri-disciplinary collaborations in bioinformatics and neuroscience, where the sparsity principle is key to building interpretable predictive models.

    Finally, the last chapter concludes the manuscript and suggests future perspectives.
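
    The majorization-minimization principle of Chapter 2 replaces the objective at each iteration with a simpler surrogate that upper-bounds it and is tight at the current iterate. The sketch below is an assumed illustration, not taken from the manuscript: for logistic regression, the bound on the logistic Hessian yields a fixed quadratic surrogate whose minimizer gives the next iterate. The names X, y, and w are illustrative.

```python
import numpy as np

def logistic_loss_grad(w, X, y):
    """Average logistic loss and its gradient for labels y in {-1, +1}."""
    margins = y * (X @ w)
    loss = np.mean(np.log1p(np.exp(-margins)))
    grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / X.shape[0]
    return loss, grad

def mm_logistic(X, y, n_iters=100):
    """Majorization-minimization: at each step, minimize the quadratic surrogate
    f(w_k) + <grad f(w_k), w - w_k> + (L/2)||w - w_k||^2, which upper-bounds the
    logistic loss because its Hessian is bounded by X^T X / (4 n)."""
    L = np.linalg.norm(X, 2) ** 2 / (4.0 * X.shape[0])  # surrogate curvature
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        _, grad = logistic_loss_grad(w, X, y)
        w = w - grad / L  # exact minimizer of the surrogate at w
    return w

# Tiny usage example with synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = np.sign(X @ rng.standard_normal(10))
w_hat = mm_logistic(X, y)
```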

    Non-convex regularization in remote sensing

    In this paper, we study the effect of different regularizers and their implications in high-dimensional image classification and sparse linear unmixing. Although kernelization and sparse methods are widely accepted solutions for processing high-dimensional data, we present here a study of the impact of the form of regularization used and of its parametrization. We consider regularization via the traditional squared ℓ2 and sparsity-promoting ℓ1 norms, as well as more unconventional nonconvex regularizers (ℓp and the Log Sum Penalty). We compare their properties and advantages on several classification and linear unmixing tasks and provide advice on the choice of the best regularizer for the problem at hand. Finally, we also provide a fully functional toolbox for the community.
    Comment: 11 pages, 11 figures
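
    To make the comparison between regularizers concrete, the sketch below (an illustrative assumption, not the paper's toolbox) implements the shrinkage/proximal operators associated with the squared ℓ2 norm, the ℓ1 norm, and a reweighted-ℓ1 step for the Log Sum Penalty; the ℓp case is omitted because its proximal operator has no simple closed form for general p.

```python
import numpy as np

def prox_l2_squared(v, lam):
    """Prox of (lam/2) * ||w||_2^2: uniform shrinkage toward zero, no exact zeros."""
    return v / (1.0 + lam)

def prox_l1(v, lam):
    """Prox of lam * ||w||_1: soft-thresholding, produces exact zeros."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def reweighted_l1_step(v, lam, eps=1e-3):
    """One majorize-minimize (reweighted-l1) step for the log-sum penalty
    lam * sum(log(1 + |w_i| / eps)): small coefficients receive larger
    thresholds, so sparsity is promoted more aggressively than with plain l1."""
    weights = lam / (np.abs(v) + eps)
    return np.sign(v) * np.maximum(np.abs(v) - weights, 0.0)

v = np.array([-2.0, -0.5, 0.05, 0.8, 3.0])
print(prox_l2_squared(v, 0.5))       # every entry shrunk by the same factor
print(prox_l1(v, 0.5))               # small entries set exactly to zero
print(reweighted_l1_step(v, 0.05))   # small entries killed, large ones barely touched
```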

    Deep Structured Layers for Instance-Level Optimization in 2D and 3D Vision

    The approach we present in this thesis is that of integrating optimization problems as layers in deep neural networks. Optimization-based modeling provides an additional set of tools enabling the design of powerful neural networks for a wide range of computer vision tasks. This thesis shows formulations and experiments for vision tasks ranging from image reconstruction to 3D reconstruction. We first propose an unrolled optimization method with implicit regularization properties for reconstructing images from noisy camera readings. The method resembles an unrolled majorization-minimization framework with convolutional neural networks acting as regularizers. We report state-of-the-art performance in image reconstruction on both noisy and noise-free evaluation setups across many datasets. We then focus on the task of monocular 3D reconstruction of articulated objects using video self-supervision. The proposed method uses a structured layer for accurate object deformation that controls a 3D surface by displacing a small number of learnable handles. While relying on a small set of training data per category for self-supervision, the method obtains state-of-the-art reconstruction accuracy with diverse shapes and viewpoints for multiple articulated objects. We finally address the shortcomings of the previous method, which revolve around regressing the camera pose from multiple hypotheses. We propose a method that recovers a 3D shape from a 2D image by relying solely on 3D-2D correspondences regressed by a convolutional neural network. These correspondences are used in conjunction with an optimization problem to estimate the camera pose and deformation for each sample. We quantitatively show the effectiveness of the proposed method for self-supervised 3D reconstruction on multiple categories, without the need for multiple hypotheses.
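
    The recurring idea of integrating an optimization problem as a network layer can be illustrated with a short unrolled-reconstruction module. The sketch below is a conceptual assumption, not the thesis implementation: a fixed number of gradient steps on a data-fidelity term, each followed by a small learned convolutional refinement. The operator callables A and At, the step size, and the denoiser architecture are all illustrative.

```python
import torch
import torch.nn as nn

class UnrolledReconstruction(nn.Module):
    """Recover x from measurements y = A(x) + noise by unrolling a fixed number
    of steps:  x <- x - step * At(A(x) - y), followed by a learned refinement."""

    def __init__(self, n_iters=5, channels=1):
        super().__init__()
        self.n_iters = n_iters
        self.step = nn.Parameter(torch.tensor(0.5))   # learnable step size
        self.denoiser = nn.Sequential(                # tiny learned regularizer
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1),
        )

    def forward(self, y, A, At):
        """A and At are callables for the forward operator and its adjoint."""
        x = At(y)                                     # simple initialization
        for _ in range(self.n_iters):
            x = x - self.step * At(A(x) - y)          # gradient step on 0.5*||A(x) - y||^2
            x = x + self.denoiser(x)                  # learned residual refinement
        return x

# Usage with identity operators (pure denoising) on a random image batch.
model = UnrolledReconstruction()
y = torch.randn(2, 1, 32, 32)
x_hat = model(y, A=lambda v: v, At=lambda v: v)
```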
