Convolutional Dictionary Learning: Acceleration and Convergence
Convolutional dictionary learning (CDL or sparsifying CDL) has many
applications in image processing and computer vision. There has been growing
interest in developing efficient algorithms for CDL, mostly relying on the
augmented Lagrangian (AL) method or the variant alternating direction method of
multipliers (ADMM). When their parameters are properly tuned, AL methods have
shown fast convergence in CDL. However, the parameter tuning process is not
trivial due to its data dependence and, in practice, the convergence of AL
methods depends on the AL parameters for nonconvex CDL problems. To moderate
these problems, this paper proposes a new practically feasible and convergent
Block Proximal Gradient method using a Majorizer (BPG-M) for CDL. The
BPG-M-based CDL is investigated with different block updating schemes and
majorization matrix designs, and further accelerated by incorporating some
momentum coefficient formulas and restarting techniques. All of the methods
investigated incorporate a boundary artifacts removal (or, more generally,
sampling) operator in the learning model. Numerical experiments show that,
without needing any parameter tuning process, the proposed BPG-M approach
converges more stably to desirable solutions of lower objective values than the
existing state-of-the-art ADMM algorithm and its memory-efficient variant do.
Compared to the ADMM approaches, the BPG-M method using a multi-block updating
scheme is particularly useful in a single-threaded CDL algorithm handling large
datasets, due to its lower memory requirement and no polynomial computational
complexity. Image denoising experiments show that, for relatively strong
additive white Gaussian noise, the filters learned by BPG-M-based CDL
outperform those trained by the ADMM approach. Comment: 21 pages, 7 figures, submitted to IEEE Transactions on Image Processing
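As a concrete illustration of the proximal-gradient-with-majorizer idea, here is a minimal numpy sketch on a toy single-block sparse-coding subproblem; the function names, the diagonal majorizer design, and the simplified setup are illustrative assumptions, not the paper's full multi-block CDL algorithm:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (elementwise shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def bpgm_sketch(A, y, lam, n_iter=100):
    """Toy proximal gradient step with a diagonal majorizer for
    min_x 0.5*||A x - y||^2 + lam*||x||_1.
    The majorizer M = diag(|A^T A| 1) satisfies M >= A^T A (PSD sense)
    by diagonal dominance, so no step-size tuning is needed."""
    M = np.abs(A.T @ A).sum(axis=1)       # diagonal majorizer entries
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)          # gradient of the data term
        x = soft_threshold(x - grad / M, lam / M)
    return x
```

Because the majorizer is diagonal, each update reduces to an elementwise soft-thresholding, which is the mechanism that lets BPG-M avoid the data-dependent parameter tuning that AL/ADMM methods require.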
Apprentissage à grande échelle et applications (Large-Scale Learning and Applications)
This thesis presents my main research activities in statistical machine learning after my PhD, starting from my post-doc at UC Berkeley to my present research position at Inria Grenoble. The first chapter introduces the context, summarizes my scientific contributions, and emphasizes the importance of pluri-disciplinary research. For instance, mathematical optimization has become central in machine learning, and the interplay between signal processing, statistics, bioinformatics, and computer vision is stronger than ever. With many scientific and industrial fields producing massive amounts of data, the impact of machine learning is potentially huge and diverse. However, dealing with massive data also raises many challenges. In this context, the manuscript presents different contributions, organized in three main topics.

Chapter 2 is devoted to large-scale optimization in machine learning with a focus on algorithmic methods. We start with majorization-minimization algorithms for structured problems, including block-coordinate, incremental, and stochastic variants. These algorithms are analyzed in terms of convergence rates for convex problems and in terms of convergence to stationary points for non-convex ones. We also introduce fast schemes for minimizing large sums of convex functions, as well as principles for accelerating gradient-based approaches, based on Nesterov's acceleration and on quasi-Newton approaches.

Chapter 3 presents the paradigm of deep kernel machines, an alliance between kernel methods and multilayer neural networks. In the context of visual recognition, we introduce a new invariant image model called convolutional kernel networks, a new type of convolutional neural network with a reproducing-kernel interpretation. The network comes with simple and effective principles for unsupervised learning and is compatible with supervised learning via backpropagation rules.

Chapter 4 is devoted to sparse estimation, that is, the automatic selection of model variables for explaining observed data; in particular, this chapter presents the results of pluri-disciplinary collaborations in bioinformatics and neuroscience, where the sparsity principle is key to building interpretable predictive models.

Finally, the last chapter concludes the manuscript and suggests future perspectives.
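The acceleration principle mentioned for chapter 2 can be sketched in a few lines of numpy; `nesterov_gd` and its interface are illustrative assumptions, not the manuscript's actual schemes:

```python
import numpy as np

def nesterov_gd(grad, x0, L, n_iter=100):
    """Nesterov's accelerated gradient method for an L-smooth convex f.
    grad: gradient oracle of f; L: Lipschitz constant of grad.
    The extrapolation point y combines the last two iterates with a
    momentum weight derived from the t_k sequence."""
    x, x_prev, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iter):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x + ((t - 1.0) / t_next) * (x - x_prev)   # momentum extrapolation
        x_prev, x = x, y - grad(y) / L                # gradient step at y
        t = t_next
    return x
```

The same momentum-coefficient sequence underlies the restarting heuristics used to stabilize acceleration on nonconvex problems.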
Non-convex regularization in remote sensing
In this paper, we study the effect of different regularizers and their
implications in high dimensional image classification and sparse linear
unmixing. Although kernelization or sparse methods are globally accepted
solutions for processing data in high dimensions, we present here a study on
the impact of the form of regularization used and its parametrization. We
consider regularization via the traditional squared ℓ2 and sparsity-promoting ℓ1
norms, as well as more unconventional nonconvex regularizers (ℓp and the Log-Sum
Penalty). We compare their properties and advantages on several classification
and linear unmixing tasks and provide advice on the choice of the best
regularizer for the problem at hand. Finally, we also provide a fully
functional toolbox for the community. Comment: 11 pages, 11 figures
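The four penalty families compared above can be written down directly; the function names and the `p`/`eps` defaults below are illustrative choices, not the paper's parametrization:

```python
import numpy as np

def ridge(x):
    return np.sum(x ** 2)                     # squared l2 norm

def lasso(x):
    return np.sum(np.abs(x))                  # l1 norm

def lp(x, p=0.5):
    return np.sum(np.abs(x) ** p)             # nonconvex lp penalty, 0 < p < 1

def log_sum(x, eps=1e-3):
    return np.sum(np.log1p(np.abs(x) / eps))  # Log-Sum Penalty
```

For two vectors of equal ℓ1 mass, the nonconvex penalties assign a strictly lower cost to the sparser one, which is why they promote sparsity more aggressively than ℓ1, at the price of a nonconvex optimization problem.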
Deep Structured Layers for Instance-Level Optimization in 2D and 3D Vision
The approach we present in this thesis is that of integrating optimization problems
as layers in deep neural networks. Optimization-based modeling provides an additional set of tools enabling the design of powerful neural networks for a wide
battery of computer vision tasks. This thesis shows formulations and experiments
for vision tasks ranging from image reconstruction to 3D reconstruction.
We first propose an unrolled optimization method with implicit regularization
properties for reconstructing images from noisy camera readings. The method resembles an unrolled majorization minimization framework with convolutional neural networks acting as regularizers. We report state-of-the-art performance in image
reconstruction on both noisy and noise-free evaluation setups across many datasets.
We further focus on the task of monocular 3D reconstruction of articulated objects using video self-supervision. The proposed method uses a structured layer for
accurate object deformation that controls a 3D surface by displacing a small number
of learnable handles. While relying on a small set of training data per category for
self-supervision, the method obtains state-of-the-art reconstruction accuracy with
diverse shapes and viewpoints for multiple articulated objects.
We finally address the shortcomings of the previous method that revolve
around regressing the camera pose using multiple hypotheses. We propose a method
that recovers a 3D shape from a 2D image by relying solely on 3D-2D correspondences regressed from a convolutional neural network. These correspondences are
used in conjunction with an optimization problem to estimate per sample the camera pose and deformation. We quantitatively show the effectiveness of the proposed
method on self-supervised 3D reconstruction on multiple categories without the need for multiple hypotheses.
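The unrolled-optimization idea in the first contribution can be caricatured as alternating data-term gradient steps with a regularizing mapping; in the thesis that mapping is a learned CNN, while the soft-threshold stub and all names below are illustrative stand-ins:

```python
import numpy as np

def soft_threshold(v, t):
    # Simple shrinkage standing in for the learned CNN regularizer.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def unrolled_reconstruction(y, n_layers=5, step=0.5, reg=0.1):
    """Unrolled majorization-minimization sketch for denoising:
    each 'layer' takes a gradient step on the data term
    0.5*||x - y||^2, then applies a regularizing mapping."""
    x = np.zeros_like(y)
    for _ in range(n_layers):
        x = x - step * (x - y)         # gradient step on the data term
        x = soft_threshold(x, reg)     # regularizer (a CNN in the thesis)
    return x
```

Treating each iteration as a network layer is what makes the whole reconstruction pipeline trainable end to end by backpropagation.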
Sparse Recovery and Representation Learning
This dissertation focuses on sparse representation and dictionary learning, with three related topics.

First, in chapter 1, we study the problem of low-rank matrix recovery in the presence of prior information. We first study the recovery of low-rank matrices under a necessary and sufficient condition, called the Null Space Property, for exact recovery from compressively sampled measurements using nuclear norm minimization. Here, we provide an alternative theoretical analysis of the bound on the number of random Gaussian measurements needed for the condition to be satisfied with high probability. We then study low-rank matrix recovery when prior information is available. We analyze an existing algorithm, provide necessary and sufficient conditions for exact recovery, and show that the existing algorithm is limited in certain cases. We propose an alternative recovery algorithm to address this drawback and provide corresponding sufficient recovery conditions.

In chapter 2, we study the problem of learning a sparsifying dictionary for a set of data, focusing on dictionaries that admit fast transforms. Inspired by the Fast Fourier Transform, we propose a learning algorithm involving unknown parameters for a linear transformation matrix. Empirically, our algorithm produces dictionaries that yield lower numerical sparsity for the sparse representation of images than the Discrete Fourier Transform (DFT). Additionally, owing to its structure, the learned dictionary can recover the original signal from the sparse representation efficiently.

In chapter 3, we study the representation learning problem in a more complex setting, using the concept of dictionary learning within a deep generative model. Motivated by an application in the computer gaming industry, where designers need an urban-layout generation tool that allows fast generation and modification, we present a novel solution that synthesizes high-quality building placements using conditional generative latent optimization together with adversarial training. The capability of the proposed method is demonstrated in various examples. Inference runs nearly in real time, so the tool can help designers iterate quickly on their designs of virtual cities.
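Nuclear norm minimization, of the kind analyzed in chapter 1, is typically implemented via singular value thresholding, the proximal operator of the nuclear norm; this generic sketch is included for illustration and is not the dissertation's specific algorithm:

```python
import numpy as np

def svt(Y, tau):
    """Singular value thresholding: the proximal operator of
    tau * ||.||_* (nuclear norm). Shrinks each singular value of Y
    toward zero by tau, zeroing out the small ones and thereby
    reducing the rank of the result."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```

Iterating this operator inside a proximal gradient loop on the measurement-consistency term yields a standard solver for the compressively sampled low-rank recovery problem.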