56 research outputs found

    Proximal methods for structured group features and correlation matrix nearness

    Unpublished doctoral thesis, read at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Defense date: June 2014.
    Optimization is ubiquitous in real life, as many of the strategies followed both by nature and by humans aim to minimize a certain cost or maximize a certain benefit. More specifically, numerous strategies in engineering are designed according to a minimization problem; usually the problems tackled are convex with a differentiable objective function, since such problems have no local minima and can be solved with gradient-based techniques. Nevertheless, many interesting problems are not differentiable, such as, for instance, projection problems or problems based on non-smooth norms. An approach to deal with them can be found in the theory of Proximal Methods (PMs), which are based on iterative local minimizations using the Proximity Operator (ProxOp) of the terms that compose the objective function. This thesis begins with a general introduction and a brief motivation of the work done. The state of the art in PMs is thoroughly reviewed, defining the basic concepts from the very beginning and describing the main algorithms, as far as possible, in a simple and self-contained way. After that, the PMs are employed in the field of supervised regression, where regularized models play a prominent role. In particular, some classical linear sparse models are reviewed and unified under the point of view of regularization, namely the Lasso, the Elastic–Network, the Group Lasso and the Group Elastic–Network. All these models are trained by minimizing an error term plus a regularization term, and thus they fit nicely in the domain of PMs, as the structure of the problem can be exploited by alternately minimizing the different expressions that compose the objective function, in particular using the Fast Iterative Shrinkage–Thresholding Algorithm (FISTA).
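As an illustration of how such a ProxOp-based scheme operates, the sketch below applies FISTA to the Lasso, where the proximity operator of the l1 regularizer reduces to elementwise soft-thresholding. The function names and the step-size choice are illustrative, not taken from the thesis:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximity operator of t * ||.||_1: elementwise shrinkage toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_lasso(A, b, lam, n_iter=500):
    # FISTA for min_w 0.5 * ||A w - b||^2 + lam * ||w||_1.
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the smooth part
    w = np.zeros(A.shape[1])
    z, t = w.copy(), 1.0
    for _ in range(n_iter):
        w_next = soft_threshold(z - A.T @ (A @ z - b) / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)  # momentum extrapolation
        w, t = w_next, t_next
    return w
```

The same alternation between a gradient step on the error term and a prox step on the regularizer carries over to the group penalties discussed above, with the prox applied blockwise.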
As a real-world application, it is shown how these models can be used to forecast wind energy, where they yield both good predictions in terms of the error and, more importantly, valuable information about the structure and distribution of the relevant features. Continuing with the regularized learning approach, a new regularizer is proposed, called the Group Total Variation, which is a group extension of the classical Total Variation regularizer and thus imposes constancy over groups of features. In order to deal with it, an approach to compute its ProxOp is derived. Moreover, it is shown that this regularizer can be used directly to clean noisy multidimensional signals (such as colour images) or to define a new linear model, the Group Fused Lasso (GFL), which can then be trained using FISTA. It is also exemplified how this model, when applied to regression problems, is able to provide solutions that identify the underlying problem structure. As an additional result of this thesis, a public software implementation of the GFL model is provided. The PMs are also applied to the Nearest Correlation Matrix problem under observation uncertainty. The original problem consists of finding the correlation matrix which is nearest to the true empirical one. Some variants introduce weights to adapt the confidence given to each entry of the matrix; with a more general perspective, in this thesis the problem is explored directly considering uncertainty on the observations, which is formalized as a set of intervals where the measured matrices lie. Two different variants are defined under this framework: a robust approach called the Robust Nearest Correlation Matrix (which aims to minimize the worst-case scenario) and an exploratory approach, the Exploratory Nearest Correlation Matrix (which focuses on the best-case scenario). It is shown how both optimization problems can be solved using the Douglas–Rachford PM with a suitable splitting of the objective functions.
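The Douglas–Rachford splitting mentioned above can be illustrated on the plain (non-robust) nearest-correlation-matrix problem. The splitting below — a quadratic distance term plus the unit-diagonal constraint on one side, the PSD cone on the other — is a hypothetical choice for illustration, not the thesis's exact formulation of its robust and exploratory variants:

```python
import numpy as np

def proj_psd(X):
    # Projection onto the positive-semidefinite cone (eigenvalue clipping).
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * np.maximum(w, 0)) @ V.T

def prox_dist(Z, G, gamma):
    # Prox of gamma * (0.5*||X - G||_F^2 + indicator{diag(X) = 1}):
    # weighted average with G off the diagonal, ones on the diagonal.
    X = (Z + gamma * G) / (1 + gamma)
    X = (X + X.T) / 2
    np.fill_diagonal(X, 1.0)
    return X

def nearest_correlation_dr(G, gamma=1.0, n_iter=2000):
    # Douglas-Rachford splitting for
    #   min_X 0.5*||X - G||_F^2  s.t.  X is PSD and diag(X) = 1.
    Z = G.copy()
    for _ in range(n_iter):
        X = prox_dist(Z, G, gamma)
        Z = Z + proj_psd(2 * X - Z) - X
    return prox_dist(Z, G, gamma)
```

Each iteration only needs the two proximal maps separately, which is the appeal of the splitting: neither the distance-plus-diagonal term nor the PSD constraint has to be handled jointly.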
The thesis ends with a brief overall discussion and pointers to further work.

    Numerical splitting methods for nonsmooth convex optimization problems

    In this thesis, we develop and investigate numerical methods for solving nonsmooth convex optimization problems in real Hilbert spaces. We construct algorithms that handle the terms in the objective function and the constraints of the minimization problems separately, which makes these methods simpler to compute. In the first part of the thesis, we extend the well-known AMA method of Tseng to the Proximal AMA algorithm by introducing variable metrics in the subproblems of the primal-dual algorithm. For a special choice of metrics, the subproblems become proximal steps. Thus, for objectives in many important applications, such as signal and image processing, machine learning or statistics, the iteration process consists of expressions in closed form that are easy to calculate. In the further course of the thesis, we intensify the investigation of this algorithm by considering and studying a dynamical system. Through explicit time discretization of this system, we obtain Proximal AMA. We show the existence and uniqueness of strong global solutions of the dynamical system and prove that its trajectories converge to the primal-dual solution of the considered optimization problem. In the last part of this thesis, we minimize a sum of finitely many nonsmooth convex functions (each possibly composed with a linear operator) over a nonempty, closed and convex set by smoothing these functions. We consider a stochastic algorithm in which we take gradient steps of the smoothed functions (which are proximal steps if we smooth by the Moreau envelope), and use a mirror map to "mirror" the iterates onto the feasible set. In applications, we compare these methods to similar ones and discuss the advantages and practical usability of the new algorithms.
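The claim that gradient steps on a Moreau-smoothed function become proximal steps can be checked concretely for f = |·|, whose Moreau envelope is the Huber function and whose prox is soft-thresholding; the gradient of the envelope is (x − prox(x))/γ. A minimal sketch (function names are illustrative):

```python
import numpy as np

def prox_abs(x, gamma):
    # Prox of gamma * |.|: soft-thresholding.
    return np.sign(x) * np.maximum(np.abs(x) - gamma, 0.0)

def moreau_env_abs(x, gamma):
    # Moreau envelope of |.|: e(x) = min_y |y| + (x - y)^2 / (2*gamma),
    # attained at y = prox_abs(x, gamma); this is the Huber function.
    p = prox_abs(x, gamma)
    return np.abs(p) + (x - p) ** 2 / (2 * gamma)

def grad_moreau_env_abs(x, gamma):
    # Gradient of the envelope: (x - prox(x)) / gamma.
    return (x - prox_abs(x, gamma)) / gamma
```

A gradient step of length γ on the envelope, x − γ·∇e(x), therefore equals prox_abs(x, γ) exactly, which is the identity the smoothing approach above exploits.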

    Phase and audio signal reconstruction with non-quadratic cost functions

    Audio signal reconstruction consists in recovering sound signals from incomplete or degraded representations. This problem can be cast as an inverse problem. Such problems are frequently tackled with the help of optimization or machine learning strategies. In this thesis, we propose to change the cost function in inverse problems related to audio signal reconstruction. We mainly address the phase retrieval problem, which is common when manipulating audio spectrograms. A first line of work tackles the optimization of non-quadratic cost functions for phase retrieval. We study this problem in two contexts: audio signal reconstruction from a single spectrogram and source separation. We introduce a novel formulation of the problem with Bregman divergences, as well as algorithms for solving it. A second line of work proposes to learn the cost function from a given dataset. This is done within the framework of unfolded neural networks, which are derived from iterative algorithms. We introduce a neural network based on the unfolding of the Alternating Direction Method of Multipliers, which includes learnable activation functions. We show how learning its parameters relates to learning the cost function for phase retrieval. We conduct numerical experiments for each of the proposed methods to evaluate their performance and their potential for audio signal reconstruction.
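As background on what phase retrieval asks, the classical alternating-projection (Gerchberg–Saxton style) scheme on a plain DFT magnitude with a support constraint is sketched below. The thesis works with spectrograms, Bregman divergences and unfolded networks rather than this scheme; the setup here is an illustrative toy problem:

```python
import numpy as np

def phase_retrieval_ap(mag, support, n_iter=2000, seed=0):
    # Alternating projections for recovering x from |FFT(x)| = mag,
    # with x known to be real and zero outside `support` (boolean mask).
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(mag.shape) * support
    for _ in range(n_iter):
        X = np.fft.fft(x)
        X = mag * np.exp(1j * np.angle(X))   # enforce measured magnitudes
        x = np.fft.ifft(X).real * support    # enforce realness and support
    return x
```

Each iteration keeps the current Fourier phase and replaces the magnitude by the measurement; the non-quadratic cost functions studied in the thesis change how this magnitude mismatch is measured.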

    Reconstruction using local sparsity: a novel regularization technique and an asymptotic analysis of spatial sparsity priors

    The specific field of inverse problems where the unknown has, in addition to its spatial dimensions, at least one further dimension is of major interest for many applications in imaging, the natural sciences and medicine. Enforcing certain sparsity priors on such unknowns, which can be written as matrices, has become the current state of research. This thesis deals with a special type of sparsity prior, which enforces a certain structure on the unknown matrix. We present and analyze a novel regularization technique promoting so-called local sparsity by minimizing the l^{1,inf}-norm as a regularization functional in a variational approach. Furthermore, we theoretically analyze the asymptotics of spatial sparsity priors. We consider discrete sparsity-promoting functionals and analyze their behavior as the discretization becomes finer. In so doing, we are able to compute some Gamma-limits. We not only consider the usual l^p-norms for p ≥ 1, but also analyze the asymptotics of the l^0-“norm”.
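The l^{1,inf}-norm itself is simple to evaluate. A minimal sketch, under the generic mixed-norm convention that the sup is taken within each group and the l^1 sum over the groups (which matrix dimension plays the role of the groups is a modeling choice, assumed here to be the rows):

```python
import numpy as np

def l1inf_norm(U, axis=1):
    # Mixed l^{1,inf} norm of a matrix: the sup (max) is taken within each
    # group (along `axis`) and the l^1 sum runs over the groups. As a
    # penalty, the outer l^1 sum drives whole groups to zero, while the
    # inner sup couples the entries inside a group.
    return np.sum(np.max(np.abs(U), axis=axis))
```

The analytical difficulty addressed in the thesis lies not in evaluating this functional but in its proximal map and its variational analysis, since the inner sup is non-separable.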

    Part-based recognition of 3-D objects with application to shape modeling in hearing aid manufacturing

    To meet the needs of people with hearing loss, today's hearing aids are custom designed. Increasingly accurate 3-D scanning technology has contributed to the transition from conventional production scenarios to software-based processes. Nonetheless, a tremendous amount of manual work is involved in transforming an input 3-D surface mesh of the outer ear into a final hearing aid shape. This manual work is often cumbersome and requires a lot of experience, which is why automatic solutions are of high practical relevance. This work is concerned with the recognition of 3-D surface meshes of ear implants. In particular, we present a semantic part-labeling framework which significantly outperforms existing approaches for this task. We make at least three contributions which may also prove useful for other classes of 3-D meshes. Firstly, we validate the discriminative performance of several local descriptors and show that the majority of them perform poorly on our data, with the exception of 3-D shape contexts. The reason for this is that many local descriptor schemes are not rich enough to capture the subtle variations in the form of bends that are typical for organic shapes. Secondly, based on the observation that the left and the right outer ear of an individual look very similar, we raised the question of how similar ear shapes are among arbitrary individuals. In this work, we define a notion of distance between ear shapes as the building block of a non-parametric shape model of the ear, to better handle the anatomical variability in ear implant labeling. Thirdly, we introduce a conditional random field model with a variety of label priors to facilitate the semantic part-labeling of 3-D meshes of ear implants. In particular, we introduce the concept of a global parametric transition prior to enforce transition boundaries between adjacent object parts with an a priori known parametric form.
In this way we were able to overcome the issue of inadequate geometric cues (e.g., ridges, bumps, concavities) as natural indicators for the presence of part boundaries. The last part of this work offers an outlook on possible extensions of our methods, in particular the development of 3-D descriptors that are fast to compute while at the same time rich enough to capture the characteristic differences between objects residing in the same class.

    Elastic plastic damage laws for cortical bone

    Motivated by applications in orthopaedic and maxillo-facial surgery, the mechanical behaviour of cortical bone tissue in cyclic overloads at physiological strain rates is investigated. The emphasis is on the development of appropriate constitutive laws that faithfully reproduce the loading, unloading, and reloading sequence observed during experimental in vitro uniaxial testing. To this end, the models include three distinct modes of evolution, namely a linear elastic mode due to bone cohesion, a damage mode where microcracks are generated and a plastic mode corresponding to sliding at the microcracks. The proposed models use the internal state variable approach common in continuum damage mechanics and allow a straightforward interpretation of the constitutive behaviour of cortical bone. They are derived within the generalized standard materials formalism and are thus thermodynamically consistent. The mathematical formulation of the models is based on the definition of two internal state variables: a damage variable that represents the microcrack density reducing the tissue stiffness, and a plastic strain variable representing the deformation associated with these microcracks. Firstly, two one-dimensional models describing the uniaxial quasistatic behaviour of cortical bone are developed. The first one includes a single scalar damage variable, whereas the second one is based on tensile and compressive damage variables, which improves the simulation results. Both models are then extended into rate-dependent alternatives by relating the rate of damage accumulation to some high power of the damage threshold stress. All four models consider different tensile and compressive damage threshold stresses, as is the case for cortical bone. Secondly, the material constants characterizing the one-dimensional models are identified on experimental grounds. To this end, a series of in vitro uniaxial overloading tests were carried out on bovine cortical bone.
Reliable measurements were obtained in tension using dumbbell specimens, thus avoiding undesirable boundary effects. Thirdly, a three-dimensional rate-independent constitutive law inspired by the one-dimensional models is formulated and implemented in a finite element code. It includes porous fabric-based orthotropic elasticity and rate-independent plasticity with damage. The onset of damage is characterized by an orthotropic stress-based damage criterion described by porosity and fabric, which takes into account distinct tensile and compressive damage threshold stresses. Finally, the potential of the new three-dimensional elastic plastic damage constitutive law for cortical bone is demonstrated by means of a finite element analysis of the compression of a vertebra.
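The internal-state-variable structure described above — stress of the form (1 − D)·E·(ε − εp), with damage D reducing stiffness and plastic strain εp representing sliding at the microcracks — can be evoked with a schematic 1-D update. All parameter values and evolution rules below are hypothetical placeholders, not the identified constants or the exact evolution laws of the thesis:

```python
import numpy as np

def stress_update(eps_seq, E=20e3, sigma0=60.0, h=0.1):
    # Schematic 1-D elastic-damage-plastic law (illustrative parameters):
    # sigma = (1 - D) * E * (eps - eps_p). Beyond the threshold stress
    # sigma0, a fraction h of the strain overshoot is converted into
    # damage and the rest into plastic sliding at the microcracks.
    D, eps_p = 0.0, 0.0
    hist = []
    for eps in eps_seq:
        s_trial = E * (eps - eps_p)              # trial (undamaged) stress
        if abs(s_trial) > sigma0:                # damage/plastic mode
            over = (abs(s_trial) - sigma0) / E   # overshoot strain
            eps_p += np.sign(s_trial) * (1 - h) * over  # plastic part
            D = min(D + h * over / 0.01, 0.95)   # damage part (0.01: ref. strain)
        hist.append((1.0 - D) * E * (eps - eps_p))
    return np.array(hist), D, eps_p
```

The qualitative behaviour matches the three modes named above: a linear elastic branch below the threshold, then a softening branch on which both D and εp grow, so that any subsequent unloading follows a reduced stiffness (1 − D)·E.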

    Learning with Structured Sparsity: From Discrete to Convex and Back.

    In modern data-analysis applications, the abundance of data makes extracting meaningful information from it challenging in terms of computation, storage, and interpretability. In this setting, exploiting sparsity in data has been essential to the development of scalable methods for problems in machine learning, statistics and signal processing. However, in various applications, the input variables exhibit structure beyond simple sparsity. This motivated the introduction of structured sparsity models, which capture such sophisticated structures, leading to significant performance gains and better interpretability. Structured sparse approaches have been successfully applied in a variety of domains including computer vision, text processing, medical imaging, and bioinformatics. The goal of this thesis is to improve on these methods and expand their success to a wider range of applications. We thus develop novel methods to incorporate general structure a priori in learning problems, balancing computational and statistical efficiency trade-offs. To achieve this, our results bring together tools from the rich areas of discrete and convex optimization. Applying structured sparsity approaches in general is challenging because the structures encountered in practice are naturally combinatorial. An effective approach to circumvent this computational challenge is to employ continuous convex relaxations. We thus start by introducing a new class of structured sparsity models, able to capture a large range of structures, which admit tight convex relaxations amenable to efficient optimization. We then present an in-depth study of the geometric and statistical properties of convex relaxations of general combinatorial structures. In particular, we characterize which structure is lost by imposing convexity and which is preserved. We then focus on the optimization of the convex composite problems that result from the convex relaxations of structured sparsity models.
We develop efficient algorithmic tools to solve these problems in a non-Euclidean setting, leading to faster convergence in some cases. Finally, to handle structures that do not admit meaningful convex relaxations, we propose to use, as a heuristic, a non-convex proximal gradient method that is efficient for several classes of structured sparsity models. We further extend this method to address a probabilistic structured sparsity model, which we introduce to model approximately sparse signals.
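The non-convex proximal gradient heuristic can be illustrated with the simplest combinatorial structure, plain k-sparsity, where the proximal map of the combinatorial constraint is hard thresholding (iterative hard thresholding). This is an illustrative instance of the idea, not the thesis's general method for arbitrary structures:

```python
import numpy as np

def hard_threshold(v, k):
    # Prox of the indicator of {||w||_0 <= k}: keep the k largest entries.
    w = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    w[idx] = v[idx]
    return w

def iht(A, b, k, n_iter=300):
    # Non-convex proximal gradient (iterative hard thresholding) for
    # min_w 0.5 * ||A w - b||^2  s.t.  ||w||_0 <= k.
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # gradient step size
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        w = hard_threshold(w - step * A.T @ (A @ w - b), k)
    return w
```

For richer structures the projection onto the combinatorial set is harder, which is exactly where the convex relaxations studied above become attractive.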

    Learning to Rank: Online Learning, Statistical Theory and Applications.

    Learning to rank is a supervised machine learning problem, where the output space is the special structured space of permutations. Learning to rank has diverse application areas, spanning information retrieval, recommendation systems, computational biology and others. In this dissertation, we make contributions to some of the exciting directions of research in learning to rank. In the first part, we extend the classic online perceptron algorithm for classification to learning to rank, giving a loss bound which is reminiscent of Novikoff's famous convergence theorem for classification. In the second part, we give strategies for learning ranking functions in an online setting with a novel feedback model, where feedback is restricted to labels of top ranked items. The second part of our work is divided into two sub-parts: one without side information and one with side information. In the third part, we provide novel generalization error bounds for algorithms applied to various Lipschitz and/or smooth ranking surrogates. In the last part, we apply ranking losses to learn policies for personalized advertisement recommendations, partially overcoming the problem of click sparsity. We conduct experiments on various simulated and commercial datasets, comparing our strategies with baseline strategies for online learning to rank and personalized advertisement recommendation. PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/133334/1/sougata_1.pd
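The flavour of a perceptron-style ranking algorithm can be conveyed with a generic pairwise ranking perceptron, which updates a linear score whenever a pair is ranked in the wrong order. This is an illustrative sketch under that pairwise setup, not the dissertation's exact algorithm or feedback model:

```python
import numpy as np

def ranking_perceptron(pairs, n_epochs=100):
    # Pairwise ranking perceptron: `pairs` is a list of (x_pos, x_neg)
    # feature vectors where x_pos should be scored above x_neg.
    # On each violated pair, update w toward the difference vector.
    d = len(pairs[0][0])
    w = np.zeros(d)
    for _ in range(n_epochs):
        for x_pos, x_neg in pairs:
            if w @ (x_pos - x_neg) <= 0:   # ranking mistake
                w = w + (x_pos - x_neg)    # perceptron update
    return w
```

On linearly separable pairs, the classical Novikoff argument bounds the number of updates in terms of the margin, which is the style of loss bound referred to above.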