933 research outputs found

Certifying the absence of spurious local minima at infinity

Authors: Josz Cédric, Li Xiaopeng
Publication date: 13/07/2023

When searching for global optima of nonconvex unconstrained optimization problems, it is desirable that every local minimum be a global minimum. This property of having no spurious local minima holds in various problems of current interest, including principal component analysis, matrix sensing, and linear neural networks. However, since these problems are non-coercive, they may still have spurious local minima at infinity. The classical tools used to analyze the optimization landscape, namely the gradient and the Hessian, are incapable of detecting spurious local minima at infinity. In this paper, we identify conditions that certify the absence of spurious local minima at infinity, one of which is having bounded subgradient trajectories. We check that they hold in several applications of interest. Comment: 31 pages, 4 figures

arXiv.org e-Print Archive
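
The idea of a spurious local minimum "at infinity" in the abstract above can be illustrated informally with a one-dimensional toy function. The function `f`, the step size, and the starting points below are assumptions chosen for illustration only; they are not taken from the paper, and the paper's formal definition may differ. The only finite local minimum of `f` is the global one at x = 0, yet gradient descent started far enough to the right drifts off to infinity while `f` settles toward 1 instead of 0.

```python
# Toy function (an assumption for illustration, not from the paper):
#   f(x) = 1 + (x^2 - 1) / (1 + x^4)
# It has a single finite local minimum at x = 0 with f(0) = 0,
# local maxima at x = +/- sqrt(1 + sqrt(2)), and f(x) -> 1 as |x| -> infinity.

def f(x):
    return 1.0 + (x**2 - 1.0) / (1.0 + x**4)

def grad_f(x):
    # Derivative of f, computed by hand with the quotient rule.
    return 2.0 * x * (1.0 + 2.0 * x**2 - x**4) / (1.0 + x**4) ** 2

def gradient_descent(x0, step=0.1, iters=2000):
    # Plain fixed-step gradient descent.
    x = x0
    for _ in range(iters):
        x -= step * grad_f(x)
    return x

# Start inside the basin of the global minimum.
x_near = gradient_descent(1.0)
# Start beyond the local maximum: the iterates keep growing.
x_far = gradient_descent(2.0)

print("start 1.0 ->", x_near, "f =", f(x_near))
print("start 2.0 ->", x_far, "f =", f(x_far))
```

In this sketch the gradient and the Hessian vanish nowhere along the escaping trajectory, so checking stationary points alone would not reveal the problem, which is consistent with the abstract's remark about the classical tools.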

Statistical Physics of Design

Author: Klishin Andrei
Publication date: 01/01/2020

Modern life increasingly relies on complex products that perform a variety of functions. The key difficulty of creating such products lies not in the manufacturing process, but in the design process. However, design problems are typically driven by multiple contradictory objectives and different stakeholders, have no obvious stopping criteria, and frequently prevent construction of prototypes or experiments. Such ill-defined, or "wicked" problems cannot be "solved" in the traditional sense with optimization methods. Instead, modern design techniques are focused on generating knowledge about the alternative solutions in the design space. In order to facilitate such knowledge generation, in this dissertation I develop the "Systems Physics" framework that treats the emergent structures within the design space as physical objects that interact via quantifiable forces. Mathematically, Systems Physics is based on maximal entropy statistical mechanics, which allows both drawing conceptual analogies between design problems and collective phenomena and performing numerical calculations to gain quantitative understanding. Systems Physics operates via a Model-Compute-Learn loop, with each step refining our thinking of design problems. I demonstrate the capabilities of Systems Physics in two very distinct case studies: Naval Engineering and self-assembly. For the Naval Engineering case, I focus on an established problem of arranging shipboard systems within the available hull space. I demonstrate the essential trade-off between minimizing the routing cost and maximizing the design flexibility, which can lead to abrupt phase transitions. I show how the design space can break into several locally optimal architecture classes that have very different robustness to external couplings. I illustrate how the topology of the shipboard functional network enters a tight interplay with the spatial constraints on placement. 
For the self-assembly problem, I show that the topology of self-assembled structures can be reliably encoded in the properties of the building blocks so that the structure and the blocks can be jointly designed. The work presented here provides both conceptual and quantitative advancements. In order to properly port the language and the formalism of statistical mechanics to the design domain, I critically re-examine such foundational ideas as system-bath coupling, coarse graining, particle distinguishability, and direct and emergent interactions. I show that the design space can be packed into a special information structure, a tensor network, which allows seamless transition from graphical visualization to sophisticated numerical calculations. This dissertation provides the first quantitative treatment of the design problem that is not reduced to the narrow goals of mathematical optimization. Using a statistical mechanics perspective allows me to move beyond the dichotomy of "forward" and "inverse" design and frame design as a knowledge generation process instead. Such framing opens the way to further studies of the design space structures and the time- and path-dependent phenomena in design. The present work also benefits from, and contributes to, the philosophical interpretations of statistical mechanics developed by the soft matter community in the past 20 years. The discussion goes far beyond physics and engages with literature from materials science, naval engineering, optimization problems, design theory, network theory, and economic complexity. PhD, Physics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/163133/1/aklishin_1.pd

Deep Blue Documents at the University of Michigan

Fitness Landscape Analysis of Feed-Forward Neural Networks

Author: Bosman Anna Sergeevna
Publication venue: 'University of Pretoria - Department of Philosophy'
Publication date: 01/01/2019

Neural network training is a highly non-convex optimisation problem with poorly understood properties. Due to the inherent high dimensionality, neural network search spaces cannot be intuitively visualised, thus other means to establish search space properties have to be employed. Fitness landscape analysis encompasses a selection of techniques designed to estimate the properties of a search landscape associated with an optimisation problem. Applied to neural network training, fitness landscape analysis can be used to establish a link between the properties of the error landscape and various neural network hyperparameters. This study applies fitness landscape analysis to investigate the influence of the search space boundaries, regularisation parameters, loss functions, activation functions, and feed-forward neural network architectures on the properties of the resulting error landscape. A novel gradient-based sampling technique is proposed, together with a novel method to quantify and visualise stationary points and the associated basins of attraction in neural network error landscapes. Thesis (PhD)--University of Pretoria, 2019. NRF. Computer Science. PhD. Unrestricted

UPSpace at the University of Pretoria

Clustering of Cases from Different Subtypes of Breast Cancer Using a Hopfield Network Built from Multi-omic Data

Author: Calderón-Achío Olger Kitchion
Publication venue: 'Instituto Tecnologico de Costa Rica'
Publication date: 01/01/2018

Graduation thesis (Master's in Computing), Instituto Tecnológico de Costa Rica, Escuela de Computación, 2018. Despite scientific advances, breast cancer still constitutes a major worldwide cause of death among women. Given the great heterogeneity between cases, distinct classification schemes have emerged. The intrinsic molecular subtype classification (luminal A, luminal B, HER2-enriched and basal-like) accounts for the molecular characteristics and prognosis of tumors, which provides valuable input for choosing optimal treatment actions. Also, recent advancements in molecular biology have provided scientists with omic-like data of high quality and diversity, opening up the possibility of creating computational models for improving and validating current subtyping systems. In this study, a Hopfield Network model for breast cancer subtyping and characterization was created using data from The Cancer Genome Atlas repository. Novel aspects include the use of the network as a clustering mechanism and the integrated use of several molecular types of data (gene mRNA expression, miRNA expression and copy number variation). The results showed clustering capabilities for the network, but even so, trying to derive a biological model from a Hopfield Network might be difficult given the mirror attractor phenomenon (every cluster might end up with an opposite). As a methodological aspect, the Hopfield Network was compared with the k-means and OPTICS clustering algorithms. OPTICS, surprisingly, hints at the possibility of creating a high-precision model that differentiates between luminal, HER2-enriched and basal samples using only 10 genes. The normalization procedure of dividing gene expression values by their corresponding gene copy number appears to have contributed to the results. This opens up the possibility of exploring these kinds of prediction models for implementing diagnostic tests at a lower cost.

Repositorio Institucional del Instituto Tecnologico de Costa Rica
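
The mirror attractor phenomenon mentioned in the abstract above can be seen in a minimal Hopfield network sketch. Everything here is an assumption for illustration: the two 8-unit patterns are hypothetical, the weights use the standard Hebbian rule with a zero diagonal, and the update is synchronous. None of it reproduces the thesis's data or implementation. The point is only that whenever a pattern is a fixed point of the sign update, its negation is one as well.

```python
def hebbian_weights(patterns):
    # Standard Hebbian rule: w[i][j] = (1/n) * sum over patterns of p[i] * p[j],
    # with no self-connections (zero diagonal).
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def update(w, state):
    # One synchronous sign update; a zero local field keeps the current value.
    new_state = []
    for i, row in enumerate(w):
        field = sum(w_ij * s_j for w_ij, s_j in zip(row, state))
        if field > 0:
            new_state.append(1)
        elif field < 0:
            new_state.append(-1)
        else:
            new_state.append(state[i])
    return new_state

def is_fixed_point(w, state):
    return update(w, state) == list(state)

# Two hypothetical, mutually orthogonal patterns of +1/-1 values.
patterns = [
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, 1, 1, -1, -1, -1, -1],
]
weights = hebbian_weights(patterns)

for p in patterns:
    mirror = [-s for s in p]
    print(p, "fixed:", is_fixed_point(weights, p),
          "| mirror fixed:", is_fixed_point(weights, mirror))
```

Because the update depends only on the sign of a linear function of the state, flipping every unit flips every local field, so attractors always come in opposite pairs. That is why a cluster read off from one attractor may come with an "opposite" cluster.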