
    From Symmetry to Geometry: Tractable Nonconvex Problems

    As science and engineering have become increasingly data-driven, the role of optimization has expanded to touch almost every stage of the data analysis pipeline, from signal and data acquisition to modeling and prediction. The optimization problems encountered in practice are often nonconvex. While challenges vary from problem to problem, one common source of nonconvexity is nonlinearity in the data or measurement model. Nonlinear models often exhibit symmetries, creating complicated, nonconvex objective landscapes with multiple equivalent solutions. Nevertheless, simple methods (e.g., gradient descent) often perform surprisingly well in practice. The goal of this survey is to highlight a class of tractable nonconvex problems which can be understood through the lens of symmetries. These problems exhibit a characteristic geometric structure: local minimizers are symmetric copies of a single "ground truth" solution, while other critical points occur at balanced superpositions of symmetric copies of the ground truth and exhibit negative curvature in directions that break the symmetry. This structure enables efficient methods to obtain global minimizers. We discuss examples of this phenomenon arising from a wide range of problems in imaging, signal processing, and data analysis. We highlight the key role of symmetry in shaping the objective landscape and discuss the different roles of rotational and discrete symmetries. This area is rich with observed phenomena and open problems; we close by highlighting directions for future research. (Review paper submitted to SIAM Review; 34 pages, 10 figures.)
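    The landscape described above can be reproduced in a few lines. The sketch below (ours, not the survey's) runs plain gradient descent on the rank-one factorization objective f(x) = (1/4)||x xᵀ - z zᵀ||²_F, whose global minimizers are the symmetric copies +z and -z and whose saddle at x = 0 has negative curvature along z; the signal z and all values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(20)            # hypothetical ground-truth signal
z /= np.linalg.norm(z)

def grad(x):
    # gradient of f(x) = 1/4 * ||x x^T - z z^T||_F^2
    return (x @ x) * x - (z @ x) * z

x = 0.1 * rng.standard_normal(20)      # random initialization
for _ in range(2000):
    x -= 0.1 * grad(x)

# x lands on +z or -z depending on the initialization (sign symmetry)
print(min(np.linalg.norm(x - z), np.linalg.norm(x + z)))  # ~0
```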

    Relating Spontaneous Activity and Cognitive States via NeuroDynamic Modeling

    Stimulus-free brain dynamics form the basis of current knowledge concerning functional integration and segregation within the human brain. These relationships are typically described in terms of resting-state brain networks (regions which spontaneously coactivate). However, despite the interest in the anatomical mechanisms and biobehavioral correlates of stimulus-free brain dynamics, little is known regarding the relation between spontaneous brain dynamics and task-evoked activity. In particular, no computational framework has previously been proposed to unite spontaneous and task dynamics under a single, data-driven model. Model development in this domain will provide new insight regarding the mechanisms by which exogenous stimuli and intrinsic neural circuitry interact to shape human cognition. The current work bridges this gap by deriving and validating a new technique, termed Mesoscale Individualized NeuroDynamic (MINDy) modeling, to estimate large-scale neural population models for individual human subjects using resting-state fMRI. A combination of ground-truth simulations and test-retest data is used to demonstrate that the approach is robust to various forms of noise, motion, and data-processing choices. The MINDy formalism is then extended to simultaneously estimate neural population models and the neurovascular coupling which gives rise to BOLD fMRI. In doing so, I develop and validate a new optimization framework for simultaneously estimating system states and parameters. Lastly, MINDy models derived from resting-state data are used to predict task-based activity and remove the effects of intrinsic dynamics. Removing the MINDy model predictions from task fMRI enables separation of exogenously driven components of activity from their indirect consequences (the model predictions). Results demonstrate that removing the predicted intrinsic dynamics improves detection of event-triggered and sustained responses across four cognitive tasks. Together, these findings validate the MINDy framework and demonstrate that MINDy models predict brain dynamics across contexts. These dynamics contribute to the variance of task-evoked brain activity between subjects, and removing the influence of intrinsic dynamics improves the estimation of task effects.
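    As a rough illustration of the modeling idea (a minimal sketch under simplifying assumptions, not the dissertation's implementation), one can fit a discrete-time neural population model x[t+1] = x[t] + dt(W ψ(x[t]) - D x[t]) to resting-state series by least squares on the one-step residual, then subtract the fitted model's predictions from task data. Here tanh stands in for the MINDy transfer function ψ, the sampling interval dt is a hypothetical choice, and the decay is fit as a full matrix block for brevity.

```python
import numpy as np

def fit_linear_in_params(X, dt=0.72):
    """X: (regions, timepoints) resting-state series. Estimate W and D."""
    n, T = X.shape
    dX = (X[:, 1:] - X[:, :-1]) / dt                   # one-step derivative estimate
    # regressors: tanh(x[t]) for the connectivity W, and -x[t] for the decay D
    Phi = np.vstack([np.tanh(X[:, :-1]), -X[:, :-1]])  # shape (2n, T-1)
    # least squares: dX ≈ [W, D] @ Phi  (MINDy itself constrains D to a diagonal)
    coef, *_ = np.linalg.lstsq(Phi.T, dX.T, rcond=None)
    W, D = coef.T[:, :n], coef.T[:, n:]
    return W, D

def residual_task_activity(X_task, W, D, dt=0.72):
    """Remove the model-predicted intrinsic flow from task data."""
    x = X_task[:, :-1]
    pred = x + dt * (W @ np.tanh(x) - D @ x)           # model's one-step prediction
    return X_task[:, 1:] - pred                        # exogenously driven residual
```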

    On consciousness, resting state fMRI, and neurodynamics


    A Review on Deep Learning in Medical Image Reconstruction

    Medical imaging is crucial in modern clinics to guide the diagnosis and treatment of diseases. Medical image reconstruction is one of the most fundamental and important components of medical imaging, whose major objective is to acquire high-quality medical images for clinical usage at minimal cost and risk to the patients. Mathematical models in medical image reconstruction or, more generally, image restoration in computer vision, have been playing a prominent role. Earlier mathematical models were mostly designed from human knowledge or hypotheses about the image to be reconstructed; we shall call these handcrafted models. Later, handcrafted-plus-data-driven modeling started to emerge, which still mostly relies on human designs, while part of the model is learned from the observed data. More recently, as more data and computational resources have become available, deep-learning-based models (or deep models) have pushed data-driven modeling to the extreme, where the models are mostly based on learning with minimal human design. Both handcrafted and data-driven modeling have their own advantages and disadvantages. One of the major research trends in medical imaging is to combine handcrafted modeling with deep modeling so that we can enjoy the benefits of both approaches. The major part of this article provides a conceptual review of some recent works on deep modeling from the unrolling-dynamics viewpoint. This viewpoint stimulates new designs of neural network architectures with inspiration from optimization algorithms and numerical differential equations. Given the popularity of deep modeling, there are still vast remaining challenges in the field, as well as opportunities, which we shall discuss at the end of this article. (Survey paper; 31 pages, 6 figures.)
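    The unrolling viewpoint is easy to make concrete. Below is a minimal sketch (ours, with hypothetical problem data) for sparse reconstruction min_x (1/2)||Ax - y||² + λ||x||₁: each loop iteration is one ISTA step, and in a learned variant such as LISTA the step size, thresholds, and matrices become trainable layer parameters.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def unrolled_ista(A, y, lam=0.1, n_layers=20):
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):              # each loop body = one "network layer"
        x = soft_threshold(x - (A.T @ (A @ x - y)) / L, lam / L)
    return x
```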

    Leveraging the Hankel norm approximation and block-AAA algorithms in reduced order modeling

    Large-scale linear, time-invariant (LTI) dynamical systems are widely used to characterize complicated physical phenomena. We propose a two-stage algorithm to reduce the order of a large-scale LTI system given samples of its transfer function, for a target degree k of the reduced system. In the first stage, a modified adaptive Antoulas--Anderson (AAA) algorithm is used to construct a degree-d rational approximation of the transfer function that corresponds to an intermediate system, which can be reduced in a numerically stable way in the second stage using ideas from the theory of Hankel norm approximation (HNA). We also study the numerical issues of Glover's HNA algorithm and provide a remedy for its numerical instabilities. A carefully computed rational approximation of degree d gives us a numerically stable algorithm for reducing an LTI system, which is more efficient than SVD-based algorithms and more accurate than moment-matching algorithms. (25 pages, 5 figures.)
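    For intuition, here is a simplified scalar version of the AAA core loop (the paper uses a modified and block variant; this sketch is ours). Given transfer-function samples F at points Z, each pass greedily adds the support point with the largest current error and recomputes the barycentric weights from the smallest right singular vector of a Loewner matrix.

```python
import numpy as np

def aaa(Z, F, degree):
    Z, F = np.asarray(Z, complex), np.asarray(F, complex)
    support, r = [], np.full(Z.shape, np.mean(F))     # start from a constant fit
    for _ in range(degree + 1):
        support.append(int(np.argmax(np.abs(F - r)))) # greedy support selection
        zs, fs = Z[support], F[support]
        mask = np.ones(len(Z), bool)
        mask[support] = False
        C = 1.0 / (Z[mask, None] - zs[None, :])       # Cauchy matrix
        L = F[mask, None] * C - C * fs[None, :]       # Loewner matrix
        w = np.linalg.svd(L)[2].conj().T[:, -1]       # weights: smallest sing. vector
        r = F.copy()                                  # interpolates at support points
        r[mask] = (C @ (w * fs)) / (C @ w)            # barycentric evaluation elsewhere
    return zs, fs, w
```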

    Large Scale Inverse Problems

    This book is the second volume of a three-volume series recording the "Radon Special Semester 2011 on Multiscale Simulation & Analysis in Energy and the Environment" that took place in Linz, Austria, October 3-7, 2011. This volume addresses the common ground in the mathematical and computational procedures required for large-scale inverse problems and data assimilation in forefront applications. The solution of inverse problems is fundamental to a wide variety of applications such as weather forecasting, medical tomography, and oil exploration. Regularisation techniques are needed to ensure solutions of sufficient quality to be useful and soundly theoretically based. This book addresses the common techniques required for all these applications and is thus truly interdisciplinary. This collection of survey articles focuses on the large inverse problems commonly arising in simulation and forecasting in the earth sciences.
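    As a one-function illustration of the regularisation theme (a generic textbook example, not tied to any chapter of the book), Tikhonov regularisation replaces an unstable least-squares solve min_x ||Ax - b||² with the penalized problem min_x ||Ax - b||² + λ²||x||², trading fidelity for stability against noise:

```python
import numpy as np

def tikhonov(A, b, lam):
    # solve the regularized normal equations (A^T A + lam^2 I) x = A^T b
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam**2 * np.eye(n), A.T @ b)
```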

    Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

    Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions. (232 pages.)
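    To make the TT format concrete, here is a minimal TT-SVD sketch (ours, not from the monograph): sweep over the modes, unfold, and truncate an SVD at each step, so that a d-way array is stored as d small core tensors.

```python
import numpy as np

def tt_svd(T, max_rank):
    """Decompose array T into TT cores of shape (r_prev, n_k, r_next)."""
    cores, r = [], 1
    rest = np.asarray(T, float)
    dims = rest.shape
    for n_k in dims[:-1]:
        M = rest.reshape(r * n_k, -1)                  # unfold current mode
        U, S, Vh = np.linalg.svd(M, full_matrices=False)
        r_new = min(max_rank, len(S))                  # rank truncation
        cores.append(U[:, :r_new].reshape(r, n_k, r_new))
        rest = S[:r_new, None] * Vh[:r_new]            # carry remainder forward
        r = r_new
    cores.append(rest.reshape(r, dims[-1], 1))         # final core
    return cores
```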

    Data-driven sub-grid model development for large eddy simulations of turbulence

    Turbulence modeling remains an active area of research due to its significant impact on a diverse set of challenges, such as those pertaining to the aerospace and geophysical communities. Researchers continue to search for modeling strategies that improve the representation of high-wavenumber content in practical computational fluid dynamics applications. The recent successes of machine learning in the physical sciences have motivated a number of studies into the modeling of turbulence from a data-driven point of view. In this research, we utilize physics-informed machine learning to reconstruct the effect of unresolved frequencies (i.e., small-scale turbulence) on grid-resolved flow variables obtained through large eddy simulation. In general, the successful development of any data-driven strategy relies on two phases: learning and a-posteriori deployment. The former requires the synthesis of labeled data from direct numerical simulations of the target phenomenon, whereas the latter requires the development of stability-preserving modifications instead of a direct deployment of learned predictions. These stability-preserving techniques may work through prediction modulation, where learning outputs are deployed via an intermediate statistical truncation, as sketched below. They may also work through model classifiers, where the traditional L_2-minimization strategy is replaced by a categorical cross-entropy error that flags the most stable model deployment at each point on the computational grid. In this thesis, we outline several investigations utilizing the aforementioned philosophies and conclude that sub-grid turbulence models built through machine learning are capable of recovering viable statistical trends in stabilized a-posteriori deployments for Kraichnan and Kolmogorov turbulence. They therefore represent a promising tool for generating closures that may be utilized in flows that belong to different configurations and have different sub-grid modeling requirements.
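    As a sketch of the prediction-modulation idea (one simple truncation rule, assumed for illustration rather than the thesis' exact scheme): if the learned closure predicts an eddy-viscosity field, clipping negative values removes backscatter so the deployed model is purely dissipative and numerically stable.

```python
import numpy as np

def modulated_deployment(nu_sgs_predicted):
    """nu_sgs_predicted: ML-predicted sub-grid eddy-viscosity field (array)."""
    # statistical truncation: clip negative eddy viscosities (backscatter)
    # to zero so the deployed closure only removes energy from resolved scales
    return np.maximum(nu_sgs_predicted, 0.0)
```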