From Symmetry to Geometry: Tractable Nonconvex Problems
As science and engineering have become increasingly data-driven, the role of
optimization has expanded to touch almost every stage of the data analysis
pipeline, from signal and data acquisition to modeling and prediction. The
optimization problems encountered in practice are often nonconvex. While
challenges vary from problem to problem, one common source of nonconvexity is
nonlinearity in the data or measurement model. Nonlinear models often exhibit
symmetries, creating complicated, nonconvex objective landscapes, with multiple
equivalent solutions. Nevertheless, simple methods (e.g., gradient descent)
often perform surprisingly well in practice.
The goal of this survey is to highlight a class of tractable nonconvex
problems, which can be understood through the lens of symmetries. These
problems exhibit a characteristic geometric structure: local minimizers are
symmetric copies of a single "ground truth" solution, while other critical
points occur at balanced superpositions of symmetric copies of the ground
truth, and exhibit negative curvature in directions that break the symmetry.
This structure enables efficient methods to obtain global minimizers. We
discuss examples of this phenomenon arising from a wide range of problems in
imaging, signal processing, and data analysis. We highlight the key role of
symmetry in shaping the objective landscape and discuss the different roles of
rotational and discrete symmetries. This area is rich with observed phenomena
and open problems; we close by highlighting directions for future research.
Comment: review paper submitted to SIAM Review, 34 pages, 10 figures
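The landscape geometry described above can be illustrated with a toy example (our own, not from the paper): the one-dimensional objective f(x) = (x^2 - 1)^2 has a sign symmetry f(x) = f(-x), two symmetric global minimizers at x = ±1, and a balanced critical point at x = 0 with negative curvature (f''(0) = -4), so plain gradient descent from any generic initialization reaches a global minimizer.

```python
def f(x):
    # Toy symmetric objective: f(x) = f(-x); global minima at x = +1 and
    # x = -1 are symmetric copies of one "ground truth", and the balanced
    # critical point x = 0 has negative curvature (f''(0) = -4), i.e.
    # a descent direction that breaks the symmetry.
    return (x**2 - 1.0)**2

def grad(x):
    return 4.0 * x * (x**2 - 1.0)

x = 0.3                      # generic initialization breaks the tie
for _ in range(200):
    x -= 0.05 * grad(x)      # plain gradient descent

print(round(x, 6), round(f(x), 12))   # → 1.0 0.0
```

Starting from x = -0.3 instead would converge to the other symmetric copy, -1; only the measure-zero initialization x = 0 stays stuck at the balanced critical point.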
Relating Spontaneous Activity and Cognitive States via NeuroDynamic Modeling
Stimulus-free brain dynamics form the basis of current knowledge concerning functional integration and segregation within the human brain. These relationships are typically described in terms of resting-state brain networks, i.e., regions which spontaneously coactivate. However, despite the interest in the anatomical mechanisms and biobehavioral correlates of stimulus-free brain dynamics, little is known regarding the relation between spontaneous brain dynamics and task-evoked activity. In particular, no computational framework has previously been proposed to unite spontaneous and task dynamics under a single, data-driven model. Model development in this domain will provide new insight regarding the mechanisms by which exogenous stimuli and intrinsic neural circuitry interact to shape human cognition. The current work bridges this gap by deriving and validating a new technique, termed Mesoscale Individualized NeuroDynamic (MINDy) modeling, to estimate large-scale neural population models for individual human subjects using resting-state fMRI. A combination of ground-truth simulations and test-retest data is used to demonstrate that the approach is robust to various forms of noise, motion, and data processing choices. The MINDy formalism is then extended to simultaneously estimate neural population models and the neurovascular coupling which gives rise to BOLD fMRI. In doing so, I develop and validate a new optimization framework for simultaneously estimating system states and parameters. Lastly, MINDy models derived from resting-state data are used to predict task-based activity and remove the effects of intrinsic dynamics. Removing the MINDy model predictions from task fMRI enables separation of exogenously-driven components of activity from their indirect consequences (the model predictions). Results demonstrate that removing the predicted intrinsic dynamics improves detection of event-triggered and sustained responses across four cognitive tasks.
Together, these findings validate the MINDy framework and demonstrate that MINDy models predict brain dynamics across contexts. These dynamics contribute to the variance of task-evoked brain activity between subjects. Removing the influence of intrinsic dynamics improves the estimation of task effects.
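As a rough illustration of the system-identification idea behind such modeling (a minimal, noiseless sketch with hypothetical model choices, not the actual MINDy estimator), the connectivity of a neural population model dx/dt = W·tanh(x) − d·x can be recovered from sampled trajectories by linear least squares, because once the nonlinearity is evaluated on observed states the unknown weights enter linearly:

```python
import numpy as np

rng = np.random.default_rng(0)
n, dt = 5, 0.05

# Ground-truth recurrent weights and decay: stand-ins for the
# connectivity/decay parameters a MINDy-style model would estimate.
W_true = 0.4 * rng.standard_normal((n, n))
decay = 0.8   # assumed known in this sketch

def euler_step(x):
    # One Euler step of the population model dx/dt = W tanh(x) - decay*x
    # (noiseless here, purely for clarity).
    return x + dt * (W_true @ np.tanh(x) - decay * x)

# Collect many short trajectories from random initial states.
states, nexts = [], []
for _ in range(200):
    x = rng.standard_normal(n)
    for _ in range(20):
        y = euler_step(x)
        states.append(x)
        nexts.append(y)
        x = y

X, Y = np.asarray(states), np.asarray(nexts)

# Linear least squares on finite differences:
# (y - x)/dt + decay*x = W tanh(x)
target = (Y - X) / dt + decay * X
W_hat = np.linalg.lstsq(np.tanh(X), target, rcond=None)[0].T

print(np.max(np.abs(W_hat - W_true)))   # ≈ 0 (exact up to round-off)
```

The actual framework additionally handles measurement noise, regularization, subject-level parameterization, and the neurovascular coupling; none of that is modeled here.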
A Review on Deep Learning in Medical Image Reconstruction
Medical imaging is crucial in modern clinics to guide the diagnosis and
treatment of diseases. Medical image reconstruction is one of the most
fundamental and important components of medical imaging, whose major objective
is to acquire high-quality medical images for clinical usage at the minimal
cost and risk to the patients. Mathematical models in medical image
reconstruction or, more generally, image restoration in computer vision, have
been playing a prominent role. Earlier mathematical models are mostly designed
by human knowledge or hypothesis on the image to be reconstructed, and we shall
call these models handcrafted models. Later, handcrafted plus data-driven
modeling started to emerge which still mostly relies on human designs, while
part of the model is learned from the observed data. More recently, as more
data and computation resources are made available, deep learning based models
(or deep models) pushed the data-driven modeling to the extreme where the
models are mostly based on learning with minimal human designs. Both
handcrafted and data-driven modeling have their own advantages and
disadvantages. One of the major research trends in medical imaging is to
combine handcrafted modeling with deep modeling so that we can enjoy benefits
from both approaches. The major part of this article is to provide a conceptual
review of some recent works on deep modeling from the unrolling dynamics
viewpoint. This viewpoint stimulates new designs of neural network
architectures with inspirations from optimization algorithms and numerical
differential equations. Given the popularity of deep modeling, there are still
vast remaining challenges in the field, as well as opportunities which we shall
discuss at the end of this article.
Comment: 31 pages, 6 figures. Survey paper
Leveraging the Hankel norm approximation and block-AAA algorithms in reduced order modeling
Large-scale linear, time-invariant (LTI) dynamical systems are widely used to
characterize complicated physical phenomena. We propose a two-stage algorithm
to reduce the order of a large-scale LTI system given samples of its transfer
function for a target degree of the reduced system. In the first stage, a
modified adaptive Antoulas--Anderson (AAA) algorithm is used to construct a
rational approximation of the transfer function that corresponds to
an intermediate system, which can be numerically stably reduced in the second
stage using ideas from the theory on Hankel norm approximation (HNA). We also
study the numerical issues of Glover's HNA algorithm and provide a remedy for
its numerical instabilities. A carefully computed rational approximation
gives us a numerically stable algorithm for reducing an LTI system,
which is more efficient than SVD-based algorithms and more accurate than
moment-matching algorithms.
Comment: 25 pages, 5 figures
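A minimal version of the (unmodified, scalar) AAA idea on toy samples follows; it is a sketch under our own assumptions, not the authors' modified block algorithm or their Hankel-norm stage. At each step the support point with the largest residual is added, and the barycentric weights are taken from the smallest right singular vector of the Loewner matrix:

```python
import numpy as np

def aaa(Z, F, degree):
    # Minimal AAA sketch: greedy support-point selection plus a
    # Loewner-matrix SVD for the barycentric weights.
    Z, F = np.asarray(Z, dtype=complex), np.asarray(F, dtype=complex)
    R = np.full_like(F, F.mean())
    zj, fj = [], []
    for _ in range(degree + 1):
        k = int(np.argmax(np.abs(F - R)))            # greedy support point
        zj.append(Z[k])
        fj.append(F[k])
        mask = ~np.isin(Z, zj)
        C = 1.0 / (Z[mask, None] - np.array(zj))     # Cauchy matrix
        L = F[mask, None] * C - C * np.array(fj)     # Loewner matrix
        w = np.linalg.svd(L)[2][-1].conj()           # barycentric weights
        R = F.copy()
        R[mask] = (C @ (w * fj)) / (C @ w)           # barycentric evaluation
    return np.array(zj), np.array(fj), w

def reval(z, zj, fj, w):
    # Evaluate the barycentric rational approximant at new points.
    C = 1.0 / (z[:, None] - zj)
    return (C @ (w * fj)) / (C @ w)

# Samples of a "transfer-function-like" response on [-1, 1]
# (toy data, not from an actual LTI system).
Z = np.linspace(-1.0, 1.0, 500)
F = np.exp(Z) / (1.0 + 25.0 * Z**2)

zj, fj, w = aaa(Z, F, degree=10)
Zt = Z[~np.isin(Z, zj)]
err = np.max(np.abs(reval(Zt, zj, fj, w) - np.exp(Zt) / (1.0 + 25.0 * Zt**2)))
print(err)   # small approximation error
```

The two-stage method in the paper then treats such a rational approximant as an intermediate LTI system and reduces it via Hankel norm approximation; that stage is not shown here.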
Large Scale Inverse Problems
This book is the second volume of a three-volume series recording the "Radon Special Semester 2011 on Multiscale Simulation & Analysis in Energy and the Environment" that took place in Linz, Austria, October 3-7, 2011. This volume addresses the common ground in the mathematical and computational procedures required for large-scale inverse problems and data assimilation in forefront applications. The solution of inverse problems is fundamental to a wide variety of applications such as weather forecasting, medical tomography, and oil exploration. Regularisation techniques are needed to ensure that solutions are of sufficient quality to be useful and are soundly based in theory. This book addresses the common techniques required for all these applications and is thus truly interdisciplinary. This collection of survey articles focuses on the large inverse problems commonly arising in simulation and forecasting in the earth sciences.
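As a generic illustration of why regularisation matters for such inverse problems (our own sketch, not an example from the book), consider recovering a function from its noisy antiderivative: naive inversion amplifies the noise, while Tikhonov regularisation damps it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
dt = 1.0 / n
t = np.linspace(dt, 1.0, n)

# Forward operator: cumulative integration, a classic mildly
# ill-posed problem (its inverse is numerical differentiation).
A = dt * np.tril(np.ones((n, n)))

x_true = np.sin(2 * np.pi * t)
y = A @ x_true + 1e-3 * rng.standard_normal(n)

# Naive inversion differentiates the noise and amplifies it;
# Tikhonov regularisation  min ||Ax - y||^2 + alpha ||x||^2  damps it.
x_naive = np.linalg.solve(A, y)
alpha = 1e-4
x_tik = np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)

err_naive = np.linalg.norm(x_naive - x_true)
err_tik = np.linalg.norm(x_tik - x_true)
print(err_tik < err_naive)   # True: regularisation reduces the error
```

Choosing the regularisation parameter alpha in a theoretically sound way (discrepancy principle, cross-validation, Bayesian priors) is exactly the kind of common technique the surveyed applications share.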
Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and
their operations presented in Part 1. It focuses on tensor network models for
super-compressed higher-order representation of data/parameters and related
cost functions, while providing an outline of their applications in machine
learning and data analytics. A particular emphasis is on the tensor train (TT)
and Hierarchical Tucker (HT) decompositions, and their physically meaningful
interpretations which reflect the scalability of the tensor network approach.
Through a graphical approach, we also elucidate how, by virtue of the
underlying low-rank tensor approximations and sophisticated contractions of
core tensors, tensor networks have the ability to perform distributed
computations on otherwise prohibitively large volumes of data/parameters,
thereby alleviating or even eliminating the curse of dimensionality. The
usefulness of this concept is illustrated over a number of applied areas,
including generalized regression and classification (support tensor machines,
canonical correlation analysis, higher order partial least squares),
generalized eigenvalue decomposition, Riemannian optimization, and in the
optimization of deep neural networks. Part 1 and Part 2 of this work can be
used either as stand-alone separate texts, or indeed as a conjoint
comprehensive review of the exciting field of low-rank tensor networks and
tensor decompositions.
Comment: 232 pages
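The tensor train format emphasized above can be sketched with the standard TT-SVD procedure (a minimal illustration, not code from the monograph): sweep the modes left to right, reshaping and truncating with an SVD to obtain cores G_k of shape (r_{k-1}, n_k, r_k).

```python
import numpy as np

def tt_svd(T, max_rank):
    # TT-SVD sketch: sequential reshape + truncated SVD yields
    # tensor-train cores of shape (r_{k-1}, n_k, r_k).
    dims = T.shape
    cores, r, M = [], 1, T.reshape(dims[0], -1)
    for n in dims[:-1]:
        M = M.reshape(r * n, -1)
        U, s, Vh = np.linalg.svd(M, full_matrices=False)
        rk = min(max_rank, len(s))
        cores.append(U[:, :rk].reshape(r, n, rk))
        M = s[:rk, None] * Vh[:rk]
        r = rk
    cores.append(M.reshape(r, dims[-1], 1))
    return cores

def tt_full(cores):
    # Contract the train back into a dense tensor.
    out = cores[0]
    for G in cores[1:]:
        out = np.tensordot(out, G, axes=([out.ndim - 1], [0]))
    return out[0, ..., 0]   # drop boundary ranks r_0 = r_d = 1

rng = np.random.default_rng(0)
# A 6 x 6 x 6 tensor that is a sum of two rank-one terms, so its
# TT ranks are at most 2 and rank-2 cores represent it exactly.
a = rng.standard_normal((3, 6))
b = rng.standard_normal((3, 6))
T = np.einsum('i,j,k->ijk', *a) + np.einsum('i,j,k->ijk', *b)

cores = tt_svd(T, max_rank=2)
print(np.linalg.norm(tt_full(cores) - T))   # ≈ 0 up to round-off
```

The storage drops from the exponential prod(n_k) of the dense tensor to the linear sum of r_{k-1}*n_k*r_k over the cores, which is the sense in which the format alleviates the curse of dimensionality.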
Data-driven sub-grid model development for large eddy simulations of turbulence
Turbulence modeling remains an active area of research due to its significant impact on a diverse set of challenges, such as those pertaining to the aerospace and geophysical communities. Researchers continue to search for modeling strategies that improve the representation of high-wavenumber content in practical computational fluid dynamics applications. The recent successes of machine learning in the physical sciences have motivated a number of studies into the modeling of turbulence from a data-driven point of view. In this research, we utilize physics-informed machine learning to reconstruct the effect of unresolved frequencies (i.e., small-scale turbulence) on grid-resolved flow variables obtained through large eddy simulation. In general, it is seen that the successful development of any data-driven strategy relies on two phases: learning and a-posteriori deployment. The former requires the synthesis of labeled data from direct numerical simulations of our target phenomenon, whereas the latter requires the development of stability-preserving modifications instead of a direct deployment of learning predictions. These stability-preserving techniques may work through prediction modulation, where learning outputs are deployed via an intermediate statistical truncation. They may also work through the utilization of model classifiers, where the traditional L2-minimization strategy is set aside in favor of a categorical cross-entropy error which flags the most stable model for deployment at each point on the computational grid. In this thesis, we outline several investigations utilizing the aforementioned philosophies and come to the conclusion that sub-grid turbulence models built through the utilization of machine learning are capable of recovering viable statistical trends in stabilized a-posteriori deployments for Kraichnan and Kolmogorov turbulence.
Therefore, they represent a promising tool for the generation of closures that may be utilized in flows of different configurations with different sub-grid modeling requirements.
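The "prediction modulation" deployment strategy described above can be sketched as a simple post-processing of a learned model's outputs (synthetic values and names, not the thesis code): predictions that would inject energy into the resolved scales, i.e. negative eddy viscosities, are truncated before a-posteriori deployment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a learned sub-grid model's pointwise eddy-viscosity
# predictions over the LES grid (synthetic, purely illustrative).
nu_t_pred = rng.normal(loc=0.002, scale=0.003, size=(64, 64))

def modulate(nu_t):
    # "Prediction modulation": deploy the learning outputs through a
    # statistical truncation. Negative eddy viscosity corresponds to
    # backscatter (energy transfer to the resolved scales), which can
    # destabilise an a-posteriori LES run, so it is clipped to zero.
    return np.maximum(nu_t, 0.0)

nu_t = modulate(nu_t_pred)
frac_clipped = np.mean(nu_t_pred < 0.0)
print(nu_t.min() >= 0.0)   # True: no destabilising backscatter remains
```

The classifier alternative described in the abstract would replace this clipping with a per-point argmax over candidate closure models, trained against a categorical cross-entropy loss rather than an L2 regression objective.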