11,596 research outputs found
On the Computational Cost and Complexity of Stochastic Inverse Solvers
The goal of this paper is to provide a starting point for investigations into a mainly underdeveloped area of research regarding the computational cost analysis of complex stochastic strategies for solving parametric inverse problems. This area has two main components: solving global optimization problems and solving forward problems (to evaluate the misfit function that we try to minimize). For the first component, we pay particular attention to genetic algorithms with heuristics and to multi-deme algorithms that can be modeled as ergodic Markov chains. We recall a simple method for evaluating the first hitting time for the single-deme algorithm and we extend it to the case of HGS, a multi-deme hierarchic strategy. We focus on the case in which at least the demes in the leaves are well tuned. Finally, we also express the problems of finding local and global optima in terms of a classic complexity theory. We formulate the natural result that finding a local optimum of a function is an NP-complete task, and we argue that finding a global optimum is a much harder, DP-complete, task. Furthermore, we argue that finding all global optima is, possibly, even harder (#P-hard) task. Regarding the second component of solving parametric inverse problems (i.e., regarding the forward problem solvers), we discuss the computational cost of hp-adaptive Finite Element solvers and their rates of convergence with respect to the increasing number of degrees of freedom. The presented results provide a useful taxonomy of problems and methods of studying the computational cost and complexity of various strategies for solving inverse parametric problems. Yet, we stress that our goal was not to deliver detailed evaluations for particular algorithms applied to particular inverse problems, but rather to try to identify possible ways of obtaining such results
Research and Education in Computational Science and Engineering
Over the past two decades the field of computational science and engineering
(CSE) has penetrated both basic and applied research in academia, industry, and
laboratories to advance discovery, optimize systems, support decision-makers,
and educate the scientific and engineering workforce. Informed by centuries of
theory and experiment, CSE performs computational experiments to answer
questions that neither theory nor experiment alone is equipped to answer. CSE
provides scientists and engineers of all persuasions with algorithmic
inventions and software systems that transcend disciplines and scales. Carried
on a wave of digital technology, CSE brings the power of parallelism to bear on
troves of data. Mathematics-based advanced computing has become a prevalent
means of discovery and innovation in essentially all areas of science,
engineering, technology, and society; and the CSE community is at the core of
this transformation. However, a combination of disruptive
developments---including the architectural complexity of extreme-scale
computing, the data revolution that engulfs the planet, and the specialization
required to follow the applications to new frontiers---is redefining the scope
and reach of the CSE endeavor. This report describes the rapid expansion of CSE
and the challenges to sustaining its bold advances. The report also presents
strategies and directions for CSE research and education for the next decade.Comment: Major revision, to appear in SIAM Revie
Small steps and giant leaps: Minimal Newton solvers for Deep Learning
We propose a fast second-order method that can be used as a drop-in
replacement for current deep learning solvers. Compared to stochastic gradient
descent (SGD), it only requires two additional forward-mode automatic
differentiation operations per iteration, which has a computational cost
comparable to two standard forward passes and is easy to implement. Our method
addresses long-standing issues with current second-order solvers, which invert
an approximate Hessian matrix every iteration exactly or by conjugate-gradient
methods, a procedure that is both costly and sensitive to noise. Instead, we
propose to keep a single estimate of the gradient projected by the inverse
Hessian matrix, and update it once per iteration. This estimate has the same
size and is similar to the momentum variable that is commonly used in SGD. No
estimate of the Hessian is maintained. We first validate our method, called
CurveBall, on small problems with known closed-form solutions (noisy Rosenbrock
function and degenerate 2-layer linear networks), where current deep learning
solvers seem to struggle. We then train several large models on CIFAR and
ImageNet, including ResNet and VGG-f networks, where we demonstrate faster
convergence with no hyperparameter tuning. Code is available
A multi-resolution, non-parametric, Bayesian framework for identification of spatially-varying model parameters
This paper proposes a hierarchical, multi-resolution framework for the
identification of model parameters and their spatially variability from noisy
measurements of the response or output. Such parameters are frequently
encountered in PDE-based models and correspond to quantities such as density or
pressure fields, elasto-plastic moduli and internal variables in solid
mechanics, conductivity fields in heat diffusion problems, permeability fields
in fluid flow through porous media etc. The proposed model has all the
advantages of traditional Bayesian formulations such as the ability to produce
measures of confidence for the inferences made and providing not only
predictive estimates but also quantitative measures of the predictive
uncertainty. In contrast to existing approaches it utilizes a parsimonious,
non-parametric formulation that favors sparse representations and whose
complexity can be determined from the data. The proposed framework in
non-intrusive and makes use of a sequence of forward solvers operating at
various resolutions. As a result, inexpensive, coarse solvers are used to
identify the most salient features of the unknown field(s) which are
subsequently enriched by invoking solvers operating at finer resolutions. This
leads to significant computational savings particularly in problems involving
computationally demanding forward models but also improvements in accuracy. It
is based on a novel, adaptive scheme based on Sequential Monte Carlo sampling
which is embarrassingly parallelizable and circumvents issues with slow mixing
encountered in Markov Chain Monte Carlo schemes
Tensor Computation: A New Framework for High-Dimensional Problems in EDA
Many critical EDA problems suffer from the curse of dimensionality, i.e. the
very fast-scaling computational burden produced by large number of parameters
and/or unknown variables. This phenomenon may be caused by multiple spatial or
temporal factors (e.g. 3-D field solvers discretizations and multi-rate circuit
simulation), nonlinearity of devices and circuits, large number of design or
optimization parameters (e.g. full-chip routing/placement and circuit sizing),
or extensive process variations (e.g. variability/reliability analysis and
design for manufacturability). The computational challenges generated by such
high dimensional problems are generally hard to handle efficiently with
traditional EDA core algorithms that are based on matrix and vector
computation. This paper presents "tensor computation" as an alternative general
framework for the development of efficient EDA algorithms and tools. A tensor
is a high-dimensional generalization of a matrix and a vector, and is a natural
choice for both storing and solving efficiently high-dimensional EDA problems.
This paper gives a basic tutorial on tensors, demonstrates some recent examples
of EDA applications (e.g., nonlinear circuit modeling and high-dimensional
uncertainty quantification), and suggests further open EDA problems where the
use of tensor computation could be of advantage.Comment: 14 figures. Accepted by IEEE Trans. CAD of Integrated Circuits and
System
- …