1,812 research outputs found
Multilevel Artificial Neural Network Training for Spatially Correlated Learning
Multigrid modeling algorithms are a technique used to accelerate relaxation
models running on a hierarchy of similar graphlike structures. We introduce and
demonstrate a new method for training neural networks which uses multilevel
methods. Using an objective function derived from a graph-distance metric, we
perform orthogonally-constrained optimization to find optimal prolongation and
restriction maps between graphs. We compare and contrast several methods for
performing this numerical optimization, and additionally present some new
theoretical results on upper bounds of this type of objective function. Once
calculated, these optimal maps between graphs form the core of Multiscale
Artificial Neural Network (MsANN) training, a new procedure we present which
simultaneously trains a hierarchy of neural network models of varying spatial
resolution. Parameter information is passed between members of this hierarchy
according to standard coarsening and refinement schedules from the multiscale
modelling literature. In our machine learning experiments, these models are
able to learn faster than default training, achieving a comparable level of
error in an order of magnitude fewer training examples.
Comment: Manuscript (24 pages) and Supplementary Material (4 pages). Updated
January 2019 to reflect new formulation of MsANN structure and new training
procedure
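As a rough illustration of the prolongation and restriction maps mentioned in the abstract above, the following Python sketch uses a fixed pairwise map with orthonormal columns, so that restriction is simply its transpose. This is an assumption made for illustration only: the paper obtains its maps by optimization, and the names `prolong` and `restrict` are hypothetical.

```python
import math

# Scaling that makes the columns of the (implicit) prolongation matrix P orthonormal.
SCALE = 1.0 / math.sqrt(2.0)

def prolong(coarse):
    """Map a coarse vector to a fine one of twice the length (P applied to coarse)."""
    fine = []
    for c in coarse:
        fine.extend([SCALE * c, SCALE * c])
    return fine

def restrict(fine):
    """Map a fine vector of even length back to the coarse level (P transposed applied to fine)."""
    return [SCALE * (fine[i] + fine[i + 1]) for i in range(0, len(fine), 2)]

# Hypothetical coarse-level parameters.
coarse_params = [1.0, -2.0, 0.5]
round_trip = restrict(prolong(coarse_params))
```

Because the columns of this map are orthonormal, restricting a prolonged vector recovers the coarse parameters up to floating-point rounding.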
A multilevel approach for nonnegative matrix factorization
Nonnegative Matrix Factorization (NMF) is the problem of approximating a nonnegative matrix with the product of two low-rank nonnegative matrices and has been shown to be particularly useful in many applications, e.g., text mining, image processing, and computational biology. In this paper, we explain how algorithms for NMF can be embedded into the framework of multilevel methods in order to accelerate their convergence. This technique can be applied in situations where data admit a good approximate representation in a lower dimensional space through linear transformations preserving nonnegativity. A simple multilevel strategy is described and is experimentally shown to significantly speed up three popular NMF algorithms (alternating nonnegative least squares, multiplicative updates and hierarchical alternating least squares) on several standard image datasets.
Keywords: nonnegative matrix factorization, algorithms, multigrid and multilevel methods, image processing
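The multilevel idea in the NMF abstract above can be sketched in plain Python as a two-level scheme: run multiplicative updates on a coarsened matrix, then reuse the result to initialize the fine-level problem. The coarsening used here (averaging adjacent row pairs) and all function names are assumptions chosen for illustration, not the paper's actual strategy. The sketch assumes an even number of rows and strictly positive data.

```python
import random

def matmul(A, B):
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(X, W, H):
    """Frobenius norm of X - W H."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry)) ** 0.5

def mu_step(X, W, H, eps=1e-12):
    """One multiplicative update of H, then of W (Frobenius objective)."""
    Wt = transpose(W)
    num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
    H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
         for rh, rn, rd in zip(H, num, den)]
    Ht = transpose(H)
    num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
    W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
         for rw, rn, rd in zip(W, num, den)]
    return W, H

def restrict_rows(X):
    """Average adjacent row pairs: a linear map that preserves nonnegativity."""
    return [[(a + b) / 2 for a, b in zip(X[i], X[i + 1])]
            for i in range(0, len(X), 2)]

def prolong_rows(W):
    """Duplicate each coarse row to return to the fine row count."""
    return [list(row) for row in W for _ in range(2)]

def multilevel_nmf(X, rank, coarse_iters=50, fine_iters=50, seed=0):
    rng = random.Random(seed)
    Xc = restrict_rows(X)
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in Xc]
    H = [[rng.random() + 0.1 for _ in X[0]] for _ in range(rank)]
    for _ in range(coarse_iters):      # cheap iterations on the coarse problem
        W, H = mu_step(Xc, W, H)
    W = prolong_rows(W)                # coarse factor initializes the fine level
    for _ in range(fine_iters):
        W, H = mu_step(X, W, H)
    return W, H

# Small positive 8 x 6 test matrix of rank 2 (made-up data).
A = [[1.0 + i, 8.0 - i] for i in range(8)]
B = [[1.0, 2.0, 3.0, 1.0, 2.0, 3.0], [3.0, 1.0, 2.0, 3.0, 1.0, 2.0]]
X = matmul(A, B)
W, H = multilevel_nmf(X, rank=2)
```

The coarse iterations work on half as many rows, which is where a saving would come from on larger data. Whether the warm start actually helps depends on how well the coarsened data represent the original.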
Deep Residual Learning for Image Recognition
Deeper neural networks are more difficult to train. We present a residual
learning framework to ease the training of networks that are substantially
deeper than those used previously. We explicitly reformulate the layers as
learning residual functions with reference to the layer inputs, instead of
learning unreferenced functions. We provide comprehensive empirical evidence
showing that these residual networks are easier to optimize, and can gain
accuracy from considerably increased depth. On the ImageNet dataset we evaluate
residual nets with a depth of up to 152 layers---8x deeper than VGG nets but
still having lower complexity. An ensemble of these residual nets achieves
3.57% error on the ImageNet test set. This result won 1st place on the
ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100
and 1000 layers.
The depth of representations is of central importance for many visual
recognition tasks. Solely due to our extremely deep representations, we obtain
a 28% relative improvement on the COCO object detection dataset. Deep residual
nets are foundations of our submissions to the ILSVRC & COCO 2015 competitions,
where we also won 1st place on the tasks of ImageNet detection, ImageNet
localization, COCO detection, and COCO segmentation.
Comment: Tech report
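The residual reformulation described in the abstract above, where a block learns a function F(x) that is added to its own input, can be sketched in plain Python. This is a toy fully connected version with hypothetical names. It omits convolutions, normalization, and the activation that real residual networks apply after the addition.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def linear(weights, v):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]

def residual_block(x, w1, w2):
    """Return x + F(x), where F is linear -> ReLU -> linear."""
    fx = linear(w2, relu(linear(w1, x)))
    return [a + b for a, b in zip(x, fx)]

# With all-zero weights the residual branch vanishes and the block is the identity map.
x = [1.0, -2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
identity_out = residual_block(x, zeros, zeros)
```

The zero-weight case shows why such blocks are plausibly easier to optimize: representing the identity only requires driving the residual branch toward zero.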
Fast algorithms for biophysically-constrained inverse problems in medical imaging
We present algorithms and software for parameter estimation for forward and inverse tumor growth problems and diffeomorphic image registration. Our methods target the following scenarios: automatic image registration of healthy images to tumor bearing medical images and parameter estimation/calibration of tumor models. This thesis focuses on robust and scalable algorithms for these problems.
Although the proposed framework applies to many problems in oncology, we focus on primary brain tumors and in particular low and high-grade gliomas. For the tumor model, the main quantity of interest is the extent of tumor infiltration into the brain, beyond what is visible in imaging.
The inverse tumor problem assumes that we have patient images at two (or more) well-separated times so that we can observe the tumor growth. Also, the inverse problem requires that the two images are segmented. But in a clinical setting such information is usually not available. In a typical case, we just have multimodal magnetic resonance images with no segmentation. We address this lack of information by solving a coupled inverse registration and tumor problem. The role of image registration is to find a plausible mapping between the patient's tumor-bearing image and a normal brain (atlas), with known segmentation. Solving this coupled inverse problem has a prohibitive computational cost, especially in 3D. To address this challenge we have developed novel schemes, scaled up to 200K cores.
Our main contribution is the design and implementation of fast solvers for these problems. We also study the performance of the tumor parameter estimation and registration solvers and their algorithmic scalability. In particular, we introduce the following novel algorithms: an adjoint formulation for tumor-growth problems with/without mass-effect; the first parallel 3D Newton-Krylov method for large diffeomorphic image registration; a novel parallel semi-Lagrangian algorithm for solving advection equations in image registration and its parallel implementation on shared and distributed memory architectures; and Accelerated FFT (AccFFT), an open-source parallel FFT library for CPUs and GPUs scaled up to 131,000 cores with optimized kernels for computing spectral operators.
The scientific outcomes of this thesis have appeared in the proceedings of three ACM/IEEE SCxy conferences (two best student paper finalists and one ACM SRC gold medal), two journal papers, two papers in review, four papers in preparation (coupling, mass effect, segmentation, and multi-species tumor model), and seven conference presentations.
Computational Science, Engineering, and Mathematics
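To make the semi-Lagrangian idea named in the thesis abstract above concrete, here is a minimal serial sketch in Python for the 1D advection equation u_t + v u_x = 0 with constant velocity on a periodic grid. The linear interpolation, the periodic boundary, and the function name are assumptions for illustration. The thesis describes a parallel 3D algorithm, which this does not attempt to reproduce.

```python
import math

def semi_lagrangian_step(u, velocity, dt, dx):
    """Advance u_t + velocity * u_x = 0 by one time step on a periodic 1D grid."""
    n = len(u)
    out = []
    for i in range(n):
        # Trace the characteristic backwards to its departure point (in grid units).
        depart = (i - velocity * dt / dx) % n
        j = int(math.floor(depart))
        frac = depart - j
        # Linearly interpolate between the two neighbouring grid values.
        out.append((1.0 - frac) * u[j % n] + frac * u[(j + 1) % n])
    return out

# Made-up initial data: moving exactly one cell per step gives a cyclic shift.
u0 = [0.0, 1.0, 2.0, 3.0]
shifted = semi_lagrangian_step(u0, velocity=1.0, dt=1.0, dx=1.0)
```

Because each value is interpolated at its departure point rather than differenced on the grid, the step remains usable when the time step is larger than an explicit scheme would normally allow.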
Research and Education in Computational Science and Engineering
Over the past two decades the field of computational science and engineering
(CSE) has penetrated both basic and applied research in academia, industry, and
laboratories to advance discovery, optimize systems, support decision-makers,
and educate the scientific and engineering workforce. Informed by centuries of
theory and experiment, CSE performs computational experiments to answer
questions that neither theory nor experiment alone is equipped to answer. CSE
provides scientists and engineers of all persuasions with algorithmic
inventions and software systems that transcend disciplines and scales. Carried
on a wave of digital technology, CSE brings the power of parallelism to bear on
troves of data. Mathematics-based advanced computing has become a prevalent
means of discovery and innovation in essentially all areas of science,
engineering, technology, and society; and the CSE community is at the core of
this transformation. However, a combination of disruptive
developments---including the architectural complexity of extreme-scale
computing, the data revolution that engulfs the planet, and the specialization
required to follow the applications to new frontiers---is redefining the scope
and reach of the CSE endeavor. This report describes the rapid expansion of CSE
and the challenges to sustaining its bold advances. The report also presents
strategies and directions for CSE research and education for the next decade.
Comment: Major revision, to appear in SIAM Review