Extracting Vessel Structure From 3D Image Data
This thesis is focused on extracting the structure of vessels from 3D cardiac images. In many biomedical applications it is important to segment the vessels while preserving their anatomically correct topological structure; that is, the final result should form a tree. There are many technical challenges in this image analysis problem: noise, outliers, and partial volume effects. In particular, standard segmentation methods are known to have problems with extracting thin structures and with enforcing topological constraints. These issues explain why vessel segmentation remains an unsolved problem despite years of research.
Our new efforts combine recent advances in optimization-based methods for image analysis with state-of-the-art vessel filtering techniques. We apply multiple vessel enhancement filters to the raw 3D data in order to reduce ring artifacts as well as noise. After that, we test two different methods for extracting the structure of vessel centrelines. First, we use a data thinning technique inspired by the Canny edge detector. Second, we apply a recent optimization-based line fitting algorithm to represent the structure of the centrelines as a piecewise-smooth collection of line intervals. Finally, we enforce a tree structure using a minimum spanning tree algorithm.
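The final step above, enforcing a tree topology, can be sketched with standard tools: a minimum spanning tree over the distance graph of candidate centreline points keeps exactly n-1 edges and so guarantees a cycle-free result. The coordinates below are illustrative, not the thesis data:

```python
# Toy sketch: enforce a tree topology on candidate centreline points
# via a minimum spanning tree (illustrative coordinates only).
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import minimum_spanning_tree

# Candidate centreline points (e.g. from thinning or line fitting)
points = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.1, 0.0],
                   [2.0, 0.0, 0.1],
                   [1.0, 1.0, 0.0],
                   [1.1, 2.0, 0.0]])

# Fully connected graph weighted by Euclidean distance
dist = cdist(points, points)

# The MST keeps n-1 edges, so the result is guaranteed to be a tree
mst = minimum_spanning_tree(dist)
edges = np.transpose(mst.nonzero())
print(len(edges))  # 4 edges for 5 points: a tree
```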
Computational and numerical aspects of full waveform seismic inversion
Full-waveform inversion (FWI) is a nonlinear optimisation procedure, seeking to match synthetically-generated seismograms with those observed in field data by iteratively updating a model of the subsurface seismic parameters, typically compressional wave (P-wave) velocity.
Advances in high-performance computing have made FWI of 3-dimensional models feasible, but the low sensitivity of the objective function to deeper, low-wavenumber components of velocity makes these difficult to recover using FWI relative to more traditional, less automated, techniques.
While the use of inadequate physics during the synthetic modelling stage is a contributing factor, I propose that this weakness is substantially one of ill-conditioning, and that efforts to remedy it should focus on the development of both more efficient seismic modelling techniques, and more sophisticated preconditioners for the optimisation iterations. I demonstrate that the problem of poor low-wavenumber velocity recovery can be reproduced in an analogous one-dimensional inversion problem, and that in this case it can be remedied by making full use of the available curvature information, in the form of the Hessian matrix.
In two or three dimensions, this curvature information is prohibitively expensive to obtain and store as part of an inversion procedure. I obtain the complete Hessian matrices for a realistically-sized, two-dimensional, towed-streamer inversion problem at several stages during the inversion and link properties of these matrices to the behaviour of the inversion. Based on these observations, I propose a method for approximating the action of the Hessian and suggest it as a path forward for more sophisticated preconditioning of the inversion process.
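The benefit of curvature information can be illustrated on a toy quadratic misfit with one poorly constrained direction, loosely analogous to the weak sensitivity to low-wavenumber velocity components discussed above; the two-parameter model and the eigenvalues are purely illustrative:

```python
# Illustrative sketch (not from the thesis): an ill-conditioned quadratic
# misfit where a Newton step, using the full Hessian, recovers the poorly
# constrained component that plain gradient descent barely moves.
import numpy as np

H = np.diag([1.0, 1e-4])   # one well- and one poorly-constrained direction
m_true = np.zeros(2)
m = np.array([1.0, 1.0])   # starting model

grad = H @ (m - m_true)

# Gradient descent with a step tuned to the large eigenvalue
m_gd = m - 1.0 * grad
# Newton step: solve H * dm = grad, exact for a quadratic misfit
m_newton = m - np.linalg.solve(H, grad)

print(m_gd)      # [0.     0.9999]: almost no progress in the weak direction
print(m_newton)  # [0. 0.]: both components recovered at once
```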
Parallel problem generation for structured problems in mathematical programming
The aim of this research is to investigate parallel problem generation for structured optimization problems. This research has produced a novel parallel model generator tool, the Parallel Structured Model Generator (PSMG). PSMG adopts its model syntax from SML to retain backward compatibility with models already written in SML [1]. Unlike the proof-of-concept implementation of SML in [2], PSMG does not depend on AMPL [3].
In this thesis, we first explain what a structured problem is using concrete real-world problems modelled in SML. Presenting these example models allows us to exhibit PSMG's modelling syntax and techniques in detail. PSMG provides an easy-to-use framework for modelling large-scale nested structured problems, including multi-stage stochastic problems. It can be used for modelling linear programming (LP), quadratic programming (QP), and nonlinear programming (NLP) problems.
The second part of this thesis discusses the logical calling sequences and dependencies of PSMG's parallel operations and algorithms. We explain the design concept behind PSMG's solver interface. The interface follows a solver-driven work assignment approach that allows the solver to decide how to distribute problem parts to processors, in order to obtain better data locality and load balancing when solving problems in parallel. PSMG adopts a delayed constraint expansion design, which allows the memory allocation for computed entities to happen on a process only when necessary. These computed entities can be the set expansions of the indexing expressions associated with variable, parameter, and constraint declarations, or temporary values used for set and parameter construction. We also illustrate algorithms that are important for an efficient implementation of PSMG, such as routines for partitioning constraints according to blocks, and automatic differentiation algorithms for evaluating Jacobian and Hessian matrices and their corresponding sparsity patterns.
Furthermore, PSMG implements a generic solver interface that can be linked with different structure-exploiting optimization solvers, such as decomposition-based or interior-point-based solvers. The work required for linking with PSMG's solver interface is also discussed.
Finally, we evaluate PSMG's run-time performance and memory usage by generating structured problems of various sizes. The results from both serial and parallel executions are discussed. The benchmark results show that PSMG achieves good parallel efficiency on up to 96 processes. PSMG distributes memory usage among parallel processors, which enables the generation of problems that are too large to be processed on a single node due to memory restrictions.
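The delayed constraint expansion idea described above can be sketched in a few lines: a process holds only a recipe for an indexing set, and the set is expanded (and memory allocated) only on first use. The class and names below are illustrative, not PSMG's actual interface:

```python
# Hypothetical sketch of delayed expansion: nothing is allocated until
# the process that owns this block actually needs the expanded set.
class LazySet:
    def __init__(self, build):
        self._build = build       # callable producing the expanded set
        self._value = None        # no memory spent yet

    def expand(self):
        if self._value is None:   # expansion happens at most once, on demand
            self._value = self._build()
        return self._value

# Each process holds the recipe; only the assigned one calls expand()
scenarios = LazySet(lambda: [f"scen{i}" for i in range(4)])
assert scenarios._value is None   # still unallocated
print(scenarios.expand())         # ['scen0', 'scen1', 'scen2', 'scen3']
```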
Sequence-to-sequence learning for machine translation and automatic differentiation for machine learning software tools
This thesis consists of a series of articles that contribute to the field of machine learning. In particular, it covers two distinct but loosely related fields.
The first three articles consider the use of neural network models for problems in natural language processing (NLP). The first article introduces the use of an encoder-decoder structure involving recurrent neural networks (RNNs) to translate to and from variable-length phrases and sentences. The second article contains a quantitative and qualitative analysis of the performance of these "neural machine translation" models, laying bare the difficulties posed by long sentences and rare words. The third article deals with handling rare and out-of-vocabulary words in neural network models by using dictionary coder compression algorithms and multi-scale RNN models.
The second half of this thesis does not deal with specific neural network models, but with the software tools and frameworks that can be used to define and train them. Modern deep learning frameworks need to be able to efficiently execute programs involving linear algebra and array programming, while also being able to employ automatic differentiation (AD) in order to calculate a variety of derivatives. The first article provides an overview of the difficulties posed in reconciling these two objectives, and introduces a graph-based intermediate representation that aims to tackle them. The second article considers a different approach to the same problem, implementing a tape-based source-code transformation approach to AD in a dynamically typed array programming language (Python and NumPy).
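The tape-based approach mentioned above can be illustrated with a minimal toy: each operation records itself on a shared tape as it executes, and the tape is replayed in reverse to accumulate gradients. This sketch supports only addition and multiplication and is not the system from the article:

```python
# Minimal tape-based reverse-mode AD (toy illustration, not the real tool).
class Var:
    def __init__(self, value, tape):
        self.value, self.grad = value, 0.0
        self.tape = tape          # all Vars in one expression share a tape

    def __add__(self, other):
        out = Var(self.value + other.value, self.tape)
        # Record how to push the output gradient back to the inputs
        self.tape.append(lambda: (setattr(self, "grad", self.grad + out.grad),
                                  setattr(other, "grad", other.grad + out.grad)))
        return out

    def __mul__(self, other):
        out = Var(self.value * other.value, self.tape)
        self.tape.append(lambda: (setattr(self, "grad", self.grad + other.value * out.grad),
                                  setattr(other, "grad", other.grad + self.value * out.grad)))
        return out

def backward(out):
    out.grad = 1.0
    for op in reversed(out.tape):  # replay the tape in reverse order
        op()

tape = []
x, y = Var(3.0, tape), Var(2.0, tape)
z = x * y + x                      # z = x*y + x
backward(z)
print(x.grad, y.grad)              # dz/dx = y + 1 = 3.0, dz/dy = x = 3.0
```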
variPEPS -- a versatile tensor network library for variational ground state simulations in two spatial dimensions
Tensor networks capture large classes of ground states of phases of quantum matter faithfully and efficiently. Their manipulation and contraction have remained a challenge over the years, however. For most of this history, ground state simulations of two-dimensional quantum lattice systems using (infinite) projected entangled pair states have relied on what is called time-evolving block decimation. In recent years, multiple proposals for the variational optimization of the quantum state have been put forward, overcoming accuracy and convergence problems of previously known methods. The incorporation of automatic differentiation in tensor network algorithms has ultimately enabled a new, flexible way to variationally simulate ground states and excited states. In this work, we review the state of the art of the variational iPEPS framework. We present and explain the workings of an efficient, comprehensive, and general tensor network library for the simulation of infinite two-dimensional systems using iPEPS, with support for flexible unit cells and different lattice geometries.
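The variational principle underlying such libraries can be shown at toy scale: gradient descent on the Rayleigh quotient of a small Hermitian matrix converges to its ground-state energy. This is an illustration of the idea only, not the variPEPS library:

```python
# Toy variational ground-state search: minimize <psi|H|psi>/<psi|psi>
# by gradient descent, the same principle variational iPEPS applies
# to tensor network states (illustrative 2x2 "Hamiltonian" only).
import numpy as np

rng = np.random.default_rng(0)
H = np.array([[1.0, -0.5],
              [-0.5, -1.0]])           # small Hermitian matrix

psi = rng.normal(size=2)

for _ in range(500):
    psi /= np.linalg.norm(psi)
    energy = psi @ H @ psi
    grad = 2 * (H @ psi - energy * psi)  # gradient of the Rayleigh quotient
    psi -= 0.1 * grad

energy = psi @ H @ psi / (psi @ psi)
print(round(energy, 6))                  # variational estimate
print(round(np.linalg.eigvalsh(H)[0], 6))  # exact smallest eigenvalue
```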
International Conference on Continuous Optimization (ICCOPT) 2019 Conference Book
The Sixth International Conference on Continuous Optimization took place on the campus of the Technical University of Berlin, August 3-8, 2019. The ICCOPT is a flagship conference of the Mathematical Optimization Society (MOS), organized every three years. ICCOPT 2019 was hosted by the Weierstrass Institute for Applied Analysis and Stochastics (WIAS) Berlin. It included a Summer School and a Conference with a series of plenary and semi-plenary talks, organized and contributed sessions, and poster sessions.
This book comprises the full conference program. It contains the scientific program, both in survey form and in full detail, as well as information on the social program, the venue, special meetings, and more.
Uncertainty quantification in ocean state estimation
Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, February 2013.

Quantifying uncertainty and error bounds is a key outstanding challenge in ocean state
estimation and climate research. It is particularly difficult due to the large dimensionality
of this nonlinear estimation problem and the number of uncertain variables involved. The
“Estimating the Circulation and Climate of the Oceans” (ECCO) consortium has
developed a scalable system for dynamically consistent estimation of global time-evolving
ocean state by optimal combination of an ocean general circulation model (GCM)
with diverse ocean observations. The estimation system is based on the "adjoint method"
solution of an unconstrained least-squares optimization problem formulated with the
method of Lagrange multipliers for fitting the dynamical ocean model to observations.
The dynamical consistency requirement of ocean state estimation necessitates this
approach over sequential data assimilation and reanalysis smoothing techniques. In
addition, it is computationally advantageous because calculation and storage of large
covariance matrices is not required. However, this is also a drawback of the adjoint
method, which lacks a native formalism for error propagation and quantification of
assimilated uncertainty. The objective of this dissertation is to resolve that limitation by
developing a feasible computational methodology for uncertainty analysis in dynamically
consistent state estimation, applicable to the large dimensionality of global ocean models.
Hessian (second derivative-based) methodology is developed for Uncertainty
Quantification (UQ) in large-scale ocean state estimation, extending the gradient-based
adjoint method to employ the second order geometry information of the model-data
misfit function in a high-dimensional control space. Large error covariance matrices are
evaluated by inverting the Hessian matrix with the developed scalable matrix-free
numerical linear algebra algorithms. Hessian-vector product and Jacobian derivative
codes of the MIT general circulation model (MITgcm) are generated by means of
algorithmic differentiation (AD). Computational complexity of the Hessian code is
reduced by tangent linear differentiation of the adjoint code, which preserves the speedup
of adjoint checkpointing schemes in the second derivative calculation. A Lanczos
algorithm is applied for extracting the leading rank eigenvectors and eigenvalues of the
Hessian matrix. The eigenvectors represent the constrained uncertainty patterns. The
inverse eigenvalues are the corresponding uncertainties. The dimensionality of UQ
calculations is reduced by eliminating the uncertainty null-space unconstrained by the
supplied observations. Inverse and forward uncertainty propagation schemes are designed
for assimilating observation and control variable uncertainties, and for projecting these
uncertainties onto oceanographic target quantities. Two versions of these schemes are
developed: one evaluates reduction of prior uncertainties, while another does not require
prior assumptions. The analysis of uncertainty propagation in the ocean model is time-resolving.
It captures the dynamics of uncertainty evolution and reveals transient and
stationary uncertainty regimes.
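The matrix-free eigenanalysis described above can be sketched at toy scale with SciPy's Lanczos-based eigsh, which touches the "Hessian" only through matrix-vector products and never forms the full matrix. The diagonal stand-in operator below is purely illustrative; in the actual system the matvec would call AD-generated Hessian-vector product code:

```python
# Schematic analogue of matrix-free Hessian eigenanalysis: Lanczos via
# eigsh, with the operator defined only by its matrix-vector product.
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

n = 200
d = np.linspace(1e-3, 5.0, n)          # stand-in Hessian spectrum

def hessian_vector_product(v):
    # In ocean state estimation this would call AD-generated Hvp code
    return d * v

Hop = LinearOperator((n, n), matvec=hessian_vector_product,
                     dtype=np.float64)

# Leading eigenpairs = best-constrained uncertainty directions;
# inverse eigenvalues approximate the corresponding uncertainties
eigvals, eigvecs = eigsh(Hop, k=5, which="LM")
uncertainties = 1.0 / eigvals
print(np.sort(eigvals)[::-1][:2])      # top of the stand-in spectrum
```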
The system is applied to quantifying uncertainties of Antarctic Circumpolar Current
(ACC) transport in a global barotropic configuration of the MITgcm. The model is
constrained by synthetic observations of sea surface height and velocities. The control
space consists of two-dimensional maps of initial and boundary conditions and model
parameters. The size of the Hessian matrix is O(10^10) elements, which would require
O(60 GB) of uncompressed storage. It is demonstrated how the choice of observations
and their geographic coverage determines the reduction in uncertainties of the estimated
transport. The system also yields information on how well the control fields are
constrained by the observations. The effects of control uncertainty reduction due to decreases in diagonal covariance terms are compared to the dynamical coupling of controls through off-diagonal covariance terms. The correlations of controls introduced by observation uncertainty assimilation are found to dominate the reduction of uncertainty of
transport. An idealized analytical model of ACC guides a detailed time-resolving
understanding of uncertainty dynamics.

This thesis was supported in part by the National Science Foundation (NSF)
Collaboration in Mathematical Geosciences (CMG) grant ARC-0934404, and the
Department of Energy (DOE) ISICLES initiative under LANL sub-contract 139843-1.
Partial funding was provided by the Department of Mechanical Engineering at MIT and by the Academic Programs Office at WHOI. My participation in the IMA "Large-scale Inverse Problems and Quantification of Uncertainty" workshop was partially funded by IMA NSF grants.