
    Proximal Galerkin: A structure-preserving finite element method for pointwise bound constraints

    The proximal Galerkin finite element method is a high-order, low iteration complexity, nonlinear numerical method that preserves the geometric and algebraic structure of bound constraints in infinite-dimensional function spaces. This paper introduces the proximal Galerkin method and applies it to solve free boundary problems, enforce discrete maximum principles, and develop scalable, mesh-independent algorithms for optimal design. The paper leads to a derivation of the latent variable proximal point (LVPP) algorithm: an unconditionally stable alternative to the interior point method. LVPP is an infinite-dimensional optimization algorithm that may be viewed as having an adaptive barrier function that is updated with a new informative prior at each (outer loop) optimization iteration. One of the main benefits of this algorithm is witnessed when analyzing the classical obstacle problem. Therein, we find that the original variational inequality can be replaced by a sequence of semilinear partial differential equations (PDEs) that are readily discretized and solved with, e.g., high-order finite elements. Throughout this work, we arrive at several unexpected contributions that may be of independent interest. These include (1) a semilinear PDE we refer to as the entropic Poisson equation; (2) an algebraic/geometric connection between high-order positivity-preserving discretizations and certain infinite-dimensional Lie groups; and (3) a gradient-based, bound-preserving algorithm for two-field density-based topology optimization. The complete latent variable proximal Galerkin methodology combines ideas from nonlinear programming, functional analysis, tropical algebra, and differential geometry and can potentially lead to new synergies among these areas as well as within variational and numerical analysis

    Partial Differential Equation-Constrained Diffeomorphic Registration from Sum of Squared Differences to Normalized Cross-Correlation, Normalized Gradient Fields, and Mutual Information: A Unifying Framework; 35632143

    This work proposes a unifying framework for extending PDE-constrained Large Deformation Diffeomorphic Metric Mapping (PDE-LDDMM) with the sum of squared differences (SSD) to PDE-LDDMM with different image similarity metrics. We focused on the two best-performing variants of PDE-LDDMM with the spatial and band-limited parameterizations of diffeomorphisms. We derived the equations for gradient-descent and Gauss-Newton-Krylov (GNK) optimization with Normalized Cross-Correlation (NCC), its local version (lNCC), Normalized Gradient Fields (NGFs), and Mutual Information (MI). PDE-LDDMM with GNK was successfully implemented for NCC and lNCC, substantially improving on the registration results obtained with SSD. For these metrics, GNK optimization outperformed gradient descent. However, for NGFs, GNK optimization did not surpass the performance of gradient descent. For MI, GNK optimization involved the product of huge dense matrices, requiring an unaffordable amount of memory. The extensive evaluation placed the band-limited version of PDE-LDDMM based on the deformation state equation with NCC and lNCC image similarities among the best-performing PDE-LDDMM methods. In comparison with benchmark deep learning-based methods, our proposal reached or surpassed the accuracy of the best-performing models. In NIREP16, several configurations of PDE-LDDMM outperformed ANTS-lNCC, the best benchmark method. Although NGFs and MI usually underperformed the other metrics in our evaluation, these metrics showed potentially competitive results in a multimodal deformable experiment. We believe that our proposed image similarity extension over PDE-LDDMM will promote the use of physically meaningful diffeomorphisms in a wide variety of clinical applications that depend on deformable image registration.

    An Exploration of Controlling the Content Learned by Deep Neural Networks

    With the great success of the Deep Neural Network (DNN), the question of how to obtain a trustworthy model is attracting more and more attention. Generally, people provide the raw data to the DNN directly in training. However, the entire training process is a black box, in which the knowledge learned by the DNN is out of control. This carries many risks. The most common one is overfitting. As research on neural networks has deepened, additional and probably greater risks have recently been discovered. The related research shows that unknown clues can hide in the training data because of the randomization of the data and the finite scale of the training data. Some of these clues build meaningless but explicit links between the input data and the output data, called "shortcuts". The DNN makes its decisions based on these "shortcuts". This phenomenon is also called "network cheating". The knowledge of such shortcuts learned by the DNN ruins all the training and makes the performance of the DNN unreliable. Therefore, we need to control the raw data used in training. In this dissertation, we name the explicit raw data "content" and the implicit logic learned by the DNN "knowledge". By quantifying the information in the DNN's training, we find that the information learned by the network is much less than the information contained in the dataset. This indicates that it is unnecessary to train the neural network with all of the information: training with partial information can achieve an effect similar to training with full information. In other words, it is possible to control the content fed into the DNN, and the strategy shown in this study can reduce the risks mentioned above (e.g., overfitting and shortcuts). Moreover, using reconstructed data (with partial information) to train the network can reduce the complexity of the network and accelerate the training.
    In this dissertation, we provide a pipeline to implement content control in the DNN's training. We use a series of experiments to demonstrate its feasibility in two applications. One is human brain anatomy structure analysis, and the other is human pose detection and classification.

    Semi-supervised Learning of Pushforwards For Domain Translation & Adaptation

    Given two probability densities on related data spaces, we seek a map pushing one density to the other while satisfying application-dependent constraints. For maps to have utility in a broad application space (including domain translation, domain adaptation, and generative modeling), the map must be available to apply on out-of-sample data points and should correspond to a probabilistic model over the two spaces. Unfortunately, existing approaches, which are primarily based on optimal transport, do not address these needs. In this paper, we introduce a novel pushforward map learning algorithm that utilizes normalizing flows to parameterize the map. We first re-formulate the classical optimal transport problem to be map-focused and propose a learning algorithm to select from all possible maps under the constraint that the map minimizes a probability distance and application-specific regularizers; thus, our method can be seen as solving a modified optimal transport problem. Once the map is learned, it can be used to map samples from a source domain to a target domain. In addition, because the map is parameterized as a composition of normalizing flows, it models the empirical distributions over the two data spaces and allows both sampling and likelihood evaluation for both data sets. We compare our method (parOT) to related optimal transport approaches in the context of domain adaptation and domain translation on benchmark data sets. Finally, to illustrate the impact of our work on applied problems, we apply parOT to a real scientific application: spectral calibration for high-dimensional measurements from two vastly different environments. Comment: 19 pages, 7 figures