Reduced order modeling of subsurface multiphase flow models using deep residual recurrent neural networks
We present a reduced order modeling (ROM) technique for subsurface
multi-phase flow problems building on the recently introduced deep residual
recurrent neural network (DR-RNN) [1]. DR-RNN is a physics-aware recurrent
neural network for modeling the evolution of dynamical systems. The DR-RNN
architecture is inspired by iterative update techniques of line search methods
where a fixed number of layers are stacked together to minimize the residual
(or reduced residual) of the physical model under consideration. In this
manuscript, we combine DR-RNN with proper orthogonal decomposition (POD) and
discrete empirical interpolation method (DEIM) to reduce the computational
complexity associated with high-fidelity numerical simulations. In the
presented formulation, POD is used to construct an optimal set of reduced basis
functions and DEIM is employed to evaluate the nonlinear terms independent of
the full-order model size.
We demonstrate the proposed reduced model on two uncertainty quantification
test cases using Monte Carlo simulation of subsurface flow with random
permeability fields. The obtained results demonstrate that DR-RNN combined with
POD-DEIM provides an accurate and stable reduced model with a fixed
computational budget that is much less than the computational cost of standard
POD-Galerkin reduced model combined with DEIM for nonlinear dynamical systems.
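In such formulations, the POD basis is typically obtained from the dominant left singular vectors of a snapshot matrix. The following is a minimal, generic sketch of that step (the function name, synthetic data, and tolerance are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def pod_basis(snapshots, r):
    """Compute an r-dimensional POD basis from a snapshot matrix.

    snapshots: (n, m) array holding m solution snapshots of dimension n.
    Returns an (n, r) orthonormal basis of dominant left singular vectors.
    """
    U, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :r]

# Illustration on synthetic snapshots: a rank-2 field plus tiny noise.
rng = np.random.default_rng(0)
modes = rng.standard_normal((100, 2))
coeffs = rng.standard_normal((2, 50))
X = modes @ coeffs + 1e-8 * rng.standard_normal((100, 50))

Phi = pod_basis(X, 2)
# Relative error of projecting the snapshots onto the reduced basis.
err = np.linalg.norm(X - Phi @ (Phi.T @ X)) / np.linalg.norm(X)
```

Because the synthetic data is essentially rank two, the two-mode basis reproduces the snapshots almost exactly; in practice the rank r is chosen from the singular value decay.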
Uncertain linear equations
Ankara: The Department of Electrical and Electronics Engineering and the Institute of Engineering and Sciences of Bilkent University, 2010. Thesis (Master's), Bilkent University, 2010. Includes bibliographical references (leaves 72-79).
In this thesis, new theoretical and practical results on linear equations with various
types of uncertainties and their applications are presented. In the first part,
the case in which there are more equations than unknowns (overdetermined case)
is considered. A novel approach is proposed to provide robust and accurate estimates
of the solution of the linear equations when both the measurement vector
and the coefficient matrix are subject to uncertainty. A new analytic formulation
is developed in terms of the gradient flow to analyze and provide estimates of
the solution. The presented analysis enables us to study and compare existing
methods in the literature. We derive theoretical bounds for the performance of our
estimator and show that if the signal-to-noise ratio is lower than a threshold, a significant
improvement is made compared to the conventional estimator. Numerical
results in applications such as blind identification, multiple frequency estimation
and deconvolution show that the proposed technique outperforms alternative
methods in mean-squared error for a significant range of signal-to-noise ratio
values. The second type of uncertainty analyzed in the overdetermined case is
where uncertainty is sparse in some basis. We show that this type of uncertainty
on the coefficient matrix can be recovered exactly for a large class of structures,
if we have sufficiently many equations. We propose and solve an optimization criterion and its convex relaxation to recover the uncertainty and the solution to
the linear system. We derive sufficiency conditions for exact and stable recovery.
Then we demonstrate with numerical examples that the proposed method is
able to recover unknowns exactly with high probability. The performance of the
proposed technique is compared in the estimation and tracking of sparse multipath
wireless channels. The second part of the thesis deals with the case where there
are more unknowns than equations (underdetermined case). We extend the theory
of polarization of Arikan for random variables with continuous distributions.
We show that the Hadamard Transform and the Discrete Fourier Transform polarize
the information content of independent identically distributed copies of
compressible random variables, where compressibility is measured by Shannon’s
differential entropy. Using these results, we show that the solution of the linear
system can be recovered even if there are more unknowns than equations if the
number of equations is sufficient to capture the entropy of the uncertainty. This
approach is applied to sampling compressible signals below the Nyquist rate and
coined "Polar Sampling". This result generalizes and unifies the sparse recovery
theory of Compressed Sensing by extending it to general low entropy signals with
an information theoretical analysis. We demonstrate the effectiveness of Polar
Sampling approach on a numerical sub-Nyquist sampling example.
Pilancı, Mert. M.S.
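The thesis develops a gradient-flow estimator for this errors-in-variables setting. As background only, the classical total least squares solution for the same problem (a standard textbook method, sketched here with synthetic data, not the thesis's estimator) reads the estimate off the SVD of the augmented matrix [A | b]:

```python
import numpy as np

def tls_solve(A, b):
    """Total least squares for A x ≈ b when both A and b are noisy.

    The TLS solution is recovered from the right singular vector of
    [A | b] associated with the smallest singular value.
    """
    n = A.shape[1]
    Z = np.column_stack([A, b])
    _, _, Vt = np.linalg.svd(Z)
    v = Vt[-1]               # right singular vector for smallest sigma
    return -v[:n] / v[n]     # v is proportional to [x, -1]

rng = np.random.default_rng(1)
x_true = np.array([2.0, -1.0])
A = rng.standard_normal((50, 2))
b = A @ x_true
# Perturb BOTH the coefficient matrix and the measurement vector.
A_noisy = A + 0.01 * rng.standard_normal(A.shape)
b_noisy = b + 0.01 * rng.standard_normal(b.shape)
x_tls = tls_solve(A_noisy, b_noisy)
```

With small perturbations on both sides of the system, the recovered solution stays close to the true one; ordinary least squares, by contrast, models noise only in the measurement vector.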
Multiscale aeroelastic modelling in porous composite structures
Driven by economic, environmental and ergonomic concerns, porous composites are increasingly being adopted by the aeronautical and structural engineering communities for their improved physical and mechanical properties. Such materials often possess highly heterogeneous material descriptions and tessellated/complex geometries. Deploying commercially viable porous composite structures necessitates numerical methods that are capable of accurately and efficiently handling these complexities within the prescribed design iterations. Classical numerical methods, such as the Finite Element Method (FEM), while extremely versatile, incur large computational costs when accounting for heterogeneous inclusions and high frequency waves. This often renders the problem prohibitively expensive, even with the advent of modern high performance computing facilities.
The Multiscale Finite Element Method (MsFEM) is an order reduction strategy specifically developed to address such issues. This is done by introducing meshes at different scales. All underlying physics and material descriptions are explicitly resolved at the fine scale. This information is then mapped onto the coarse scale through a set of numerically evaluated multiscale basis functions. The problems are then solved at the coarse scale at a significantly reduced cost and mapped back to the fine scale using the same multiscale shape functions. To this point, the MsFEM has been developed exclusively with quadrilateral/hexahedral coarse and fine elements. This proves highly inefficient when encountering complex coarse scale geometries and fine scale inclusions. A more flexible meshing scheme at all scales is essential for ensuring optimal simulation runtimes.
The Virtual Element Method (VEM) is a relatively recent development within the computational mechanics community aimed at handling arbitrary polygonal (potentially non-convex) elements. In this thesis, novel VEM formulations for poromechanical problems (consolidation and vibroacoustics) are developed. This is then integrated at the fine scale into the multiscale procedure to enable versatile meshing possibilities. Further, this enhanced capability is also extended to the coarse scale to allow for efficient macroscale discretizations of complex structures.
The resulting Multiscale Virtual Element Method (MsVEM) is originally applied to problems in elastostatics, consolidation and vibroacoustics in porous media to successfully drive down computational run times without significantly affecting accuracy. Following this, a parametric Model Order Reduction scheme for coupled problems is introduced for the first time at the fine scale to obtain a Reduced Basis Multiscale Virtual Element Method. This is used to augment the rate of multiscale basis function evaluation in spectral acoustics problems. The accuracy of all the above novel contributions is investigated in relation to standard numerical methods, i.e., the FEM and MsFEM, analytical solutions and experimental data. The associated efficiency is quantified in terms of computational run-times, complexity analyses and speed-up metrics.
Several extended applications of the VEM and the MsVEM are briefly discussed, e.g., VEM phase field methods for brittle fracture, structural and acoustical topology optimization, random vibrations and stochastic dynamics, and structural vibroacoustics.
Deep Network Regularization with Representation Shaping
Thesis (Ph.D.) -- Seoul National University, Graduate School of Convergence Science and Technology, Department of Convergence Science (Digital Information Convergence major), 2019. 2. Rhee, Wonjong.
The statistical characteristics of learned representations, such as correlation and representational sparsity, are known to be relevant to the performance of deep learning methods. Also, learning meaningful and useful data representations by using regularization methods has been one of the central concerns in deep learning. In this dissertation, deep network regularization using representation shaping is studied. Roughly, the following questions are answered: what are the common statistical characteristics of representations that high-performing networks share? Do the characteristics have a causal relationship with performance? To answer these questions, five representation regularizers are proposed: class-wise Covariance Regularizer (cw-CR), Variance Regularizer (VR), class-wise Variance Regularizer (cw-VR), Rank Regularizer (RR), and class-wise Rank Regularizer (cw-RR). Significant performance improvements were found for a variety of tasks over popular benchmark datasets with the regularizers. The visualization of learned representations shows that the regularizers used in this work indeed perform distinct representation shaping. Then, with a variety of representation regularizers, a few statistical characteristics of learned representations, including covariance, correlation, sparsity, dead units, and rank, are investigated. Our theoretical analysis and experimental results indicate that all the statistical characteristics considered in this work fail to show any general or causal pattern for improving performance. The mutual information I(z; x) and I(z; y) are examined as well, and it is shown that regularizers can affect I(z; x) and thus indirectly influence the performance.
Finally, two practical ways of using representation regularizers are presented to address the usefulness of representation regularizers: using a set of representation regularizers as a performance tuning tool and enhancing network compression with representation regularizers.
Chapter 1. Introduction
1.1 Background and Motivation
1.2 Contributions
Chapter 2. Generalization, Regularization, and Representation in Deep Learning
2.1 Deep Networks
2.2 Generalization
2.2.1 Capacity, Overfitting, and Generalization
2.2.2 Generalization in Deep Learning
2.3 Regularization
2.3.1 Capacity Control and Regularization
2.3.2 Regularization for Deep Learning
2.4 Representation
2.4.1 Representation Learning
2.4.2 Representation Shaping
Chapter 3. Representation Regularizer Design with Class Information
3.1 Class-wise Representation Regularizers: cw-CR and cw-VR
3.1.1 Basic Statistics of Representations
3.1.2 cw-CR
3.1.3 cw-VR
3.1.4 Penalty Loss Functions and Gradients
3.2 Experiments
3.2.1 Image Classification Task
3.2.2 Image Reconstruction Task
3.3 Analysis of Representation Characteristics
3.3.1 Visualization
3.3.2 Quantitative Analysis
3.4 Layer Dependency
Chapter 4. Representation Characteristics and Their Relationship with Performance
4.1 Representation Characteristics
4.2 Experimental Results of Representation Regularization
4.3 Scaling, Permutation, Covariance, and Correlation
4.3.1 Identical Output Network (ION)
4.3.2 Possible Extensions for ION
4.4 Sparsity, Dead Unit, and Rank
4.4.1 Analytical Relationship
4.4.2 Rank Regularizer
4.4.3 A Controlled Experiment on Data Generation Process
4.5 Mutual Information
Chapter 5. Practical Ways of Using Representation Regularizers
5.1 Tuning Deep Network Performance Using Representation Regularizers
5.1.1 Experimental Settings and Conditions
5.1.2 Consistently Well-performing Regularizer
5.1.3 Performance Improvement Using Regularizers as a Set
5.2 Enhancing Network Compression Using Representation Regularizers
5.2.1 The Need for Network Compression
5.2.2 Three Typical Approaches for Network Compression
5.2.3 Proposed Approaches and Experimental Results
Chapter 6. Discussion
6.1 Implication
6.1.1 Usefulness of Class Information
6.1.2 Comparison with Non-penalty Regularizers: Dropout and Batch Normalization
6.1.3 Identical Output Network
6.1.4 Using Representation Regularizers for Performance Tuning
6.1.5 Benefits and Drawbacks of Different Statistical Characteristics of Representations
6.2 Limitation
6.2.1 Understanding the Underlying Mechanism of Representation Regularization
6.2.2 Manipulating Representation Characteristics other than Covariance and Variance for ReLU Networks
6.2.3 Investigating Representation Characteristics of Complicated Tasks
6.3 Possible Future Work
6.3.1 Interpreting Learned Representations via Visualization
6.3.2 Designing a Regularizer Utilizing Mutual Information
6.3.3 Applying Multiple Representation Regularizers to a Network
6.3.4 Enhancing Deep Network Compression via Representation Manipulation
Chapter 7. Conclusion
Bibliography
Appendix
A Principal Component Analysis of Learned Representations
B Proofs
Acknowledgements
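The dissertation gives the exact penalty definitions; as an illustration only, a class-wise variance penalty in the spirit of cw-VR could penalize the per-unit variance of representations within each class. The function below is a hypothetical sketch, not the dissertation's code:

```python
import numpy as np

def cw_variance_penalty(reps, labels):
    """Class-wise variance penalty on hidden representations (sketch).

    For each class, compute the variance of every representation unit
    over that class's samples, sum over units, and average over classes.
    Minimizing this pulls same-class representations toward their mean.
    """
    total, count = 0.0, 0
    for c in np.unique(labels):
        reps_c = reps[labels == c]        # (n_c, d) block for class c
        total += reps_c.var(axis=0).sum()
        count += 1
    return total / count

rng = np.random.default_rng(2)
reps = rng.standard_normal((8, 4))
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
p = cw_variance_penalty(reps, labels)

# Collapsing each class onto its mean drives the penalty to zero.
collapsed = reps.copy()
for c in (0, 1):
    collapsed[labels == c] = reps[labels == c].mean(axis=0)
p0 = cw_variance_penalty(collapsed, labels)
```

In training, such a penalty would be added to the task loss with a weighting coefficient, which is the usual pattern for penalty regularizers.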
Multi-fidelity deep residual recurrent neural networks for uncertainty quantification
Effective propagation of uncertainty through a nonlinear dynamical system is
an essential task for a number of engineering applications. One viable probabilistic
approach to propagate the uncertainty from the high dimensional random inputs
to the high-fidelity model outputs is the Monte Carlo method. However, the Monte Carlo
method requires a substantial number of computationally expensive high-fidelity
simulations for its computed estimates to converge towards the desired statistics.
Hence, performing Monte Carlo high-fidelity simulations becomes computationally
prohibitive for large-scale realistic problems. Multi-fidelity approaches provide a
general framework for combining a hierarchy of computationally cheap low-fidelity
models to accelerate the Monte Carlo estimation of the high-fidelity model output.
The objective of this thesis is to derive computationally efficient low-fidelity models
and an effective multi-fidelity framework to accelerate the Monte Carlo method that
uses a single high-fidelity model only.
In this thesis, a physics-aware recurrent neural network (RNN) called the deep residual recurrent neural network (DR-RNN) is developed as an efficient low-fidelity
model for nonlinear dynamical systems. The information hidden in the mathematical model representing the nonlinear dynamical system is exploited to construct the
DR-RNN architecture. The developed DR-RNN is inspired by the iterative steps of
line search methods in finding the residual minimiser of numerically discretized differential equations. More specifically, the stacked layers of the DR-RNN architecture
are formulated to act collectively as an iterative scheme. The dynamics of DR-RNN
are explicit in time, with remarkable convergence and stability properties for large
time steps that violate the numerical stability condition. Numerical examples demonstrate that DR-RNN can effectively emulate the high-fidelity model of nonlinear
physical systems with a significantly lower number of parameters in comparison to
standard RNN architectures. Further, DR-RNN is combined with Proper Orthogonal Decomposition (POD) for model reduction of time dependent partial differential
equations. The numerical results show the proposed DR-RNN as an explicit and stable reduced order technique. The numerical results also show significant gains in
accuracy by increasing the depth of proposed DR-RNN similar to other applications
of deep learning.
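The iterative layer update described above can be sketched in a few lines. This is a simplified illustration, not the trained architecture from the thesis: the test ODE, the fixed layer weights, and the function names are assumptions, and a real DR-RNN would learn its layer parameters from data.

```python
import numpy as np

def residual(u, u_prev, dt):
    """Implicit-Euler residual of the illustrative ODE du/dt = -u**2."""
    return u - u_prev + dt * u ** 2

def dr_rnn_step(u_prev, dt, weights):
    """One DR-RNN-style time step: stacked layers damp the residual.

    Each layer applies u <- u - w_k * r(u), mimicking a line-search
    update; `weights` stand in for trained layer parameters.
    """
    u = u_prev
    for w in weights:
        u = u - w * residual(u, u_prev, dt)
    return u

# March the ODE with fixed (hand-picked, not trained) layer weights.
dt, u = 0.1, 1.0
weights = [0.9, 0.9, 0.9, 0.9]   # four stacked layers
for _ in range(10):
    u = dr_rnn_step(u, dt, weights)
```

After ten steps the state closely tracks the implicit Euler solution of the ODE, even though each step only applies a fixed number of explicit residual updates; this is the sense in which the stacked layers act collectively as an iterative solver.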
Next, a reduced order modeling technique for subsurface multi-phase flow problems is developed building on the DR-RNN architecture. More specifically, DR-RNN
is combined with POD and discrete empirical interpolation method (DEIM) to reduce the computational complexity associated with high-fidelity subsurface multi-phase flow simulations. In the presented formulation, POD is used to construct
an optimal set of reduced basis functions and DEIM is employed to evaluate the
nonlinear terms independent of the high-fidelity model size. The proposed ROM
is demonstrated on two uncertainty quantification test cases involving Monte Carlo
simulation of subsurface flow with random permeability field. The obtained results
demonstrate that DR-RNN combined with POD-DEIM provides an accurate and
stable ROM with a fixed computational budget that is much less than the computational cost of standard POD-Galerkin ROM combined with DEIM for nonlinear
dynamical systems.
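DEIM itself chooses its interpolation indices with a standard greedy procedure over the nonlinear-term snapshot basis. The code below sketches the classical algorithm on a synthetic orthonormal basis; variable names are illustrative:

```python
import numpy as np

def deim_indices(U):
    """Greedy DEIM interpolation-index selection (classical algorithm).

    U: (n, m) basis for the nonlinear-term snapshots, orthonormal columns.
    Returns m row indices at which the nonlinear term is sampled.
    """
    n, m = U.shape
    idx = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, m):
        # Interpolate column j on the current indices, take the residual.
        c = np.linalg.solve(U[np.ix_(idx, range(j))], U[idx, j])
        r = U[:, j] - U[:, :j] @ c
        idx.append(int(np.argmax(np.abs(r))))
    return idx

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((40, 5)))  # synthetic basis
p = deim_indices(Q)
```

At run time, only the selected rows of the nonlinear term need to be evaluated, which is what makes the reduced model's cost independent of the high-fidelity model size.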
Finally, this thesis focuses on developing a multi-fidelity framework to estimate the
statistics of high-fidelity model outputs of interest. Recently, Multi-Fidelity Monte
Carlo (MFMC) method and Multi-Level Monte Carlo (MLMC) method have been shown
to significantly accelerate the Monte Carlo estimation by making use of low-cost
low-fidelity models. In this thesis, the features of both the MFMC method and the
MLMC method are combined into a single framework called the Multi-Fidelity-Multi-Level Monte Carlo (MFML-MC) method. In the MFML-MC method, an MLMC framework is developed first, in which a multi-level hierarchy of POD approximations of
high-fidelity outputs is utilized as low-fidelity models. Next, the MFMC method is
incorporated into the developed MLMC framework, in which the MLMC estimator
is modified at each level to benefit from a level-specific low-fidelity model. Finally,
a variant of the deep residual recurrent neural network called Model-Free DR-RNN
(MF-DR-RNN) is used as the level-specific low-fidelity model in the MFML-MC
framework. The performance of the MFML-MC method is compared to Monte Carlo estimation that uses either a high-fidelity model or a single low-fidelity model on
two subsurface flow problems with random permeability fields. Numerical results
show that the MFML-MC method provides an unbiased estimator and achieves speedups
by orders of magnitude compared to Monte Carlo estimation that uses a single
high-fidelity model.
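The control-variate structure underlying such multi-fidelity estimators can be sketched generically. The snippet below is a two-fidelity illustration with unit control-variate weight and toy models, not the MFML-MC estimator of the thesis: the cheap model's mean over the shared samples cancels in expectation, so the estimator remains unbiased for the expensive model's mean.

```python
import numpy as np

def mf_estimate(f_hi, f_lo, n_hi, n_lo, rng):
    """Two-fidelity control-variate Monte Carlo estimate of E[f_hi(X)].

    Runs the expensive model on n_hi samples and the cheap surrogate on
    those plus (n_lo - n_hi) extra samples. With control-variate weight
    alpha = 1 (chosen here for brevity), the estimator is unbiased:
    E[estimate] = E[f_hi(X)].
    """
    x_shared = rng.standard_normal(n_hi)
    x_extra = rng.standard_normal(n_lo - n_hi)
    hi = f_hi(x_shared).mean()
    lo_shared = f_lo(x_shared).mean()
    lo_all = f_lo(np.concatenate([x_shared, x_extra])).mean()
    return hi + (lo_all - lo_shared)

rng = np.random.default_rng(4)
f_hi = lambda x: np.exp(0.1 * x)   # stand-in "expensive" model
f_lo = lambda x: 1 + 0.1 * x       # cheap linearization surrogate
est = mf_estimate(f_hi, f_lo, 200, 20000, rng)
```

Because the surrogate is strongly correlated with the expensive model, most of the variance is carried by the large cheap-sample mean, which is the mechanism behind the speedups reported above.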