Machine learning in solar physics
The application of machine learning in solar physics has the potential to
greatly enhance our understanding of the complex processes that take place in
the atmosphere of the Sun. By using techniques such as deep learning, we are
now in a position to analyze large amounts of data from solar observations
and identify patterns and trends that may not have been apparent using
traditional methods. This can help us improve our understanding of explosive
events like solar flares, which can have a strong effect on Earth's
environment. Predicting hazardous events on Earth becomes crucial for our
technological society. Machine learning can also improve our understanding of
the inner workings of the Sun itself by allowing us to go deeper into the data
and to propose more complex models to explain them. Additionally, the use of
machine learning can help to automate the analysis of solar data, reducing the
need for manual labor and increasing the efficiency of research in this field.
Comment: 100 pages, 13 figures, 286 references, accepted for publication as a
Living Review in Solar Physics (LRSP)
Statistical Estimation for Covariance Structures with Tail Estimates using Nodewise Quantile Predictive Regression Models
This paper considers the specification of covariance structures with tail
estimates. We focus on two aspects: (i) the estimation of the VaR-CoVaR risk
matrix when the number of time series observations exceeds the number of assets
in a portfolio, using quantile predictive regression models without assuming
the presence of nonstationary regressors; and (ii) the construction of a novel
variable selection algorithm, called Feature Ordering by Centrality
Exclusion (FOCE), which is based on an assumption-lean regression framework,
has no tuning parameters, and is proved to be consistent under general sparsity
assumptions. We illustrate the usefulness of our proposed methodology with
numerical studies of real and simulated datasets when modelling systemic risk
in a network
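As background for the quantile regression models mentioned above: a quantile such as a VaR level can be characterised as the minimiser of the check (pinball) loss. The sketch below shows only that standard fact on a hypothetical toy sample; the function names are illustrative, and it does not reproduce the paper's VaR-CoVaR estimator or the FOCE algorithm.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss at quantile level tau in (0, 1)."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def empirical_quantile_by_loss(sample, tau):
    """Return the sample point that minimises the total pinball loss."""
    def total_loss(q):
        return sum(pinball_loss(y - q, tau) for y in sample)
    return min(sample, key=total_loss)


# Hypothetical toy data: the integers 1..10 (not taken from the paper).
toy_sample = list(range(1, 11))
q25 = empirical_quantile_by_loss(toy_sample, 0.25)
```

In a regression setting the candidate `q` would be replaced by a linear function of predictors, and the same loss would be minimised over its coefficients.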
Singularity Formation in the High-Dimensional Euler Equations and Sampling of High-Dimensional Distributions by Deep Generative Networks
High dimensionality brings both opportunities and challenges to the study of applied mathematics. This thesis consists of two parts. The first part explores the singularity formation of the axisymmetric incompressible Euler equations with no swirl in ℝⁿ, which is closely related to the Millennium Prize Problem on the global singularity of the Navier-Stokes equations. In this part, the high dimensionality contributes to the singularity formation in finite time by enhancing the strength of the vortex stretching term. The second part focuses on sampling from a high-dimensional distribution using deep generative networks, which has wide applications in the Bayesian inverse problem and the image synthesis task. The high dimensionality in this part becomes a significant challenge to the numerical algorithms, known as the curse of dimensionality.
In the first part of this thesis, we consider the singularity formation in two scenarios. In the first scenario, for the axisymmetric Euler equations with no swirl, we consider the case when the initial condition for the angular vorticity is C^α Hölder continuous. We provide convincing numerical examples where the solutions develop potential self-similar blow-up in finite time when the Hölder exponent α < α*, and this upper bound α* can asymptotically approach 1 - 2/n. This result supports a conjecture from Drivas and Elgindi [37], and generalizes it to the high-dimensional case. This potential blow-up is insensitive to the perturbation of initial data. Based on assumptions summarized from numerical experiments, we study a limiting case of the Euler equations, and obtain α* = 1 - 2/n which agrees with the numerical result. For the general case, we propose a relatively simple one-dimensional model and numerically verify its approximation to the Euler equations. This one-dimensional model might suggest a possible way to show this finite-time blow-up scenario analytically. Compared to the first proved blow-up result of the 3D axisymmetric Euler equations with no swirl and Hölder continuous initial data by Elgindi in [40], our potential blow-up scenario has completely different scaling behavior and regularity of the initial condition. In the second scenario, we consider using smooth initial data, but modify the Euler equations by adding a factor ε as the coefficient of the convection terms to weaken the convection effect. The new model is called the weak convection model. We provide convincing numerical examples of the weak convection model where the solutions develop potential self-similar blow-up in finite time when the convection strength ε < ε*, and this upper bound ε* should be close to 1 - 2/n. This result is closely related to the infinite-dimensional case of an open question [37] stated by Drivas and Elgindi.
Our numerical observations also inspire us to approximate the weak convection model with a one-dimensional model. We give a rigorous proof that the one-dimensional model will develop finite-time blow-up if ε < 1 - 2/n, and study the approximation quality of the one-dimensional model to the weak convection model numerically, which could be beneficial to a rigorous proof of the potential finite-time blow-up.
In the second part of the thesis, we propose the Multiscale Invertible Generative Network (MsIGN) to sample from high-dimensional distributions by exploring the low-dimensional structure in the target distribution. The MsIGN models a transport map from a known reference distribution to the target distribution, and thus is very efficient in generating uncorrelated samples compared to MCMC-type methods. The MsIGN captures multiple modes in the target distribution by generating new samples hierarchically from a coarse scale to a fine scale with the help of a novel prior conditioning layer. The hierarchical structure of the MsIGN also allows training in a coarse-to-fine scale manner. The Jeffreys divergence is used as the objective function in training to avoid mode collapse. Importance sampling based on the prior conditioning layer is leveraged to estimate the Jeffreys divergence, which is intractable in previous deep generative networks. Numerically, when applied to two Bayesian inverse problems, the MsIGN clearly captures multiple modes in the high-dimensional posterior and approximates the posterior accurately, demonstrating its superior performance compared with previous methods. We also provide an ablation study to show the necessity of our proposed network architecture and training algorithm for the good numerical performance. Moreover, we also apply the MsIGN to the image synthesis task, where it achieves superior performance in terms of bits-per-dimension value over other flow-based generative models and yields very good interpretability of its neurons in intermediate layers.
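The Jeffreys divergence named above is the sum of the two Kullback-Leibler divergences, KL(p‖q) + KL(q‖p). The sketch below evaluates it for small discrete distributions to show why a model that drops a mode is penalised from both directions. The distributions are hypothetical, and the thesis's importance-sampling estimator for the continuous case is not reproduced here.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jeffreys_divergence(p, q):
    """Symmetrised divergence: KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Hypothetical two-mode target and two candidate models.
target = [0.5, 0.5]
balanced = [0.5, 0.5]      # covers both modes
collapsed = [0.99, 0.01]   # nearly ignores the second mode

j_balanced = jeffreys_divergence(target, balanced)
j_collapsed = jeffreys_divergence(target, collapsed)
```

Because the forward term KL(target‖model) grows when the model assigns little mass to a mode of the target, the symmetrised objective is larger for `collapsed` than the reverse KL term alone.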
New estimation methods for extremal bivariate return curves
In the multivariate setting, estimates of extremal risk measures are important in many contexts, such as environmental planning and structural engineering. In this paper, we propose new estimation methods for extremal bivariate return curves, a risk measure that is the natural bivariate extension to a return level. Unlike several existing techniques, our estimates are based on bivariate extreme value models that can capture both key forms of extremal dependence. We devise tools for validating return curve estimates, as well as representing their uncertainty, and compare a selection of curve estimation techniques through simulation studies. We apply the methodology to two met-ocean data sets, with diagnostics indicating generally good performance
Contrastive Learning for Unsupervised Domain Adaptation of Time Series
Unsupervised domain adaptation (UDA) aims at learning a machine learning
model using a labeled source domain that performs well on a similar yet
different, unlabeled target domain. UDA is important in many applications such
as medicine, where it is used to adapt risk scores across different patient
cohorts. In this paper, we develop a novel framework for UDA of time series
data, called CLUDA. Specifically, we propose a contrastive learning framework
to learn contextual representations in multivariate time series, so that these
preserve label information for the prediction task. In our framework, we
further capture the variation in the contextual representations between source
and target domain via a custom nearest-neighbor contrastive learning. To the
best of our knowledge, ours is the first framework to learn domain-invariant,
contextual representation for UDA of time series data. We evaluate our
framework using a wide range of time series datasets to demonstrate its
effectiveness and show that it achieves state-of-the-art performance for time
series UDA.
Comment: Published as a conference paper at ICLR 202
Causal effects of green infrastructure on stormwater hydrology and water quality
Applications of green infrastructure to stormwater management continue to increase in urban landscapes. There are numerous studies of individual stormwater management sites, but few meta-analyses that synthesize and explore design variables for stormwater control structures within a robust statistical framework. The lack of a standardized framework is due to the complexity of stormwater infrastructure designs. Locally customized designs fit to meet diverse site conditions create datasets that become messy, non-uniform, and difficult to analyze across multiple sites. In this dissertation, I first examine how hydrologic processes govern the function of various stormwater infrastructure technologies using water budget data from published literature. The hydrologic observations are displayed on a Water Budget Triangle---a ternary plot tool developed to visualize simplified water budgets---to enable direct functional comparisons of green and grey approaches to stormwater management. The findings are used to generate a suite of observable site characteristics, which are then mapped to a set of stormwater control and treatment sites reported in the International Stormwater Best Management Practice (BMP) database. These mapped site characteristics provide site context for the runoff and water quality observations present in the database. Drawing from these contextual observations of design variables, I next examine the functional design of different stormwater management technologies by quantifying the differences among varied structural features, and comparing their causal effects on hydrologic and water quality performance. This stormwater toolbox provides a framework for comparison of the overall performance of different system types to understand causal implications of stormwater design
Forward uncertainty quantification with special emphasis on a Bayesian active learning perspective
Uncertainty quantification (UQ) in its broadest sense aims at quantitatively studying all sources of uncertainty arising from both computational and real-world applications. Although many subtopics appear in the UQ field, there are typically two major types of UQ problems: forward and inverse uncertainty propagation. The present study focuses on the former, which involves assessing the effects of the input uncertainty in various forms on the output response of a computational model. In total, this thesis reports nine main developments in the context of forward uncertainty propagation, with special emphasis on a Bayesian active learning perspective.
The first development is concerned with estimating the extreme value distribution and small first-passage probabilities of uncertain nonlinear structures under stochastic seismic excitations, where a moment-generating function-based mixture distribution approach (MGF-MD) is proposed. As the second development, a triple-engine parallel Bayesian global optimization (T-PBGO) method is presented for interval uncertainty propagation. The third contribution develops a parallel Bayesian quadrature optimization (PBQO) method for estimating the response expectation function, its variable importance and bounds when a computational model is subject to hybrid uncertainties in the form of random variables, parametric probability boxes (p-boxes) and interval models. The fourth development addresses the failure probability function when the inputs of a performance function are characterized by parametric p-boxes. To this end, an active learning augmented probabilistic integration (ALAPI) method is proposed, based on a partially Bayesian active learning perspective on failure probability estimation and on the high-dimensional model representation (HDMR) technique. Note that in this work we derive an upper bound on the posterior variance of the failure probability, which bounds our epistemic uncertainty about the failure probability due to a kind of numerical uncertainty, i.e., discretization error. The fifth contribution further strengthens the previously developed active learning probabilistic integration (ALPI) method in two ways, i.e., enabling the use of parallel computing and enhancing the capability of assessing small failure probabilities. The resulting method is called parallel adaptive Bayesian quadrature (PABQ). The sixth development presents a principled Bayesian failure probability inference (BFPI) framework, where the posterior variance of the failure probability is derived (not in closed form).
We also develop a parallel adaptive-Bayesian failure probability learning (PA-BFPI) method upon the BFPI framework. For the seventh development, we propose a partially Bayesian active learning line sampling (PBAL-LS) method for assessing extremely small failure probabilities, where a partially Bayesian active learning insight is offered for the classical LS method and an upper bound on the posterior variance of the failure probability is deduced. Following the PBAL-LS method, the eighth contribution finally obtains the expression of the posterior variance of the failure probability in the LS framework, and a Bayesian active learning line sampling (BALLS) method is put forward. The ninth contribution provides another Bayesian active learning alternative, Bayesian active learning line sampling with log-normal process (BAL-LS-LP), to the traditional LS. In this method, the log-normal process prior, instead of a Gaussian process prior, is assumed for the beta function so as to account for the non-negativity constraint. In addition, the approximation error resulting from the root-finding procedure is also taken into consideration.
In conclusion, this thesis presents a set of novel computational methods for forward UQ, especially from a Bayesian active learning perspective. The developed methods are expected to enrich our toolbox for forward UQ analysis, and the insights gained can stimulate further studies
A hierarchical Bayesian non-asymptotic extreme value model for spatial data
Spatial maps of extreme precipitation are crucial in flood protection. With
the aim of producing maps of precipitation return levels, we propose a novel
approach to model a collection of spatially distributed time series where the
asymptotic assumption, typical of the traditional extreme value theory, is
relaxed. We introduce a Bayesian hierarchical model that accounts for the
possible underlying variability in the distribution of event magnitudes and
occurrences, which are described through latent temporal and spatial processes.
Spatial dependence is characterized by geographical covariates and effects not
fully described by the covariates are captured by spatial structure in the
hierarchies. The performance of the approach is illustrated through simulation
studies and an application to daily rainfall extremes across North Carolina
(USA). The results show that we significantly reduce the estimation uncertainty
with respect to state-of-the-art techniques.
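To make the idea of a non-asymptotic return level concrete, the sketch below computes one under strong simplifying assumptions that are not stated in the abstract: a fixed number of independent events per year and Weibull-distributed event magnitudes with known parameters. The annual maximum then has distribution F(x)^N, and the T-year return level solves F(x)^N = 1 - 1/T. The paper instead treats magnitudes and occurrences through latent temporal and spatial processes in a Bayesian hierarchy, which this sketch does not attempt.

```python
import math


def weibull_cdf(x, scale, shape):
    """CDF of a Weibull distribution for a single event magnitude."""
    return 1.0 - math.exp(-((x / scale) ** shape))


def return_level(return_period_years, events_per_year, scale, shape):
    """Solve F(x) ** events_per_year = 1 - 1 / return_period_years for x."""
    annual_non_exceedance = 1.0 - 1.0 / return_period_years
    event_non_exceedance = annual_non_exceedance ** (1.0 / events_per_year)
    return scale * (-math.log(1.0 - event_non_exceedance)) ** (1.0 / shape)


# Hypothetical parameters: 100 wet days per year, scale 10 mm, shape 0.8.
level_50 = return_level(50, 100, 10.0, 0.8)
level_100 = return_level(100, 100, 10.0, 0.8)
```

No limiting extreme value distribution is invoked here; the return level follows directly from the event-level distribution and the event count.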
Flow Away your Differences: Conditional Normalizing Flows as an Improvement to Reweighting
We present an alternative to reweighting techniques for modifying
distributions to account for a desired change in an underlying conditional
distribution, as is often needed to correct for mis-modelling in a simulated
sample. We employ conditional normalizing flows to learn the full conditional
probability distribution from which we sample new events for conditional values
drawn from the target distribution to produce the desired, altered
distribution. In contrast to common reweighting techniques, this procedure is
independent of binning choice and does not rely on an estimate of the density
ratio between two distributions.
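For contrast, the kind of binned reweighting that this approach aims to replace can be sketched as follows. This is a generic histogram-ratio baseline on hypothetical Gaussian toy data, not the paper's code; the bin range and bin count are arbitrary choices, which is exactly the dependence the abstract points to.

```python
import random


def binned_weights(source, target, lo, hi, n_bins):
    """Per-event weights for source events from a histogram density ratio."""
    width = (hi - lo) / n_bins

    def bin_index(value):
        # Clamp out-of-range values into the first or last bin.
        return min(n_bins - 1, max(0, int((value - lo) / width)))

    src_counts = [0] * n_bins
    tgt_counts = [0] * n_bins
    for v in source:
        src_counts[bin_index(v)] += 1
    for v in target:
        tgt_counts[bin_index(v)] += 1
    scale = len(source) / len(target)
    # Every source event lies in a bin with src_counts >= 1, so no division by zero.
    return [scale * tgt_counts[bin_index(v)] / src_counts[bin_index(v)]
            for v in source]


rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(20000)]   # simulated sample
target = [rng.gauss(0.5, 1.0) for _ in range(20000)]   # desired distribution
weights = binned_weights(source, target, -5.0, 5.0, 40)

weighted_mean = sum(w * v for w, v in zip(weights, source)) / sum(weights)
target_mean = sum(target) / len(target)
```

The reweighted source keeps its original events and only changes their weights, so sparsely populated bins carry large weights and inflate the statistical uncertainty.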
In several toy examples we show that normalizing flows outperform reweighting
approaches to match the distribution of the target. We demonstrate that the
corrected distribution closes well with the ground truth, and a statistical
uncertainty on the training dataset can be ascertained with bootstrapping. In
our examples, this leads to a statistical precision up to three times greater
than using reweighting techniques with identical sample sizes for the source
and target distributions. We also explore an application in the context of high
energy particle physics.
Comment: 21 pages, 9 figures
Rare-Event Estimation and Calibration for Large-Scale Stochastic Simulation Models
Stochastic simulation has been widely applied in many domains. More recently, however, the rapid surge of sophisticated problems such as safety evaluation of intelligent systems has posed various challenges to conventional statistical methods. Motivated by these challenges, in this thesis, we develop novel methodologies with theoretical guarantees and numerical applications to tackle them from different perspectives.
In particular, our works can be categorized into two areas: (1) rare-event estimation (Chapters 2 to 5) where we develop approaches to estimating the probabilities of rare events via simulation; (2) model calibration (Chapters 6 and 7) where we aim at calibrating the simulation model so that it is close to reality.
In Chapter 2, we study rare-event simulation for a class of problems where the target hitting sets of interest are defined via modern machine learning tools such as neural networks and random forests. We investigate an importance sampling scheme that integrates the dominating point machinery in large deviations and sequential mixed integer programming to locate the underlying dominating points. We provide efficiency guarantees and numerical demonstration of our approach.
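The role of a dominating point can be illustrated in the simplest possible case, which is an assumption of this sketch rather than the thesis's setting: estimating P(Z > a) for a standard normal Z, where the dominating point of the set {z > a} is a itself. The sampling distribution is shifted to N(a, 1) and each hit is weighted by the likelihood ratio. The mixed integer programming step used in the thesis to locate dominating points for sets defined by neural networks or random forests is not shown.

```python
import math
import random


def tail_prob_importance_sampling(threshold, n_samples, rng):
    """Estimate P(Z > threshold), Z ~ N(0, 1), by sampling from N(threshold, 1)."""
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Likelihood ratio of N(0, 1) to N(threshold, 1) evaluated at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples


rng = random.Random(0)
a = 4.0
is_estimate = tail_prob_importance_sampling(a, 50000, rng)
exact_tail = 0.5 * math.erfc(a / math.sqrt(2.0))
```

With crude Monte Carlo the same sample size would typically see only one or two hits at this threshold, whereas roughly half of the shifted samples land in the rare set.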
In Chapter 3, we propose a new efficiency criterion for importance sampling, which we call probabilistic efficiency. Conventionally, an estimator is regarded as efficient if its relative error is sufficiently controlled. It is widely known that when a rare-event set contains multiple "important regions" encoded by the dominating points, importance sampling needs to account for all of them via mixing to achieve efficiency. We argue that the traditional analysis recipe could suffer from intrinsic looseness by using relative error as an efficiency criterion. Thus, we propose the new efficiency notion to tighten this gap. In particular, we show that under the standard Gärtner-Ellis large deviations regime, an importance sampling scheme that uses only the most significant dominating points is sufficient to attain this efficiency notion.
In Chapter 4, we consider the estimation of rare-event probabilities using sample proportions output by crude Monte Carlo. Due to the recent surge of sophisticated rare-event problems, efficiency-guaranteed variance reduction may face implementation challenges, which motivates one to look at naive estimators. In this chapter we construct confidence intervals for the target probability from this naive estimator using various techniques, and then analyze their validity and tightness, quantified respectively by the coverage probability and relative half-width.
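As one concrete instance of such an interval, the sketch below computes a Wilson score interval for a sample proportion together with its relative half-width. The abstract does not say which interval constructions the chapter studies, so the choice of the Wilson interval is an assumption, and the counts used are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion (default ~95% level)."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return centre - half, centre + half


def relative_half_width(lower, upper, p_hat):
    """Half the interval width divided by the point estimate."""
    return (upper - lower) / (2 * p_hat)


# Hypothetical crude Monte Carlo outcome: 5 hits in one million runs.
hits, runs = 5, 10 ** 6
p_hat = hits / runs
lower, upper = wilson_interval(hits, runs)
rel_hw = relative_half_width(lower, upper, p_hat)
```

With so few hits the relative half-width is close to one, which shows how loose a valid interval can be when the event is rare.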
In Chapter 5, we propose the use of extreme value analysis in the simulation setting, in particular the peak-over-threshold method, which is popularly employed for extremal estimation of real datasets. More specifically, we view crude Monte Carlo samples as data to which we fit a generalized Pareto distribution. We test this idea on several numerical examples. The results show that in the absence of efficient variance reduction schemes, it appears to offer potential benefits to enhance crude Monte Carlo estimates.
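A minimal sketch of the peak-over-threshold idea follows, under assumptions not taken from the thesis: the generalized Pareto distribution is fitted by the method of moments rather than maximum likelihood, the threshold is the empirical 95% quantile, and the simulated output is a hypothetical exponential sample whose tail is known exactly.

```python
import math
import random
import statistics


def fit_gpd_moments(exceedances):
    """Method-of-moments estimates (shape, scale) of a generalized Pareto."""
    m = statistics.fmean(exceedances)
    v = statistics.variance(exceedances)
    ratio = m * m / v                  # equals 1 - 2 * shape for a GPD
    return 0.5 * (1.0 - ratio), 0.5 * m * (1.0 + ratio)


def pot_tail_probability(sample, threshold, level):
    """Estimate P(X > level) for level above threshold via a fitted GPD tail."""
    exceedances = [x - threshold for x in sample if x > threshold]
    shape, scale = fit_gpd_moments(exceedances)
    z = (level - threshold) / scale
    if abs(shape) < 1e-9:
        survival = math.exp(-z)        # exponential limit of the GPD
    else:
        base = 1.0 + shape * z
        survival = base ** (-1.0 / shape) if base > 0 else 0.0
    return len(exceedances) / len(sample) * survival


rng = random.Random(1)
sample = [rng.expovariate(1.0) for _ in range(100000)]  # stand-in simulation output
threshold = sorted(sample)[int(0.95 * len(sample))]
pot_estimate = pot_tail_probability(sample, threshold, 6.0)
exact = math.exp(-6.0)
```

The fitted tail extrapolates beyond the threshold using all exceedances, rather than counting only the few samples that pass the level of interest.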
In Chapter 6, we investigate a framework to develop calibration schemes in parametric settings that satisfy rigorous frequentist statistical guarantees, via a basic notion that we call the eligibility set, designed to bypass non-identifiability through set-based estimation. We investigate a feature extraction-then-aggregation approach to construct these sets for multivariate outputs. We demonstrate our methodology on several numerical examples, including an application to calibration of a limit order book market simulator.
In Chapter 7, we study a methodology to tackle the NASA Langley Uncertainty Quantification Challenge, a model calibration problem under both aleatory and epistemic uncertainties. Our methodology is based on an integration of distributionally robust optimization and importance sampling. The main computation machinery in this integrated methodology amounts to solving sampled linear programs. We present theoretical statistical guarantees of our approach via connections to nonparametric hypothesis testing, and numerical performances including parameter calibration and downstream decision and risk evaluation tasks