Functional Regression
Functional data analysis (FDA) involves the analysis of data whose ideal
units of observation are functions defined on some continuous domain, and the
observed data consist of a sample of functions taken from some population,
sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the
development of this field, which has accelerated in the past 10 years to become
one of the fastest growing areas of statistics, fueled by the growing number of
applications yielding this type of data. One unique characteristic of FDA is
the need to combine information both across and within functions, which Ramsay
and Silverman called replication and regularization, respectively. This article
will focus on functional regression, the area of FDA that has received the most
attention in applications and methodological development. First will be an
introduction to basis functions, key building blocks for regularization in
functional regression methods, followed by an overview of functional regression
methods, split into three types: [1] functional predictor regression
(scalar-on-function), [2] functional response regression (function-on-scalar)
and [3] function-on-function regression. For each, the role of replication and
regularization will be discussed and the methodological development described
in a roughly chronological manner, at times deviating from the historical
timeline to group together similar methods. The primary focus is on modeling
and methodology, highlighting the modeling structures that have been developed
and the various regularization approaches employed. It closes with a brief
discussion of potential areas of future development in this field.
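As a rough illustration of how basis functions provide regularization in scalar-on-function regression, the sketch below expands the coefficient function in a small Fourier basis and fits the resulting finite-dimensional model with a light ridge penalty. It is not taken from the article: the model form y_i = alpha + ∫ x_i(t) beta(t) dt, the Fourier basis, the Riemann-sum integration, the synthetic data, and the names `fourier_basis` and `fit_scalar_on_function` are all assumptions made for the example.

```python
import numpy as np

def fourier_basis(t, n_basis):
    """Evaluate the first n_basis Fourier basis functions on a grid t in [0, 1]."""
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sqrt(2) * np.sin(2 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.sqrt(2) * np.cos(2 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)  # shape (len(t), n_basis)

def fit_scalar_on_function(X, y, t, n_basis=5, ridge=1e-6):
    """Fit y_i = alpha + integral of x_i(t) * beta(t) dt with beta in a Fourier basis.

    X holds one discretized curve per row, observed on the equally spaced grid t.
    Returns the intercept and the estimated beta(t) evaluated on the grid.
    """
    Phi = fourier_basis(t, n_basis)
    dt = t[1] - t[0]
    # Riemann-sum approximation of the integrals of x_i(t) * phi_k(t)
    Z = (X @ Phi) * dt
    D = np.column_stack([np.ones(len(y)), Z])
    # Ridge penalty on the basis coefficients only; the intercept is unpenalized
    P = ridge * np.eye(n_basis + 1)
    P[0, 0] = 0.0
    coef = np.linalg.solve(D.T @ D + P, D.T @ y)
    return coef[0], Phi @ coef[1:]

# Synthetic demo (hypothetical data): random smooth curves and a known beta(t)
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
X = rng.normal(size=(200, 7)) @ fourier_basis(t, 7).T
beta_true = np.sqrt(2) * np.sin(2 * np.pi * t)
y = 1.0 + (X @ beta_true) * (t[1] - t[0])
alpha_hat, beta_hat = fit_scalar_on_function(X, y, t)
```

Truncating the basis at `n_basis` terms is itself a form of regularization within each function, while the least-squares fit pools information across the sampled curves (replication). Spline or wavelet bases with roughness penalties would follow the same pattern.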
Statistical inference of the mechanisms driving collective cell movement
Numerous biological processes, many impacting on human health, rely on collective cell movement. We develop nine candidate models, based on advection-diffusion partial differential equations, to describe various alternative mechanisms that may drive cell movement. The parameters of these models were inferred from one-dimensional projections of laboratory observations of Dictyostelium discoideum cells by sampling from the posterior distribution using the delayed rejection adaptive Metropolis algorithm (DRAM). The best model was selected using the Widely Applicable Information Criterion (WAIC). We conclude that cell movement in our study system was driven both by a self-generated gradient in an attractant that the cells could deplete locally, and by chemical interactions between the cells.
An efficient polynomial chaos-based proxy model for history matching and uncertainty quantification of complex geological structures
A novel polynomial chaos proxy-based history matching and uncertainty quantification
method is presented that can be employed for complex geological structures in inverse
problems. For complex geological structures, when there are many unknown geological
parameters with highly nonlinear correlations, typically more than 10^6 full reservoir
simulation runs might be required to accurately probe the posterior probability space
given the production history of the reservoir. This is not practical for high-resolution geological
models. One solution is to use a "proxy model" that replicates the simulation
model for selected input parameters. The main advantage of the polynomial chaos
proxy compared to other proxy models and response surfaces is that it is generally
applicable and converges systematically as the order of the expansion increases. The
Cameron and Martin theorem 2.24 states that the convergence rate of the standard
polynomial chaos expansions is exponential for Gaussian random variables. To improve
the convergence rate for non-Gaussian random variables, generalized polynomial
chaos is implemented, which uses the Askey scheme to choose the optimal basis for polynomial
chaos expansions [199]. Additionally, for the non-Gaussian distributions that
can be effectively approximated by a mixture of Gaussian distributions, we use a
mixture-modeling-based clustering approach in which, within each cluster, the polynomial
chaos proxy converges exponentially fast and the overall posterior distribution can be
estimated more efficiently using different polynomial chaos proxies.
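To make the idea of a polynomial chaos proxy for a Gaussian input concrete, here is a minimal one-dimensional sketch. It is not the method of this work: the toy model exp(xi) stands in for an expensive simulator, the coefficients are computed by Gauss-Hermite quadrature (one possible non-intrusive approach), and the names `pce_coefficients`, `mean_square_error`, and `toy_model` are hypothetical.

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite_e import hermegauss, hermeval

def pce_coefficients(model, order, n_quad=40):
    """Project model(xi), xi ~ N(0, 1), onto probabilists' Hermite polynomials He_n."""
    x, w = hermegauss(n_quad)        # nodes and weights for the weight exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)     # rescale so the weights sum to 1 (standard normal)
    fx = model(x)
    coefs = np.empty(order + 1)
    for n in range(order + 1):
        e_n = np.zeros(n + 1)
        e_n[n] = 1.0                 # coefficient vector selecting He_n
        # c_n = E[f(xi) He_n(xi)] / E[He_n(xi)^2], with E[He_n^2] = n!
        coefs[n] = np.sum(w * fx * hermeval(x, e_n)) / factorial(n)
    return coefs

def mean_square_error(model, coefs, n_quad=40):
    """Mean-square error of the truncated expansion under the N(0, 1) measure."""
    x, w = hermegauss(n_quad)
    w = w / np.sqrt(2.0 * np.pi)
    return float(np.sum(w * (model(x) - hermeval(x, coefs)) ** 2))

toy_model = np.exp  # assumed stand-in for a full simulation run
errors = [mean_square_error(toy_model, pce_coefficients(toy_model, p))
          for p in (2, 4, 8)]
coefs = pce_coefficients(toy_model, 8)
```

For this smooth toy model the entries of `errors` shrink quickly as the expansion order grows, which is the behavior the proxy relies on. A real application would have many input parameters and a far more costly model, so the quadrature step would need to be replaced by something cheaper.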
The main disadvantage of the polynomial chaos proxy is that for high-dimensional problems,
the number of the polynomial chaos terms increases drastically as the order of the
polynomial chaos expansions increases. Although different non-intrusive methods have
been developed in the literature to address this issue, a large number of simulation
runs is still required to compute high-order terms of the polynomial chaos expansions. This
work resolves this issue by proposing the reduced-terms polynomial chaos expansion
which preserves only the relevant terms in the polynomial chaos representation. We
demonstrated that the sparsity pattern in the polynomial chaos expansion, when used
with the Karhunen-Loève decomposition method or kernel PCA, can be systematically
captured.
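The growth in the number of terms can be illustrated with a short calculation. The sketch assumes a total-degree truncation, for which an expansion of order p in d random variables has C(d + p, p) terms; the abstract does not state which truncation is used, and `n_pce_terms` is a hypothetical helper name.

```python
from math import comb

def n_pce_terms(dim, order):
    """Number of terms in a total-degree polynomial chaos expansion (assumed truncation)."""
    return comb(dim + order, order)

# Term counts for a few input dimensions and expansion orders 1 to 4
for dim in (2, 10, 50):
    print(dim, [n_pce_terms(dim, p) for p in (1, 2, 3, 4)])
```

Since each retained term needs its coefficient estimated from simulation runs, keeping only the relevant terms directly reduces the number of runs required.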
A probabilistic framework based on the polynomial chaos proxy is also suggested in the
context of Bayesian model selection to study the plausibility of different geological
interpretations of the sedimentary environments. The proposed surrogate-accelerated
Bayesian inverse analysis can be coherently used in practical reservoir optimization
workflows and uncertainty assessments.
A machine learning route between band mapping and band structure
The electronic band structure (BS) of solid state materials imprints the
multidimensional and multi-valued functional relations between energy and
momenta of periodically confined electrons. Photoemission spectroscopy is a
powerful tool for its comprehensive characterization. A common task in
photoemission band mapping is to recover the underlying quasiparticle
dispersion, which we call band structure reconstruction. Traditional methods
often focus on specific regions of interest yet require extensive human
oversight. To cope with the growing size and scale of photoemission data, we
develop a generic machine-learning approach leveraging the information within
electronic structure calculations for this task. We demonstrate its capability
by reconstructing all fourteen valence bands of tungsten diselenide and
validate the accuracy on various synthetic data. The reconstruction uncovers
previously inaccessible momentum-space structural information on both global
and local scales in conjunction with theory, while realizing a path towards
integrating band mapping data into materials science databases.