
    Functional Regression

    Functional data analysis (FDA) involves the analysis of data whose ideal units of observation are functions defined on some continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the development of this field, which has accelerated in the past 10 years to become one of the fastest growing areas of statistics, fueled by the growing number of applications yielding this type of data. One unique characteristic of FDA is the need to combine information both across and within functions, which Ramsay and Silverman called replication and regularization, respectively. This article will focus on functional regression, the area of FDA that has received the most attention in applications and methodological development. First will be an introduction to basis functions, key building blocks for regularization in functional regression methods, followed by an overview of functional regression methods, split into three types: [1] functional predictor regression (scalar-on-function), [2] functional response regression (function-on-scalar) and [3] function-on-function regression. For each, the role of replication and regularization will be discussed and the methodological development described in a roughly chronological manner, at times deviating from the historical timeline to group together similar methods. The primary focus is on modeling and methodology, highlighting the modeling structures that have been developed and the various regularization approaches employed. At the end is a brief discussion describing potential areas of future development in this field
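As a rough illustration of how basis functions provide regularization for a single function observed on a discrete grid, the sketch below projects noisy samples onto a truncated Fourier basis. The basis choice, the uniform grid on [0, 1), and the names `fourier_row`, `fit_fourier`, and `smooth` are assumptions made for this example; they are not taken from the article, which surveys many bases and penalties.

```python
import math

def fourier_row(t, n_harmonics):
    """Basis values [1, cos(2*pi*t), sin(2*pi*t), ...] at a point t in [0, 1)."""
    row = [1.0]
    for k in range(1, n_harmonics + 1):
        row.append(math.cos(2 * math.pi * k * t))
        row.append(math.sin(2 * math.pi * k * t))
    return row

def fit_fourier(y, n_harmonics):
    """Least-squares basis coefficients for samples y on the grid j/n.

    On a uniform grid with n > 2 * n_harmonics the basis columns are
    orthogonal, so least squares reduces to one projection per column.
    """
    n = len(y)
    if n <= 2 * n_harmonics:
        raise ValueError("need more grid points than 2 * n_harmonics")
    rows = [fourier_row(j / n, n_harmonics) for j in range(n)]
    coefs = []
    for b in range(2 * n_harmonics + 1):
        column = [row[b] for row in rows]
        num = sum(p * v for p, v in zip(column, y))
        den = sum(p * p for p in column)
        coefs.append(num / den)
    return coefs

def smooth(coefs, t):
    """Evaluate the fitted (regularized) function at t."""
    n_harmonics = (len(coefs) - 1) // 2
    return sum(c * p for c, p in zip(coefs, fourier_row(t, n_harmonics)))
```

Keeping only a few harmonics is regularization by truncation; penalized fits with richer bases follow the same pattern but need a full linear solve.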

    Statistical inference of the mechanisms driving collective cell movement

    Numerous biological processes, many impacting on human health, rely on collective cell movement. We develop nine candidate models, based on advection-diffusion partial differential equations, to describe various alternative mechanisms that may drive cell movement. The parameters of these models were inferred from one-dimensional projections of laboratory observations of Dictyostelium discoideum cells by sampling from the posterior distribution using the delayed rejection adaptive Metropolis algorithm (DRAM). The best model was selected using the Widely Applicable Information Criterion (WAIC). We conclude that cell movement in our study system was driven both by a self-generated gradient in an attractant that the cells could deplete locally, and by chemical interactions between the cells
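For readers unfamiliar with WAIC, the following is a minimal sketch of how it is commonly computed from posterior draws. It assumes the pointwise log-likelihoods are already available as a nested list `log_lik[s][i]` (draw `s`, observation `i`) and reports the value on the deviance scale, where lower is better; the function name `waic` and this input layout are assumptions for illustration, not the authors' code.

```python
import math

def waic(log_lik):
    """WAIC on the deviance scale from pointwise log-likelihoods.

    log_lik[s][i] is the log-likelihood of observation i under posterior
    draw s. At least two draws are required for the variance term.
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        m = max(column)
        # log of the posterior-mean likelihood, computed with log-sum-exp
        lppd += m + math.log(sum(math.exp(v - m) for v in column) / n_draws)
        mean = sum(column) / n_draws
        # sample variance of the log-likelihood across draws
        p_waic += sum((v - mean) ** 2 for v in column) / (n_draws - 1)
    return -2.0 * (lppd - p_waic)
```

Candidate models would each produce their own `log_lik` array from their MCMC output, and the model with the smallest value would be preferred under this convention.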

    An efficient polynomial chaos-based proxy model for history matching and uncertainty quantification of complex geological structures

    A novel polynomial chaos proxy-based history matching and uncertainty quantification method is presented that can be employed for complex geological structures in inverse problems. For complex geological structures, when there are many unknown geological parameters with highly nonlinear correlations, typically more than 10^6 full reservoir simulation runs might be required to accurately probe the posterior probability space given the production history of the reservoir. This is not practical for high-resolution geological models. One solution is to use a "proxy model" that replicates the simulation model for selected input parameters. The main advantage of the polynomial chaos proxy compared to other proxy models and response surfaces is that it is generally applicable and converges systematically as the order of the expansion increases. The Cameron and Martin theorem 2.24 states that the convergence rate of the standard polynomial chaos expansions is exponential for Gaussian random variables. To improve the convergence rate for non-Gaussian random variables, the generalized polynomial chaos is implemented, which uses the Askey scheme to choose the optimal basis for polynomial chaos expansions [199]. Additionally, for the non-Gaussian distributions that can be effectively approximated by a mixture of Gaussian distributions, we use the mixture-modeling based clustering approach, where under each cluster the polynomial chaos proxy converges exponentially fast and the overall posterior distribution can be estimated more efficiently using different polynomial chaos proxies. The main disadvantage of the polynomial chaos proxy is that for high-dimensional problems, the number of polynomial chaos terms increases drastically as the order of the polynomial chaos expansions increases. Although different non-intrusive methods have been developed in the literature to address this issue, a large number of simulation runs is still required to compute high-order terms of the polynomial chaos expansions. This work resolves this issue by proposing the reduced-terms polynomial chaos expansion, which preserves only the relevant terms in the polynomial chaos representation. We demonstrated that the sparsity pattern in the polynomial chaos expansion, when used with the Karhunen-Loève decomposition method or kernel PCA, can be systematically captured. A probabilistic framework based on the polynomial chaos proxy is also suggested in the context of the Bayesian model selection to study the plausibility of different geological interpretations of the sedimentary environments. The proposed surrogate-accelerated Bayesian inverse analysis can be coherently used in practical reservoir optimization workflows and uncertainty assessments
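To make the proxy idea concrete, here is a minimal one-dimensional sketch of a non-intrusive polynomial chaos expansion for a single standard Gaussian input, using probabilists' Hermite polynomials. The stand-in model (`math.exp`), the trapezoidal quadrature, and the names `hermite_values`, `pce_coefficients`, and `pce_eval` are assumptions for illustration only; the work described above targets high-dimensional reservoir models and a reduced-terms expansion that this toy does not implement.

```python
import math

def hermite_values(x, order):
    """Probabilists' Hermite polynomials He_0(x), ..., He_order(x)."""
    vals = [1.0, x]
    for k in range(1, order):
        # Three-term recurrence: He_{k+1} = x * He_k - k * He_{k-1}
        vals.append(x * vals[k] - k * vals[k - 1])
    return vals[: order + 1]

def pce_coefficients(model, order, lo=-10.0, hi=10.0, n=4000):
    """Project model(xi), xi ~ N(0, 1), onto He_0..He_order.

    c_k = E[model(xi) * He_k(xi)] / k!, with the expectation approximated
    by the trapezoidal rule on [lo, hi] (an assumption that suits smooth,
    cheap models; expensive simulators would use far fewer nodes).
    """
    h = (hi - lo) / n
    sums = [0.0] * (order + 1)
    for j in range(n + 1):
        x = lo + j * h
        weight = h * (0.5 if j in (0, n) else 1.0)
        density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        fx = model(x)
        for k, he in enumerate(hermite_values(x, order)):
            sums[k] += weight * fx * he * density
    return [s / math.factorial(k) for k, s in enumerate(sums)]

def pce_eval(coefs, x):
    """Evaluate the polynomial chaos proxy at input x."""
    basis = hermite_values(x, len(coefs) - 1)
    return sum(c * he for c, he in zip(coefs, basis))
```

For the stand-in model exp(xi), the exact coefficients are exp(1/2)/k!, so their rapid decay with k gives a quick check on the projection. Once the coefficients are computed, `pce_eval` replaces further calls to the model, which is the role the proxy plays inside a posterior sampler.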

    A machine learning route between band mapping and band structure

    The electronic band structure (BS) of solid-state materials imprints the multidimensional and multi-valued functional relations between energy and momenta of periodically confined electrons. Photoemission spectroscopy is a powerful tool for its comprehensive characterization. A common task in photoemission band mapping is to recover the underlying quasiparticle dispersion, which we call band structure reconstruction. Traditional methods often focus on specific regions of interest yet require extensive human oversight. To cope with the growing size and scale of photoemission data, we develop a generic machine-learning approach leveraging the information within electronic structure calculations for this task. We demonstrate its capability by reconstructing all fourteen valence bands of tungsten diselenide and validate the accuracy on various synthetic data. The reconstruction uncovers previously inaccessible momentum-space structural information on both global and local scales in conjunction with theory, while realizing a path towards integrating band mapping data into materials science databases