Statistical inference in mechanistic models: time warping for improved gradient matching
Inference in mechanistic models of non-linear differential equations is a challenging problem in current computational statistics. Due to the high computational costs of numerically solving the differential equations in every step of an iterative parameter adaptation scheme, approximate methods based on gradient matching have become popular. However, these methods critically depend on the smoothing scheme used for function interpolation. The present article adapts an idea from manifold learning and demonstrates that a time warping approach aiming to homogenize intrinsic length scales can lead to a significant improvement in parameter estimation accuracy. We demonstrate the effectiveness of this scheme on noisy data from two dynamical systems with periodic limit cycles, a biopathway, and an application from soft-tissue mechanics. Our study also provides a comparative evaluation across a wide range of signal-to-noise ratios.
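For readers unfamiliar with the basic device, the sketch below illustrates plain gradient matching on a Lotka-Volterra system: smooth the noisy trajectories once, differentiate the smoother, and fit the ODE parameters to the smoothed slopes without further numerical integration. It omits the paper's time-warping refinement, and the choice of smoother, the example system, and all names are illustrative, not the authors'.

    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.interpolate import UnivariateSpline
    from scipy.optimize import minimize

    # Noisy observations from a Lotka-Volterra system:
    # dx/dt = a*x - b*x*y,  dy/dt = c*x*y - d*y
    def lv_rhs(t, state, a, b, c, d):
        x, y = state
        return [a * x - b * x * y, c * x * y - d * y]

    true_theta = (1.5, 1.0, 1.0, 3.0)
    t_obs = np.linspace(0.0, 10.0, 100)
    sol = solve_ivp(lv_rhs, (0.0, 10.0), [5.0, 1.0], t_eval=t_obs, args=true_theta)
    rng = np.random.default_rng(0)
    obs = sol.y + rng.normal(scale=0.1, size=sol.y.shape)

    # Step 1: smooth each state once and differentiate the smoother; the
    # ODE solver is never called again during parameter estimation.
    splines = [UnivariateSpline(t_obs, obs[k], s=1.0) for k in range(2)]
    x_hat = np.array([sp(t_obs) for sp in splines])
    dx_hat = np.array([sp.derivative()(t_obs) for sp in splines])

    # Step 2: choose parameters so the ODE right-hand side, evaluated at
    # the smoothed states, matches the smoothed slopes.
    def gradient_matching_loss(theta):
        rhs = np.array([lv_rhs(None, x_hat[:, i], *theta)
                        for i in range(len(t_obs))]).T
        return np.sum((dx_hat - rhs) ** 2)

    fit = minimize(gradient_matching_loss, x0=np.ones(4), method="Nelder-Mead")
    print(fit.x)  # rough estimates of (a, b, c, d)

The abstract's point is that the quality of Step 1, the smoothing scheme, drives the quality of Step 2, and that is exactly what the proposed time warping improves.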
Penalized Likelihood and Bayesian Function Selection in Regression Models
Challenging research in various fields has driven a wide range of
methodological advances in variable selection for regression models with
high-dimensional predictors. In comparison, selection of nonlinear functions in
models with additive predictors has been considered only more recently. Several
competing suggestions have been developed at about the same time and often do
not refer to each other. This article provides a state-of-the-art review on
function selection, focusing on penalized likelihood and Bayesian concepts,
relating various approaches to each other in a unified framework. In an
empirical comparison, also including boosting, we evaluate several methods
through applications to simulated and real data, thereby providing some
guidance on their performance in practice.
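To make the penalized-likelihood side concrete, here is a minimal sketch of one idea of the kind such reviews cover (an illustration constructed here, not any specific method from the article): a group-lasso-type penalty on each candidate function's block of basis coefficients, so that a whole function drops out of the additive model when its block is shrunk exactly to zero.

    import numpy as np

    def fit_additive_group_lasso(basis_blocks, y, lam, n_iter=2000):
        """Proximal gradient for least squares with one group penalty per function.

        basis_blocks: list of (n, d_j) design matrices, one per candidate function.
        """
        X = np.hstack(basis_blocks)
        edges = np.cumsum([0] + [B.shape[1] for B in basis_blocks])
        step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of gradient
        beta = np.zeros(X.shape[1])
        for _ in range(n_iter):
            z = beta - step * (X.T @ (X @ beta - y))
            for j in range(len(basis_blocks)):      # block soft-thresholding
                blk = slice(edges[j], edges[j + 1])
                norm = np.linalg.norm(z[blk])
                z[blk] *= 0.0 if norm == 0 else max(0.0, 1.0 - step * lam / norm)
            beta = z
        return [beta[edges[j]:edges[j + 1]] for j in range(len(basis_blocks))]

    # Toy use: two candidate smooth terms, only the first truly active.
    rng = np.random.default_rng(1)
    x1, x2 = rng.uniform(size=(2, 200))
    B1, B2 = np.vander(x1, 5), np.vander(x2, 5)  # crude polynomial bases
    y = np.sin(2 * np.pi * x1) + rng.normal(scale=0.3, size=200)
    blocks = fit_additive_group_lasso([B1, B2], y, lam=20.0)
    print([round(np.linalg.norm(b), 3) for b in blocks])  # second norm should be near zero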
Augmented balancing weights as linear regression
We provide a novel characterization of augmented balancing weights, also
known as Automatic Debiased Machine Learning (AutoDML). These estimators
combine outcome modeling with balancing weights, which estimate inverse
propensity score weights directly. When the outcome and weighting models are
both linear in some (possibly infinite) basis, we show that the augmented
estimator is equivalent to a single linear model with coefficients that combine
the original outcome model coefficients and OLS; in many settings, the
augmented estimator collapses to OLS alone. We then extend these results to
specific choices of outcome and weighting models. We first show that the
combined estimator that uses (kernel) ridge regression for both outcome and
weighting models is equivalent to a single, undersmoothed (kernel) ridge
regression; this also holds when considering asymptotic rates. When the
weighting model is instead lasso regression, we give closed-form expressions
for special cases and demonstrate a "double selection" property. Finally, we
generalize these results to linear estimands via the Riesz representer. Our
framework "opens the black box" on these increasingly popular estimators and
provides important insights into estimation choices for augmented balancing
weights.
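Schematically, and in notation chosen here rather than taken from the paper, an augmented balancing-weights estimator of the mean of the regression function (the simplest linear functional) pairs a plug-in outcome model with directly estimated weights that correct its residuals:

    \[
      \hat{\psi}_{\mathrm{aug}}
        = \frac{1}{n}\sum_{i=1}^{n} \hat{m}(x_i)
        + \frac{1}{n}\sum_{i=1}^{n} \hat{w}_i \,\bigl(y_i - \hat{m}(x_i)\bigr).
    \]

The abstract's result is that when both the outcome model \(\hat{m}\) and the weights \(\hat{w}\) are linear in a common (possibly infinite) basis, this two-part construction collapses algebraically to a single linear fit, for instance a single undersmoothed kernel ridge regression.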
A Partially Linear Framework for Massive Heterogeneous Data
We consider a partially linear framework for modelling massive heterogeneous
data. The major goal is to extract common features across all sub-populations
while exploring heterogeneity of each sub-population. In particular, we propose
an aggregation type estimator for the commonality parameter that possesses the
(non-asymptotic) minimax optimal bound and asymptotic distribution as if there
were no heterogeneity. This oracular result holds when the number of
sub-populations does not grow too fast. A plug-in estimator for the
heterogeneity parameter is further constructed, and shown to possess the
asymptotic distribution as if the commonality information were available. We
also test for heterogeneity among a large number of sub-populations. All of the
above results require each sub-estimation to be regularized as though it used
the entire sample size. Our general theory applies to the divide-and-conquer
approach that is often used to deal with massive homogeneous data. A technical
by-product of this paper is statistical inference for general kernel ridge
regression. Thorough numerical results are also provided to back up our theory.
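A minimal sketch of the divide-and-conquer device referenced above, stripped down to a homogeneous kernel ridge regression (a simplification made here; the paper's partially linear, heterogeneous setting adds a common parametric component on top). Note the penalty is set as if each machine saw the full sample, mirroring the regularization requirement in the abstract:

    import numpy as np
    from sklearn.kernel_ridge import KernelRidge

    rng = np.random.default_rng(2)
    n, n_splits = 2000, 10
    x = rng.uniform(size=(n, 1))
    y = np.sin(4 * np.pi * x[:, 0]) + rng.normal(scale=0.5, size=n)

    # Deliberately under-regularize each sub-fit: the penalty level is
    # chosen for the full sample size n, not the sub-sample size n/10.
    alpha_full_sample = 1e-3  # illustrative value, not a tuned choice
    grid = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
    sub_predictions = []
    for chunk in np.array_split(rng.permutation(n), n_splits):
        model = KernelRidge(alpha=alpha_full_sample, kernel="rbf", gamma=50.0)
        model.fit(x[chunk], y[chunk])
        sub_predictions.append(model.predict(grid))

    f_hat = np.mean(sub_predictions, axis=0)  # aggregated estimate
    print(f_hat[:5])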
Non-asymptotic Optimal Prediction Error for RKHS-based Partially Functional Linear Models
Under the framework of reproducing kernel Hilbert space (RKHS), we consider
the penalized least-squares estimation of partially functional linear models (PFLM),
whose predictor contains both a functional part and a traditional multivariate part,
where the multivariate part allows a divergent number of parameters. From the
non-asymptotic point of view, we focus on the rate-optimal upper and lower
bounds of the prediction error. An exact upper bound for the excess prediction
risk is shown in a non-asymptotic form under a more general assumption on the
effective dimension of the model, from which we also derive prediction
consistency when the number of multivariate covariates increases slowly with
the sample size. Our new finding implies a trade-off between the number of
non-functional predictors and the effective dimension of the kernel principal
components needed to ensure prediction consistency in the
increasing-dimensional setting. The analysis in our proof hinges on the
spectral condition of the sandwich operator of the covariance operator and the
reproducing kernel, and on the concentration inequalities for the random
elements in Hilbert space. Finally, we derive the non-asymptotic minimax lower
bound under the regularity assumption of Kullback-Leibler divergence of the
models.
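In notation chosen here (scalar response y_i, functional covariate x_i(·), multivariate covariates z_i, and slope function f in the RKHS of kernel K), the penalized least-squares criterion for a PFLM reads

    \[
      \min_{\alpha \in \mathbb{R}^{p},\; f \in \mathcal{H}_K}\;
      \frac{1}{n}\sum_{i=1}^{n}
        \Bigl(y_i - z_i^{\top}\alpha - \int_{0}^{1} x_i(t)\,f(t)\,dt\Bigr)^{2}
      + \lambda \,\|f\|_{\mathcal{H}_K}^{2},
    \]

and the abstract's bounds describe how the excess prediction risk of the minimizer scales when the multivariate dimension p is allowed to grow with the sample size n.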
Explicit connections between longitudinal data analysis and kernel machines
Two areas of research – longitudinal data analysis and kernel machines – have large, but mostly distinct, literatures. This article shows explicitly that both fields have much in common with each other. In particular, many popular longitudinal data fitting procedures are special types of kernel machines. These connections have the potential to provide fruitful cross-fertilization between longitudinal data analytic and kernel machine methodology.
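The prototypical instance of this connection (standard, though the notation here is chosen for illustration): the linear mixed model \(y = X\beta + Zu + \varepsilon\) with \(u \sim N(0, \sigma_u^2 I)\) and \(\varepsilon \sim N(0, \sigma^2 I)\), a workhorse of longitudinal data analysis, yields its BLUP as the minimizer of

    \[
      \min_{\beta,\,u}\; \|y - X\beta - Zu\|^{2} + \lambda\,\|u\|^{2},
      \qquad \lambda = \sigma^{2}/\sigma_u^{2},
    \]

which is ridge regression in the random-effects basis, i.e. a kernel machine with kernel \(K = ZZ^{\top}\).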