Incorporating statistical model error into the calculation of acceptability prices of contingent claims
The determination of acceptability prices of contingent claims requires the
choice of a stochastic model for the underlying asset price dynamics. Given
this model, optimal bid and ask prices can be found by stochastic optimization.
However, the model for the underlying asset price process is typically based on
data and found by a statistical estimation procedure. We define a confidence
set of possible estimated models by a nonparametric neighborhood of a baseline
model. This neighborhood serves as ambiguity set for a multi-stage stochastic
optimization problem under model uncertainty. We obtain distributionally robust
solutions of the acceptability pricing problem and derive the dual problem
formulation. Moreover, we prove a general large deviations result for the
nested distance, which allows us to relate the bid and ask prices under model
ambiguity to the quality of the observed data. Comment: 27 pages, 2 figures
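As a toy numerical illustration of the distributionally robust idea (not the paper's multi-stage algorithm), the ambiguity set can be replaced by a finite collection of candidate scenario distributions around a baseline model; the robust bid and ask then bracket the expectations over that set. All names and numbers below are hypothetical:

```python
import numpy as np

def robust_bid_ask(payoffs, candidate_models):
    """Toy sketch: distributionally robust bid/ask bounds for a claim.

    payoffs: payoff of the claim in each terminal scenario.
    candidate_models: scenario probability vectors, one per model in the
        ambiguity set (a finite stand-in for the nonparametric
        confidence neighborhood described in the abstract).
    """
    expectations = [np.dot(w, payoffs) for w in candidate_models]
    # Robust bid: the worst-case (lowest) value over the ambiguity set;
    # robust ask: the worst-case (highest) value the seller must cover.
    return min(expectations), max(expectations)

payoffs = np.array([0.0, 5.0, 15.0])            # call-like payoff
models = [np.array([0.5, 0.3, 0.2]),            # baseline model
          np.array([0.6, 0.3, 0.1]),            # perturbed models in
          np.array([0.4, 0.3, 0.3])]            # the ambiguity set
bid, ask = robust_bid_ask(payoffs, models)
```

A larger ambiguity set can only widen the bid-ask interval, which is the qualitative effect of model uncertainty the paper quantifies.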
A Recursive Algorithm for Mixture of Densities Estimation
In the framework of the so-called extended linear sigma model (eLSM), we include a pseudoscalar glueball with a mass of 2.6 GeV (as predicted by Lattice-QCD simulations) and we compute the two- and three-body decays into scalar and pseudoscalar mesons. This study is relevant for the future PANDA experiment at the FAIR facility. As a second step, we extend the eLSM by including the charm quark according to the global U(4)R × U(4)L chiral symmetry. We compute the masses, weak decay constants and strong decay widths of open charmed mesons. The precise description of the decays of open charmed states is important for the CBM experiment at FAIR.
Estimation and Regularization Techniques for Regression Models with Multidimensional Prediction Functions
Boosting is one of the most important methods for fitting
regression models and building prediction rules from
high-dimensional data. A notable feature of boosting is that the
technique has a built-in mechanism for shrinking coefficient
estimates and variable selection. This regularization mechanism
makes boosting a suitable method for analyzing data characterized by
small sample sizes and large numbers of predictors. We extend the
existing methodology by developing a boosting method for prediction
functions with multiple components. Such multidimensional functions
occur in many types of statistical models, for example in count data
models and in models involving outcome variables with a mixture
distribution. As will be demonstrated, the new algorithm is suitable
for both the estimation of the prediction function and
regularization of the estimates. In addition, nuisance parameters
can be estimated simultaneously with the prediction function.
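The core component-wise boosting mechanism can be sketched as follows. This is a hypothetical one-dimensional L2 version, not the paper's multi-component algorithm: at each step a simple least-squares base learner is fitted to the residuals for every predictor separately, and only the best-fitting component receives a small (shrunken) update, which is what produces variable selection:

```python
import numpy as np

def componentwise_l2_boost(X, y, n_steps=100, nu=0.1):
    """Minimal sketch of component-wise L2 gradient boosting.

    At each iteration, fit a least-squares slope to the current
    residuals for each column of X, update only the column whose fit
    reduces the loss the most, and shrink the update by nu. Columns
    that are never selected keep a zero coefficient.
    """
    n, p = X.shape
    coef = np.zeros(p)
    fit = np.zeros(n)
    for _ in range(n_steps):
        resid = y - fit                          # negative gradient of L2 loss
        slopes = X.T @ resid / (X ** 2).sum(axis=0)
        losses = ((resid[:, None] - X * slopes) ** 2).sum(axis=0)
        j = np.argmin(losses)                    # best-fitting component
        coef[j] += nu * slopes[j]                # shrunken coefficient update
        fit += nu * slopes[j] * X[:, j]
    return coef
```

On noiseless data where the outcome depends on a single predictor, the algorithm selects that column at every step, so all other coefficients remain exactly zero.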
Geoadditive Regression Modeling of Stream Biological Condition
Indices of biotic integrity (IBI) have become an established tool to quantify the condition of small non-tidal streams and their watersheds. To investigate the effects of watershed characteristics on stream biological condition, we present a new technique for regressing IBIs on watershed-specific explanatory variables. Since IBIs are typically evaluated on an ordinal scale, our method is based on the proportional odds model for ordinal outcomes. To avoid overfitting, we do not use classical maximum likelihood estimation but a component-wise functional gradient boosting approach. Because component-wise gradient boosting has an intrinsic mechanism for variable selection and model choice, determinants of biotic integrity can be identified. In addition, the method offers a relatively simple way to account for spatial correlation in ecological data. An analysis of the Maryland Biological Streams Survey shows that nonlinear effects of predictor variables on stream condition can be quantified while, in addition, accurate predictions of biological condition at unsurveyed locations are obtained.
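The proportional odds model underlying this approach maps a single predictor value to probabilities over the ordered categories via cumulative logits, P(Y <= k | x) = logistic(theta_k - eta(x)). A minimal sketch of that mapping (the threshold values below are illustrative, and eta would in the paper be an additive function of watershed covariates plus a spatial effect):

```python
import numpy as np

def prop_odds_probs(eta, thresholds):
    """Category probabilities under the proportional odds model.

    eta: the linear/additive predictor value for one observation.
    thresholds: increasing cutpoints theta_1 < ... < theta_{K-1}.
    Returns the K per-category probabilities P(Y = k | eta).
    """
    cum = 1.0 / (1.0 + np.exp(-(np.asarray(thresholds) - eta)))
    cum = np.concatenate([[0.0], cum, [1.0]])   # P(Y <= 0) = 0, P(Y <= K) = 1
    return np.diff(cum)                         # differences of cumulative probs

probs = prop_odds_probs(eta=0.5, thresholds=[-1.0, 0.0, 1.5])
```

Because all categories share one predictor eta, a covariate effect shifts the whole cumulative distribution, which is the "proportional odds" restriction.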
Boosting the concordance index for survival data - a unified framework to derive and evaluate biomarker combinations
The development of molecular signatures for the prediction of time-to-event
outcomes is a methodologically challenging task in bioinformatics and
biostatistics. Although there are numerous approaches for the derivation of
marker combinations and their evaluation, the underlying methodology often
suffers from the problem that different optimization criteria are mixed during
the feature selection, estimation and evaluation steps. This might result in
marker combinations that are only suboptimal regarding the evaluation criterion
of interest. To address this issue, we propose a unified framework to derive
and evaluate biomarker combinations. Our approach is based on the concordance
index for time-to-event data, which is a non-parametric measure to quantify the
discriminatory power of a prediction rule. Specifically, we propose a
component-wise boosting algorithm that results in linear biomarker combinations
that are optimal with respect to a smoothed version of the concordance index.
We investigate the performance of our algorithm in a large-scale simulation
study and in two molecular data sets for the prediction of survival in breast
cancer patients. Our numerical results show that the new approach is not only
methodologically sound but can also lead to a higher discriminatory power than
traditional approaches for the derivation of gene signatures. Comment: revised manuscript; added simulation study and additional results
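The criterion at the heart of this approach can be sketched directly. The concordance index is the fraction of usable pairs (an observed event time preceding another time) in which the marker ranks the earlier failure as higher risk; replacing the pairwise indicator by a sigmoid yields a smoothed, differentiable version of the kind the boosting algorithm can optimize. A hypothetical minimal implementation (parameter names are illustrative):

```python
import numpy as np

def smoothed_cindex(time, event, marker, sigma=0.1):
    """Sketch of a smoothed concordance index for survival data.

    A pair (i, j) is usable when subject i has an observed event
    (event[i] == 1) and time[i] < time[j]. The indicator
    1{marker[i] > marker[j]} is replaced by a sigmoid with bandwidth
    sigma, making the criterion differentiable in the marker values.
    """
    n = len(time)
    num = den = 0.0
    for i in range(n):
        if not event[i]:
            continue                       # censored: cannot anchor a pair
        for j in range(n):
            if time[i] < time[j]:          # usable (comparable) pair
                den += 1.0
                num += 1.0 / (1.0 + np.exp(-(marker[i] - marker[j]) / sigma))
    return num / den
```

As sigma shrinks toward zero the sigmoid approaches the indicator and the usual (non-smooth) C-index is recovered.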
Sparse image reconstruction for molecular imaging
The application that motivates this paper is molecular imaging at the atomic
level. When discretized at sub-atomic distances, the volume is inherently
sparse. Noiseless measurements from an imaging technology can be modeled by
convolution of the image with the system point spread function (psf). Such is
the case with magnetic resonance force microscopy (MRFM), an emerging
technology where imaging of an individual tobacco mosaic virus was recently
demonstrated with nanometer resolution. We also consider additive white
Gaussian noise (AWGN) in the measurements. Many prior works on sparse
estimators have focused on the case in which the system matrix H has low
coherence; in our application, however, H is the convolution matrix for the
system psf, and a typical convolution matrix has high coherence. The paper
therefore does not assume a low-coherence H. A discrete-continuous form of the Laplacian and
atom at zero (LAZE) p.d.f. used by Johnstone and Silverman is formulated, and
two sparse estimators are derived by maximizing the joint p.d.f. of the observation
and image conditioned on the hyperparameters. A thresholding rule that
generalizes the hard and soft thresholding rule appears in the course of the
derivation. This so-called hybrid thresholding rule, when used in the iterative
thresholding framework, gives rise to the hybrid estimator, a generalization of
the lasso. Unbiased estimates of the hyperparameters for the lasso and hybrid
estimator are obtained via Stein's unbiased risk estimate (SURE). A numerical
study with a Gaussian psf and two sparse images shows that the hybrid estimator
outperforms the lasso. Comment: 12 pages, 8 figures
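The two limiting cases the hybrid rule generalizes, and the iterative-thresholding framework it plugs into, can be sketched concretely. The exact hybrid rule and its SURE-based hyperparameter estimates are derived in the paper and are not reproduced here; the sketch below shows hard and soft thresholding plus ISTA (iterative soft thresholding), which yields the lasso. Function names are illustrative:

```python
import numpy as np

def soft_threshold(x, t):
    """Soft thresholding: shrink toward zero by t; zero inside [-t, t]."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def hard_threshold(x, t):
    """Hard thresholding: keep values with |x| > t unchanged, zero the rest."""
    return np.where(np.abs(x) > t, x, 0.0)

def ista(H, y, lam, n_iter=200):
    """Iterative soft thresholding for the lasso on y ~ H x + noise.

    The hybrid estimator of the paper arises from the same iteration
    with the hybrid rule in place of soft_threshold.
    """
    L = np.linalg.norm(H, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(H.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x + H.T @ (y - H @ x) / L, lam / L)
    return x
```

With H equal to the identity, ISTA reduces to a single soft-thresholding step, which makes the iteration easy to sanity-check.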
Constructing irregular histograms by penalized likelihood
We propose a fully automatic procedure for the construction of irregular histograms. For a given number of bins, the maximum likelihood histogram is known to be the result of a dynamic programming algorithm. To choose the number of bins, we propose two different penalties motivated by recent work in model selection by Castellan [6] and Massart [26]. We give a complete description of the algorithm and a proper tuning of the penalties. Finally, we compare our procedure to other existing proposals for a wide range of different densities and sample sizes.
Keywords: irregular histogram, density estimation, penalized likelihood, dynamic programming
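The two-stage structure (dynamic programming for the maximum-likelihood partition at each bin count, then a penalty to choose the bin count) can be sketched as follows. This is a hypothetical illustration: candidate edges are placed at midpoints between sorted data points, and the default penalty is a simple BIC-like stand-in, not the tuned Castellan/Massart-type penalties of the paper:

```python
import numpy as np

def best_histogram(x, max_bins=8, pen=None):
    """Sketch: penalized-likelihood irregular histogram via dynamic programming.

    For each number of bins D, find the partition of the candidate
    edges maximizing the histogram log-likelihood by DP, then pick the
    D maximizing log-likelihood minus pen(D). Returns the chosen edges.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if pen is None:
        pen = lambda D: D * np.log(n)   # BIC-like stand-in penalty
    edges = np.unique(np.concatenate([[x[0]], (x[:-1] + x[1:]) / 2, [x[-1]]]))
    m = edges.size
    counts = np.searchsorted(x, edges, side="right")
    counts[0] = 0                       # all n points lie in [x_min, x_max]

    def cell(i, j):                     # log-likelihood of one bin (edges i..j)
        N = counts[j] - counts[i]
        return 0.0 if N == 0 else N * np.log(N / (n * (edges[j] - edges[i])))

    best_score, best_edges = -np.inf, None
    prev = np.array([cell(0, j) for j in range(m)])   # optimal with D = 1
    back = [np.zeros(m, dtype=int)]
    for D in range(1, max_bins + 1):
        if D > 1:                       # extend (D-1)-bin solutions by one bin
            cur = np.full(m, -np.inf)
            arg = np.zeros(m, dtype=int)
            for j in range(D, m):
                vals = [prev[i] + cell(i, j) for i in range(D - 1, j)]
                k = int(np.argmax(vals))
                arg[j], cur[j] = D - 1 + k, vals[k]
            prev, back = cur, back + [arg]
        score = prev[m - 1] - pen(D)
        if score > best_score:          # backtrack the optimal cut points
            best_score = score
            idx, j = [m - 1], m - 1
            for d in range(D - 1, 0, -1):
                j = int(back[d][j])
                idx.append(j)
            best_edges = edges[[0] + idx[::-1]]
    return best_edges
```

On uniform data a single bin survives the penalty, while well-separated clusters force additional cut points.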