129 research outputs found
Functional limit laws for the increments of the quantile process; with applications
We establish a functional limit law of the logarithm for the increments of
the normed quantile process based upon a random sample of size $n$. We
extend a limit law obtained by Deheuvels and Mason (12), showing that their
results hold uniformly over the bandwidth $h$, restricted to vary in
$[h'_n, h''_n]$, where $\{h'_n\}$ and $\{h''_n\}$ are
appropriate non-random sequences. We treat the case where the sample
observations follow possibly non-uniform distributions. As a consequence of our
theorems, we provide uniform limit laws for nearest-neighbor density
estimators, in the spirit of those given by Deheuvels and Mason (13) for
kernel-type estimators.
Comment: Published in the Electronic Journal of Statistics
(http://www.i-journals.org/ejs/) at http://dx.doi.org/10.1214/07-EJS099 by the
Institute of Mathematical Statistics (http://www.imstat.org)
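For readers unfamiliar with the nearest-neighbor density estimators mentioned in this abstract, the following is a minimal one-dimensional sketch. It assumes the textbook form k / (2 n R_k(x)), where R_k(x) is the distance from x to its k-th nearest sample point; the function name `knn_density` is hypothetical, and the exact estimator studied in the paper may differ.

```python
def knn_density(x, sample, k):
    """1-D k-nearest-neighbor density estimate at x (assumed form: k / (2 n R_k))."""
    n = len(sample)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # R_k: distance from x to its k-th nearest sample point.
    r_k = sorted(abs(x - s) for s in sample)[k - 1]
    if r_k == 0.0:
        raise ValueError("k-th neighbor distance is zero; estimate is undefined")
    # k points fall in a window of width 2 * R_k, so the local density is
    # approximately (k / n) / (2 * R_k).
    return k / (2.0 * n * r_k)
```

Here the window half-width R_k(x) plays the role of a data-driven bandwidth, which is why results that hold uniformly over a range of bandwidths are relevant to this kind of estimator.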
Regression modeling on stratified data with the lasso
We consider the estimation of regression models on strata defined using a
categorical covariate, in order to identify interactions between this
categorical covariate and the other predictors. A basic approach requires the
choice of a reference stratum. We show that the performance of a penalized
version of this approach depends on this arbitrary choice. We propose a refined
approach that bypasses this arbitrary choice, at almost no additional
computational cost. Regarding model selection consistency, our proposal mimics
the strategy based on an optimal and covariate-specific choice for the
reference stratum. Results from an empirical study confirm that our proposal
generally outperforms the basic approach in the identification and description
of the interactions. An illustration on gene expression data is provided.
Comment: 23 pages, 5 figures
On some limitations of probabilistic models for dimension-reduction: illustration in the case of one particular probabilistic formulation of PLS
Partial Least Squares (PLS) refers to a class of dimension-reduction
techniques aiming at the identification of two sets of components with maximal
covariance, in order to model the relationship between two sets of observed
variables $x \in \mathbb{R}^p$ and $y \in \mathbb{R}^q$, with $p \geq 1$ and $q \geq 1$.
El Bouhaddani et al. (2017) have recently proposed a probabilistic formulation
of PLS. Under the constraints they consider for the parameters of their model,
the latter can be seen as a probabilistic formulation of one version of PLS,
namely the PLS-SVD. However, we establish that these constraints are too
restrictive as they define a very particular subset of distributions for
$(x, y)$ under which, roughly speaking, components with maximal covariance
(solutions of PLS-SVD) are also necessarily of respective maximal variances
(solutions of the principal component analyses of $x$ and $y$, respectively).
Then, we propose a simple extension of el Bouhaddani et al.'s model, which
corresponds to a more general probabilistic formulation of PLS-SVD, and which
is no longer restricted to these particular distributions. We present numerical
examples to illustrate the limitations of the original model of el Bouhaddani
et al. (2017).
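To make "components with maximal covariance" concrete, here is a minimal sketch of the first PLS-SVD pair, computed as the leading singular vectors of the sample cross-covariance matrix. It uses plain power iteration from the standard library only; the function names `cross_covariance` and `first_pls_svd_pair` are hypothetical, and this illustrates the classical (non-probabilistic) PLS-SVD, not the probabilistic model discussed in the abstract.

```python
import math

def cross_covariance(X, Y):
    """Sample cross-covariance (p x q) of data given as lists of rows."""
    n, p, q = len(X), len(X[0]), len(Y[0])
    mx = [sum(row[a] for row in X) / n for a in range(p)]
    my = [sum(row[b] for row in Y) / n for b in range(q)]
    return [[sum((X[i][a] - mx[a]) * (Y[i][b] - my[b]) for i in range(n)) / (n - 1)
             for b in range(q)] for a in range(p)]

def first_pls_svd_pair(X, Y, iters=500):
    """Leading unit weight vectors (u, v) maximizing cov(Xu, Yv), plus that covariance."""
    C = cross_covariance(X, Y)
    p, q = len(C), len(C[0])
    # Assumption: the starting vector is not orthogonal to the leading
    # right singular vector; otherwise power iteration will not find it.
    v = [1.0 / math.sqrt(q)] * q
    for _ in range(iters):
        u = [sum(C[a][b] * v[b] for b in range(q)) for a in range(p)]  # C v
        w = [sum(C[a][b] * u[a] for a in range(p)) for b in range(q)]  # C^T (C v)
        norm = math.sqrt(sum(t * t for t in w))
        if norm == 0.0:
            break
        v = [t / norm for t in w]
    u = [sum(C[a][b] * v[b] for b in range(q)) for a in range(p)]
    sigma = math.sqrt(sum(t * t for t in u))  # leading singular value
    if sigma == 0.0:
        raise ValueError("cross-covariance is zero; no PLS-SVD direction exists")
    return [t / sigma for t in u], v, sigma
```

By contrast, the principal components of $x$ alone would come from the covariance matrix of X rather than from the cross-covariance with Y, so in general the two sets of directions need not coincide.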