128 research outputs found

    Functional limit laws for the increments of the quantile process; with applications

    We establish a functional limit law of the logarithm for the increments of the normed quantile process based upon a random sample of size $n\to\infty$. We extend a limit law obtained by Deheuvels and Mason (12), showing that their results hold uniformly over the bandwidth $h$, restricted to vary in $[h'_n,h''_n]$, where $\{h'_n\}_{n\geq 1}$ and $\{h''_n\}_{n\geq 1}$ are appropriate non-random sequences. We treat the case where the sample observations follow possibly non-uniform distributions. As a consequence of our theorems, we provide uniform limit laws for nearest-neighbor density estimators, in the spirit of those given by Deheuvels and Mason (13) for kernel-type estimators. Comment: Published in at http://dx.doi.org/10.1214/07-EJS099 the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Regression modeling on stratified data with the lasso

    We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a reference stratum. We show that the performance of a penalized version of this approach depends on this arbitrary choice. We propose a refined approach that bypasses this arbitrary choice, at almost no additional computational cost. Regarding model selection consistency, our proposal mimics the strategy based on an optimal and covariate-specific choice for the reference stratum. Results from an empirical study confirm that our proposal generally outperforms the basic approach in the identification and description of the interactions. An illustration on gene expression data is provided. Comment: 23 pages, 5 figures

    On some limitations of probabilistic models for dimension-reduction: illustration in the case of one particular probabilistic formulation of PLS

    Partial Least Squares (PLS) refers to a class of dimension-reduction techniques aiming at the identification of two sets of components with maximal covariance, in order to model the relationship between two sets of observed variables $x\in\mathbb{R}^p$ and $y\in\mathbb{R}^q$, with $p\geq 1$, $q\geq 1$. El Bouhaddani et al. (2017) have recently proposed a probabilistic formulation of PLS. Under the constraints they consider for the parameters of their model, the latter can be seen as a probabilistic formulation of one version of PLS, namely PLS-SVD. However, we establish that these constraints are too restrictive, as they define a very particular subset of distributions for $(x,y)$ under which, roughly speaking, components with maximal covariance (solutions of PLS-SVD) are also necessarily of respective maximal variances (solutions of the principal component analyses of $x$ and $y$, respectively). Then, we propose a simple extension of el Bouhaddani et al.'s model, which corresponds to a more general probabilistic formulation of PLS-SVD, and which is no longer restricted to these particular distributions. We present numerical examples to illustrate the limitations of the original model of el Bouhaddani et al. (2017).
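As a rough illustration of the distinction drawn in this abstract (and not the authors' code), the sketch below computes the first PLS-SVD pair as the leading singular vectors of the sample cross-covariance matrix of $x$ and $y$, and compares the $x$-direction with the first principal direction of $x$. The simulated data, the helper names `pls_svd_first_pair` and `first_principal_direction`, and all numeric settings are assumptions made for this example only.

```python
import numpy as np


def pls_svd_first_pair(X, Y):
    """First PLS-SVD weight vectors (u, v) and their covariance s0.

    u and v are the leading left/right singular vectors of the
    sample cross-covariance matrix between the columns of X and Y.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (X.shape[0] - 1)  # p x q cross-covariance
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return U[:, 0], Vt[0], s[0]


def first_principal_direction(X):
    """Direction of maximal variance of X (first PCA axis)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    _, vecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return vecs[:, -1]


# Hypothetical data: x1 has large variance but is unrelated to y,
# x2 has small variance but drives both coordinates of y.
rng = np.random.default_rng(0)
n = 5000
x1 = 5.0 * rng.standard_normal(n)
x2 = rng.standard_normal(n)
X = np.column_stack([x1, x2])
Y = np.column_stack([
    x2 + 0.5 * rng.standard_normal(n),
    -x2 + 0.5 * rng.standard_normal(n),
])

u, v, s0 = pls_svd_first_pair(X, Y)
pc = first_principal_direction(X)

print("PLS-SVD x-direction:", np.round(u, 3))
print("First PCA direction of x:", np.round(pc, 3))
```

In this simulated setting the maximal-covariance direction of $x$ loads mainly on the second coordinate, while the maximal-variance direction loads mainly on the first, so the two notions do not coincide here.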