77 research outputs found

    Fast and Exact Leave-One-Out Analysis of Large-Margin Classifiers

    Motivated by the Golub-Heath-Wahba formula for ridge regression, we first present a new leave-one-out lemma for the kernel support vector machine (SVM) and related large-margin classifiers. We then use the lemma to design a novel and efficient algorithm, named “magicsvm”, for training the kernel SVM and related large-margin classifiers and computing the exact leave-one-out cross-validation error. With “magicsvm”, the computational cost of the leave-one-out analysis is of the same order as fitting a single SVM on the training data. Extensive simulations and benchmark examples show that “magicsvm” is much faster than state-of-the-art SVM solvers. The same idea is also used to boost the computation speed of V-fold cross-validation for kernel classifiers. Supplementary materials, including the technical proofs and links to the datasets used in this paper, are available online.
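    The Golub-Heath-Wahba identity that motivates the leave-one-out lemma can be illustrated on kernel ridge regression, where it holds exactly: the leave-one-out residual equals the in-sample residual divided by one minus the corresponding diagonal entry of the smoother matrix. This is a sketch of the motivating formula only, not of the magicsvm algorithm itself.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of X and Z
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_ridge_loo(X, y, lam=0.1, gamma=1.0):
    """Exact leave-one-out residuals for kernel ridge regression via the
    Golub-Heath-Wahba identity, at the cost of a single fit."""
    n = len(y)
    K = rbf_kernel(X, X, gamma)
    H = K @ np.linalg.inv(K + lam * np.eye(n))  # smoother ("hat") matrix
    resid = y - H @ y                           # in-sample residuals
    return resid / (1.0 - np.diag(H))           # exact LOO residuals

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)
loo_fast = kernel_ridge_loo(X, y)
```

One pass over the single-fit residuals replaces n refits, which is the same cost structure the abstract claims for SVM leave-one-out analysis.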

    High-Dimensional Censored Regression via the Penalized Tobit Likelihood

    High-dimensional regression and regression with a left-censored response are each well-studied topics. Despite this, few methods have been proposed that deal with both complications simultaneously. The Tobit model, long the standard method for censored regression in economics, has not been adapted for high-dimensional regression at all. To fill this gap and bring up-to-date techniques from high-dimensional statistics to left-censored regression, we propose several penalized Tobit models. We develop a fast algorithm that combines quadratic majorization with coordinate descent to compute the penalized Tobit solution path. Theoretically, we analyze the Tobit lasso and the Tobit model with a folded concave penalty, bounding the ℓ2 estimation loss for the former and proving that a local linear approximation estimator for the latter possesses the strong oracle property. Through an extensive simulation study, we find that our penalized Tobit models provide more accurate predictions and parameter estimates than other methods on high-dimensional left-censored data. We use a penalized Tobit model to analyze high-dimensional left-censored HIV viral load data from the AIDS Clinical Trials Group and identify potential drug resistance mutations in the HIV genome. A supplementary file contains intermediate theoretical results and technical proofs.
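    The objective that such a majorization/coordinate-descent solver minimizes can be written down directly. Below is a sketch of the penalized left-censored Tobit negative log-likelihood (censored observations contribute a Gaussian tail probability, uncensored ones a Gaussian density, plus a lasso penalty); the paper's solver itself is not reproduced here.

```python
import numpy as np
from statistics import NormalDist

def tobit_negloglik(beta, sigma, X, y, c=0.0, lam=0.0):
    """Negative log-likelihood of the left-censored Tobit model with
    censoring point c, plus an optional lasso penalty lam * ||beta||_1."""
    nd = NormalDist()
    xb = X @ beta
    cens = y <= c                                # left-censored observations
    z = (y[~cens] - xb[~cens]) / sigma
    # uncensored part: Gaussian log-density of the residuals
    ll_unc = -0.5 * z ** 2 - 0.5 * np.log(2 * np.pi) - np.log(sigma)
    # censored part: log-probability that the latent response falls below c
    ll_cen = [np.log(nd.cdf(v)) for v in (c - xb[cens]) / sigma]
    return -(ll_unc.sum() + sum(ll_cen)) + lam * np.abs(beta).sum()
```

Minimizing this objective over beta (and sigma) for a grid of lam values traces out the penalized Tobit solution path described in the abstract.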

    Cross-fitted Residual Regression for High Dimensional Heteroscedasticity Pursuit

    There is a vast literature on high-dimensional regression. The common starting point for existing theoretical work is to assume that the data-generating model is a homoscedastic linear regression model with some sparsity structure. In reality, the homoscedasticity assumption is often violated, so understanding the heteroscedasticity of the data is of critical importance. In this paper we systematically study the estimation of a high-dimensional heteroscedastic regression model, with emphasis on how to detect and estimate the heteroscedasticity effects reliably and efficiently. To this end, we propose a cross-fitted residual regression approach, prove that the resulting estimator is selection consistent for heteroscedasticity effects, and establish its rates of convergence. Our estimator has tuning parameters to be determined from the data in practice. We propose a novel high-dimensional BIC for tuning parameter selection and establish its consistency; this is the first high-dimensional BIC result under heteroscedasticity. The theoretical analysis is more involved in order to handle heteroscedasticity, and we develop several new concentration inequalities that are of independent interest. Supplementary materials for this article are available online.
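    The cross-fitting idea can be sketched schematically: fit the mean model on one fold, regress the squared residuals on the covariates in the other fold, then swap folds and average. The sketch below uses low-dimensional OLS as a stand-in for the paper's penalized estimators, so it illustrates only the cross-fitting structure, not the high-dimensional method.

```python
import numpy as np

def crossfit_residual_regression(X, y, seed=0):
    """Schematic cross-fitted residual regression: estimate the mean model
    on one fold, fit a linear model to squared residuals on the other fold
    (the heteroscedasticity effects), then swap folds and average."""
    rng = np.random.default_rng(seed)
    n = len(y)
    idx = rng.permutation(n)
    folds = (idx[: n // 2], idx[n // 2:])
    gammas = []
    for a, b in (folds, folds[::-1]):
        beta = np.linalg.lstsq(X[a], y[a], rcond=None)[0]        # mean fit on fold a
        r2 = (y[b] - X[b] @ beta) ** 2                           # squared residuals on fold b
        gammas.append(np.linalg.lstsq(X[b], r2, rcond=None)[0])  # variance-model fit
    return sum(gammas) / 2.0
```

Cross-fitting keeps the residuals on each fold independent of the mean fit used to form them, which is what makes the second-stage variance regression well behaved.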

    A Note on Cross-Validation for Lasso Under Measurement Errors

    Variants of the lasso, or ℓ1-penalized regression, have been proposed to accommodate the presence of measurement errors in the covariates. Theoretical guarantees for these estimators have been established for certain oracle values of the regularization parameters, which are not known in practice. Data-driven tuning such as cross-validation has not been studied when the covariates contain measurement errors. We demonstrate that in the presence of errors in the covariates, even when using a lasso variant that adjusts for measurement error, applying naive leave-one-out cross-validation to select the tuning parameter can be problematic. We provide an example where such a practice leads to estimation inconsistency, and we prove that a simple correction to the cross-validation procedure restores consistency. We also study the risk consistency of the two cross-validation procedures and offer guidelines on the choice of cross-validation based on the measurement error distributions of the training and prediction data. The theoretical findings are validated using simulated data. Supplementary materials for this article are available online.
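    The underlying error-in-covariates phenomenon is easy to demonstrate in the simplest setting: regressing on a noisily measured covariate attenuates the slope toward zero, and a classical correction (valid when the measurement-error variance, here 1, is known) recovers it. This illustrates why naive procedures are biased; it is not the paper's cross-validation correction for the lasso.

```python
import numpy as np

def ols_slope(w, y):
    # simple-regression slope of y on w
    return np.cov(w, y, bias=True)[0, 1] / np.var(w)

rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)             # true covariate
w = x + rng.normal(size=n)         # observed with measurement error (variance 1)
y = 2.0 * x + rng.normal(size=n)   # response depends on the true x
naive = ols_slope(w, y)            # attenuated: approx 2 * Var(x)/Var(w) = 1
# classical attenuation correction with known error variance 1
corrected = naive * (np.var(w) / (np.var(w) - 1.0))
```

The naive slope converges to half the true value, so any tuning procedure that trusts predictions built from `w` inherits this bias unless it is corrected.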

    Nonparametric Variable Transformation in Sufficient Dimension Reduction

    Sufficient dimension reduction (SDR) techniques have proven to be very useful data analysis tools in various applications. Underlying many SDR techniques is the critical assumption that the predictors are elliptically contoured. When this assumption appears to be violated, practitioners usually apply a variable transformation such that the transformed predictors become (nearly) normal. The transformation function is often chosen from the log and power transformation family, as suggested by the celebrated Box–Cox model. However, any parametric transformation can be too restrictive, raising the danger of model misspecification. We propose a nonparametric variable transformation method after which the predictors become normal. To demonstrate the main idea, we combine this flexible transformation method with two well-established SDR techniques, sliced inverse regression (SIR) and the inverse regression estimator (IRE); the resulting SDR techniques are referred to as TSIR and TIRE, respectively. Both simulation and real-data results show that TSIR and TIRE have very competitive performance. Asymptotic theory is established to support the proposed method. The technical proofs are available as supplementary materials.
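    For reference, here is a sketch of classical SIR, the base estimator that TSIR applies after the nonparametric transformation (the transformation step is omitted; the predictors below are already normal): standardize the predictors, average them within slices of the response, and take the leading eigenvectors of the between-slice covariance.

```python
import numpy as np

def sir_directions(X, y, n_slices=5, n_dirs=1):
    """Basic sliced inverse regression (SIR): estimate directions of the
    central subspace from slice means of the standardized predictors."""
    n, p = X.shape
    mu = X.mean(0)
    cov = np.cov(X, rowvar=False)
    L = np.linalg.cholesky(np.linalg.inv(cov))
    Z = (X - mu) @ L                       # standardized predictors, Cov(Z) = I
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)
    # weighted covariance of the slice means of Z
    M = sum(len(s) / n * np.outer(Z[s].mean(0), Z[s].mean(0)) for s in slices)
    vals, vecs = np.linalg.eigh(M)         # ascending eigenvalues
    B = L @ vecs[:, ::-1][:, :n_dirs]      # leading directions, original scale
    return B / np.linalg.norm(B, axis=0)
```

TSIR would run exactly this procedure on the nonparametrically transformed predictors instead of the raw ones.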

    Sparse Convoluted Rank Regression in High Dimensions

    Wang et al. (2020, JASA) studied high-dimensional sparse penalized rank regression and established its appealing theoretical properties. Compared with least squares, rank regression can achieve a substantial gain in estimation efficiency while maintaining a minimum relative efficiency of 86.4%. However, the computation of penalized rank regression can be very challenging for high-dimensional data because the rank regression loss is highly nonsmooth. In this work we view the rank regression loss as a nonsmooth empirical counterpart of a population-level quantity, and we derive a smooth empirical counterpart by substituting a kernel density estimator for the true distribution in the expectation calculation. This view leads to the convoluted rank regression loss and, consequently, to sparse penalized convoluted rank regression (CRR) for high-dimensional data. We prove several interesting asymptotic properties of CRR. Under the same key assumptions as for sparse rank regression, we establish the rate of convergence of the ℓ1-penalized CRR with a tuning-free penalization parameter and prove the strong oracle property of the folded concave penalized CRR. We further propose a high-dimensional Bayesian information criterion for selecting the penalization parameter in folded concave penalized CRR and prove its selection consistency. We derive an efficient algorithm for solving sparse convoluted rank regression that scales well with high dimensions. Numerical examples demonstrate the promising performance of sparse convoluted rank regression over sparse rank regression. Our theoretical and numerical results suggest that sparse convoluted rank regression enjoys the best of both sparse least squares regression and sparse rank regression.
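    The smoothing idea can be sketched concretely: the pairwise rank regression loss averages |e_i - e_j| over residual pairs, and convolving the absolute value with a kernel yields a smooth convex surrogate. The sketch below uses a Gaussian kernel, for which E|u + hZ| has a closed form; the paper's kernel choice may differ.

```python
import math
import numpy as np

def smooth_abs(u, h):
    """E|u + h Z| with Z ~ N(0,1): the absolute value convolved with a
    Gaussian kernel of bandwidth h (smooth, convex, and >= |u|)."""
    u = np.asarray(u, dtype=float)
    z = u / h
    Phi = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    phi = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return u * (2.0 * Phi - 1.0) + 2.0 * h * phi

def convoluted_rank_loss(beta, X, y, h=0.5):
    """Smoothed version of the pairwise rank regression loss
    mean over i != j of |e_i - e_j|, with e = y - X beta."""
    e = y - X @ beta
    diff = e[:, None] - e[None, :]
    S = smooth_abs(diff, h)
    n = len(e)
    return (S.sum() - np.trace(S)) / (n * (n - 1))
```

As h shrinks, the smoothed loss approaches the original nonsmooth rank loss, while any fixed h > 0 gives the differentiability that makes high-dimensional computation tractable.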

    Tweedie’s Compound Poisson Model With Grouped Elastic Net

    Tweedie’s compound Poisson model is a popular method for modeling nonnegative data with a probability mass at zero and a highly right-skewed distribution. Motivated by the wide application of the Tweedie model in fields such as actuarial science, we investigate the grouped elastic net method for the Tweedie model in the context of the generalized linear model. To compute the coefficient estimates efficiently, we devise a two-layer algorithm that embeds the blockwise majorization descent method in an iteratively reweighted least squares strategy. Integrated with the strong rule, the proposed algorithm is implemented in an easy-to-use R package, HDtweedie, and is shown to compute the whole solution path very efficiently. Simulations are conducted to study the variable selection and model fitting performance of various lasso methods for the Tweedie model. The modeling applications to risk segmentation in the insurance business are illustrated by the analysis of an auto insurance claim dataset. Supplementary materials for this article are available online.
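    The goodness-of-fit quantity such a GLM fit minimizes is the Tweedie deviance. For power parameter 1 < p < 2 (the compound Poisson-gamma range, which allows exact zeros in the response) the unit deviance has a simple closed form, sketched below; this is a reference formula, not the HDtweedie implementation.

```python
import numpy as np

def tweedie_deviance(y, mu, p=1.5):
    """Mean Tweedie unit deviance for power 1 < p < 2; y may contain
    exact zeros, mu must be strictly positive."""
    term1 = np.power(y, 2 - p) / ((1 - p) * (2 - p))
    term2 = y * np.power(mu, 1 - p) / (1 - p)
    term3 = np.power(mu, 2 - p) / (2 - p)
    return 2.0 * (term1 - term2 + term3).mean()
```

The deviance is zero when the fitted means equal the observations and positive otherwise, which is what makes it a valid GLM loss for the penalized fitting described above.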

    Coordinatewise Gaussianization: Theories and Applications

    In statistical analysis, researchers often perform coordinatewise Gaussianization so that each variable is marginally normal. The normal score transformation is a method for coordinatewise Gaussianization and is widely used in statistics, econometrics, genetics, and other areas. However, few studies exist on the theoretical properties of the normal score transformation, especially in high-dimensional problems where the dimension p diverges with the sample size n. In this article, we show that the normal score transformation converges uniformly to its population counterpart even when log p = o(n / log n). Our result can justify applying the normal score transformation before any downstream statistical method for which the transformation is theoretically beneficial. The same results are established for the Winsorized normal transformation, another popular choice for coordinatewise Gaussianization. We demonstrate the benefits of coordinatewise Gaussianization by studying its applications to the Gaussian copula model, the nearest shrunken centroids classifier, and distance correlation. The benefits are clearly shown in theory and supported by numerical studies. We also point out scenarios where coordinatewise Gaussianization does not help and can even be harmful, and we offer a general recommendation on how to use coordinatewise Gaussianization in applications. Supplementary materials for this article are available online.
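    The normal score transformation itself is a one-liner in spirit: replace each value by the standard normal quantile of its rank. The sketch below uses the common rank convention r/(n+1); other conventions exist, and the Winsorized variant mentioned in the abstract additionally truncates extreme ranks (not shown).

```python
import numpy as np
from statistics import NormalDist

def normal_score(x):
    """Normal score transformation: map each value to the standard normal
    quantile of its rank, making the sample marginally (near-)normal.
    Assumes no ties, as with a continuous sample."""
    n = len(x)
    ranks = np.argsort(np.argsort(x)) + 1      # ranks 1..n
    nd = NormalDist()
    return np.array([nd.inv_cdf(r / (n + 1)) for r in ranks])
```

Because only ranks are used, the transformation is monotone and invariant to any monotone distortion of the original scale, which is what makes it a fully nonparametric Gaussianization.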

    Enveloped Huber Regression

    Huber regression (HR) is a popular flexible alternative to least squares regression when the error follows a heavy-tailed distribution. We propose a new method called enveloped Huber regression (EHR) by considering the envelope assumption that there exists a subspace of the predictors that has no association with the response, referred to as the immaterial part. More efficient estimation is achieved by removing the immaterial part. Unlike the envelope least squares (ENV) model, whose estimation is based on the maximum normal likelihood, the EHR model is estimated through the generalized method of moments. The asymptotic normality of the EHR estimator is established, and it is shown that EHR is more efficient than HR. Moreover, EHR is more efficient than ENV when the error distribution is heavy-tailed, while incurring only a small efficiency loss when the error distribution is normal. Our theory also covers the heteroscedastic case, in which the error may depend on the covariates. The envelope dimension in EHR is a tuning parameter to be determined from the data in practice. We further propose a novel generalized information criterion (GIC) for dimension selection and establish its consistency. Extensive simulation studies confirm the messages from our theory, and EHR is further illustrated on a real dataset.
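    For reference, the Huber loss underlying HR and EHR is quadratic for small residuals and linear for large ones, which caps the influence of heavy-tailed errors. A minimal sketch (the tuning constant 1.345 is a conventional default for 95% normal efficiency, not a value from the paper):

```python
import numpy as np

def huber_loss(r, delta=1.345):
    """Huber loss: 0.5 r^2 for |r| <= delta, linear beyond delta,
    so large residuals have bounded influence on the fit."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r ** 2, delta * (a - 0.5 * delta))
```

Replacing the squared loss with this function is what makes HR robust; EHR then adds the envelope step to discard the immaterial predictor subspace.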