1,399 research outputs found

    Non-Concave Penalized Likelihood with NP-Dimensionality

    Penalized likelihood methods are fundamental to ultra-high dimensional variable selection. How high a dimensionality such methods can handle remains largely unknown. In this paper, we show that in the context of generalized linear models, such methods possess model selection consistency with oracle properties even for dimensionality of Non-Polynomial (NP) order of the sample size, for a class of penalized likelihood approaches using folded-concave penalty functions, which were introduced to ameliorate the bias problems of convex penalty functions. This fills a long-standing gap in the literature, where the dimensionality is typically allowed to grow only slowly with the sample size. Our results are also applicable to penalized likelihood with the $L_1$-penalty, which is a convex function at the boundary of the class of folded-concave penalty functions under consideration. Coordinate optimization is implemented for finding the solution paths, whose performance is evaluated by a few simulation examples and a real data analysis. Comment: 37 pages, 2 figures
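
    A minimal sketch of fitting a folded-concave penalized likelihood by coordinate optimization, assuming the SCAD penalty (the standard folded-concave example, not named in the abstract) and a least-squares loss as a stand-in for the general GLM log-likelihood; the column standardization and all function names here are illustrative assumptions, not the paper's implementation.

    import numpy as np

    def scad_threshold(z, lam, a=3.7):
        # Exact univariate SCAD solution for a standardized coordinate (Fan and Li, 2001).
        az = abs(z)
        if az <= 2 * lam:
            return np.sign(z) * max(az - lam, 0.0)
        if az <= a * lam:
            return ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)
        return z

    def scad_coordinate_descent(X, y, lam, n_iter=200):
        # Cyclic coordinate descent for least squares + SCAD; assumes each column of X
        # is centered and scaled so that X[:, j] @ X[:, j] == n.
        n, p = X.shape
        beta = np.zeros(p)
        r = y - X @ beta                            # current residual
        for _ in range(n_iter):
            for j in range(p):
                zj = beta[j] + X[:, j] @ r / n      # univariate least-squares coefficient
                bj = scad_threshold(zj, lam)
                r += X[:, j] * (beta[j] - bj)       # keep the residual in sync
                beta[j] = bj
        return beta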

    Coordinate-independent sparse sufficient dimension reduction and variable selection

    Sufficient dimension reduction (SDR) in regression, which reduces the dimension by replacing the original predictors with a minimal set of their linear combinations without loss of information, is very helpful when the number of predictors is large. Standard SDR methods suffer because the estimated linear combinations usually involve all of the original predictors, making the results difficult to interpret. In this paper, we propose a unified method, coordinate-independent sparse estimation (CISE), that can simultaneously achieve sparse sufficient dimension reduction and efficiently screen out irrelevant and redundant variables. CISE is subspace oriented in the sense that it incorporates a coordinate-independent penalty term with a broad series of model-based and model-free SDR approaches. This results in a Grassmann manifold optimization problem, for which a fast algorithm is suggested. Under mild conditions, based on manifold theories and techniques, it can be shown that CISE performs asymptotically as well as if the true irrelevant predictors were known, which is referred to as the oracle property. Simulation studies and a real-data example demonstrate the effectiveness and efficiency of the proposed approach. Comment: Published in at http://dx.doi.org/10.1214/10-AOS826 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
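
    A coordinate-independent penalty can be built from the row norms of a basis matrix of the reduced subspace, so that the penalty depends on the matrix only through its column span; zeroing a whole row then removes that predictor from every reduced direction at once. The exact CISE penalty (e.g. any adaptive weighting) may differ, so the snippet below is only a sketch of that invariance property.

    import numpy as np

    def rowwise_group_penalty(V):
        # Sum of row norms of a p x d basis matrix: killing row i removes
        # predictor i from every linear combination simultaneously.
        return np.linalg.norm(V, axis=1).sum()

    # The penalty is unchanged when the basis is rotated, i.e. it depends on V
    # only through span(V): the "coordinate-independent" property.
    rng = np.random.default_rng(0)
    V = rng.standard_normal((10, 2))
    Q, _ = np.linalg.qr(rng.standard_normal((2, 2)))    # random 2 x 2 orthogonal matrix
    assert np.isclose(rowwise_group_penalty(V), rowwise_group_penalty(V @ Q))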

    Group variable selection via convex Log-Exp-Sum penalty with application to a breast cancer survivor study

    In many scientific and engineering applications, covariates are naturally grouped. When such group structures are available, one is usually interested in identifying both the important groups and the important variables within the selected groups. Among existing group variable selection methods, some cannot perform within-group selection, while others can select at both levels but rely on non-convex objective functions, and such non-convexity may require extra numerical effort. In this paper, we propose a novel Log-Exp-Sum (LES) penalty for group variable selection. The LES penalty is strictly convex. It can identify important groups as well as select important variables within a group. We develop an efficient group-level coordinate descent algorithm to fit the model. We also derive non-asymptotic error bounds and asymptotic group selection consistency for our method in the high-dimensional setting where the number of covariates can be much larger than the sample size. Numerical results demonstrate the good performance of our method in both variable selection and prediction. We apply the proposed method to an American Cancer Society breast cancer survivor dataset; the findings are clinically meaningful and lead immediately to testable clinical hypotheses.
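
    The name suggests a penalty of the form lam * sum over groups of log(sum over the group of exp(alpha * |beta_j|)); since log-sum-exp is convex and nondecreasing in each argument and alpha * |beta_j| is convex, this form is convex. The exact parameterization and weighting used in the paper are not given in the abstract, so the snippet below is only an illustrative reading.

    import numpy as np
    from scipy.special import logsumexp

    def les_penalty(beta, groups, alpha=1.0, lam=1.0):
        # Illustrative Log-Exp-Sum group penalty:
        #   lam * sum_g log( sum_{j in g} exp(alpha * |beta_j|) ).
        # It acts at the group level yet still distinguishes individual coefficients
        # inside a group, which is what enables within-group selection.
        return lam * sum(logsumexp(alpha * np.abs(beta[g])) for g in groups)

    beta = np.array([0.0, 2.0, -1.0, 0.0, 0.5])
    groups = [np.array([0, 1, 2]), np.array([3, 4])]
    print(les_penalty(beta, groups, alpha=2.0, lam=0.5))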

    Sparse and Functional Principal Components Analysis

    Regularized variants of Principal Components Analysis, especially Sparse PCA and Functional PCA, are among the most useful tools for the analysis of complex high-dimensional data. Many examples of massive data have both sparse and functional (smooth) aspects and may benefit from a regularization scheme that can capture both forms of structure. For example, in neuroimaging data, the brain's response to a stimulus may be restricted to a discrete region of activation (spatial sparsity) while exhibiting a smooth response within that region. We propose a unified approach to regularized PCA which can induce both sparsity and smoothness in both the row and column principal components. Our framework generalizes much of the previous literature, with sparse, functional, two-way sparse, and two-way functional PCA all being special cases of our approach. Our method permits flexible combinations of sparsity and smoothness that lead to improvements in feature selection and signal recovery, as well as more interpretable PCA factors. We demonstrate the efficacy of our method on simulated data and on a neuroimaging example with EEG data. Comment: The published version of this paper incorrectly thanks "Luofeng Luo" instead of "Luofeng Liao" in the Acknowledgement.
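
    A common computational template for two-way regularized PCA is an alternating, penalized power iteration on a single rank-one factor. The sketch below shows only the sparse half (soft-thresholded updates of the left and right factors) and notes in a comment where a smoothing operator would enter; it illustrates the general idea rather than the paper's exact algorithm or constraint set.

    import numpy as np

    def soft_threshold(z, lam):
        return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

    def sparse_rank_one(X, lam_u=0.1, lam_v=0.1, n_iter=100):
        # Alternating soft-thresholded power iterations for one two-way sparse factor.
        # Applying a smoothing operator (e.g. a roughness-penalized solve) to u or v
        # in each step would add the "functional" half of the regularization.
        u = np.linalg.svd(X, full_matrices=False)[0][:, 0]   # warm start from plain PCA
        v = np.zeros(X.shape[1])
        for _ in range(n_iter):
            v = soft_threshold(X.T @ u, lam_v)
            if np.linalg.norm(v) > 0:
                v /= np.linalg.norm(v)
            u = soft_threshold(X @ v, lam_u)
            if np.linalg.norm(u) > 0:
                u /= np.linalg.norm(u)
        return u, u @ X @ v, v                               # left factor, scale, right factor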

    Concave Penalized Estimation of Sparse Gaussian Bayesian Networks

    We develop a penalized likelihood estimation framework to estimate the structure of Gaussian Bayesian networks from observational data. In contrast to recent methods which accelerate the learning problem by restricting the search space, our main contribution is a fast algorithm for score-based structure learning which does not restrict the search space in any way and works on high-dimensional datasets with thousands of variables. Our use of concave regularization, as opposed to the more popular $\ell_0$ (e.g. BIC) penalty, is new. Moreover, we provide theoretical guarantees which generalize existing asymptotic results when the underlying distribution is Gaussian. Most notably, our framework does not require the existence of a so-called faithful DAG representation, and as a result the theory must handle the inherent nonidentifiability of the estimation problem in a novel way. Finally, as a matter of independent interest, we provide a comprehensive comparison of our approach to several standard structure learning methods using open-source packages developed for the R language. Based on these experiments, we show that our algorithm is significantly faster than other competing methods while obtaining higher sensitivity with comparable false discovery rates for high-dimensional data. In particular, the total runtime for our method to generate a solution path of 20 estimates for DAGs with 8000 nodes is around one hour. Comment: 57 pages
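
    Concave regularization here means a penalty that behaves like $\ell_1$ near zero but flattens out for large coefficients, sitting between the lasso and an $\ell_0$/BIC-type per-edge cost. The minimax concave penalty (MCP) below is one standard choice used to illustrate the idea; it is an assumption for illustration, not a statement of the paper's exact penalty.

    import numpy as np

    def mcp(beta, lam, gamma=2.0):
        # Minimax concave penalty: lam*|b| - b^2/(2*gamma) for |b| <= gamma*lam,
        # then constant at gamma*lam^2/2, so large edge weights are not shrunk
        # the way an L1 penalty would shrink them.
        b = np.abs(beta)
        return np.where(b <= gamma * lam,
                        lam * b - b**2 / (2.0 * gamma),
                        0.5 * gamma * lam**2)

    # Near zero the penalty tracks the lasso; for large weights it is flat,
    # like a per-edge constant (l0 / BIC-style) cost.
    print(mcp(np.array([0.1, 1.0, 5.0]), lam=0.5, gamma=2.0))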

    A new scope of penalized empirical likelihood with high-dimensional estimating equations

    Statistical methods based on empirical likelihood (EL) are appealing and effective, especially in conjunction with estimating equations, through which useful data information can be adaptively and flexibly incorporated. It is also known in the literature that EL approaches encounter difficulties when dealing with problems having high-dimensional model parameters and estimating equations. To overcome these challenges, we begin our study with a careful investigation of high-dimensional EL from a new scope, targeting the estimation of a high-dimensional sparse model parameter. We show that this new scope provides an opportunity for relaxing the stringent requirement on the dimensionality of the model parameter. Motivated by the new scope, we then propose a new penalized EL that applies two penalty functions, respectively regularizing the model parameters and the associated Lagrange multipliers in the optimization of EL. By penalizing the Lagrange multiplier to encourage its sparsity, we show that a drastic reduction in the number of estimating equations can be achieved without compromising the validity and consistency of the resulting estimators. Most attractively, such a reduction in the dimensionality of the estimating equations is equivalent to a selection among those high-dimensional estimating equations, resulting in a highly parsimonious and effective device for high-dimensional sparse model parameters. Allowing the dimensionalities of both the model parameters and the estimating equations to grow exponentially with the sample size, our theory demonstrates that the estimator from the new penalized EL is sparse and consistent, with asymptotically normally distributed nonzero components. Numerical simulations and a real data analysis show that the proposed penalized EL performs promisingly.
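
    In the usual notation for EL with estimating functions g(X_i; theta) taking values in R^r, the doubly penalized criterion described above can be written schematically as the saddle-point problem below; the sign conventions, the penalty functions P_1 and P_2, and the tuning levels nu_1, nu_2 are assumptions made for illustration, not quoted from the paper.

    \hat{\theta} \;=\; \arg\min_{\theta}\;\max_{\lambda}\;
      \Big\{ \frac{1}{n}\sum_{i=1}^{n}\log\big(1+\lambda^{\top} g(X_i;\theta)\big)
      \;-\; \sum_{j=1}^{r} P_{2}(|\lambda_j|;\nu_2)
      \;+\; \sum_{k=1}^{p} P_{1}(|\theta_k|;\nu_1) \Big\}

    Penalizing the multiplier lambda inside the inner maximization is what switches off uninformative estimating equations, which is the selection effect described in the abstract.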

    Flexible Variable Selection for Recovering Sparsity in Nonadditive Nonparametric Models

    Variable selection for recovering sparsity in nonadditive nonparametric models has been challenging. The problem becomes even more difficult because of the complications of modeling unknown interaction terms among high-dimensional variables, and no existing variable selection method overcomes these limitations. Hence, in this paper we propose a variable selection approach developed by connecting a kernel machine with the nonparametric multiple regression model. The advantages of our approach are that it can: (1) recover the sparsity; (2) automatically model unknown and complicated interactions; (3) connect with several existing approaches, including the linear nonnegative garrote, kernel learning, and automatic relevance determination (ARD); and (4) provide flexibility for both additive and nonadditive nonparametric models. Our approach may be viewed as a nonlinear version of the nonnegative garrote method. We model the smoothing function by a least squares kernel machine and construct the nonnegative garrote objective function as a function of the similarity matrix. Since the multiple regression similarity matrix can be written as an additive form of univariate similarity matrices corresponding to the input variables, applying a sparse scale parameter to each univariate similarity matrix reveals its relevance to the response variable. We also derive the asymptotic properties of our approach and show that it provides a root-n consistent estimator of the scale parameters. Furthermore, we prove that sparsistency is satisfied with consistent initial kernel function coefficients under certain conditions, and we give the necessary and sufficient conditions for sparsistency. An efficient coordinate descent/backfitting algorithm is developed. A resampling procedure for our variable selection methodology is also proposed to improve power.
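
    A minimal sketch of the additive-kernel garrote idea described above: per-variable similarity (kernel) matrices are combined with nonnegative scale parameters, and the garrote step fits those scales with a sparsity-inducing penalty around an initial kernel-machine fit. The Gaussian kernel, the ridge-type initial fit, the penalty level, and all names here are illustrative assumptions rather than the paper's objective.

    import numpy as np
    from scipy.optimize import minimize

    def gaussian_kernel_1d(x, width=1.0):
        # n x n similarity matrix built from a single input variable.
        d = x[:, None] - x[None, :]
        return np.exp(-d**2 / (2.0 * width**2))

    def garrote_scales(y, K_list, alpha, lam=0.5):
        # Nonnegative garrote step on per-variable kernels: fit y with
        # sum_j theta_j * (K_j @ alpha), theta_j >= 0, plus an L1 (= sum) penalty,
        # where alpha holds the initial kernel-machine coefficients.
        M = np.column_stack([K @ alpha for K in K_list])
        obj = lambda t: np.sum((y - M @ t) ** 2) + lam * np.sum(t)
        res = minimize(obj, x0=np.ones(M.shape[1]),
                       bounds=[(0.0, None)] * M.shape[1])
        return res.x

    # Toy example: y depends only on the first variable, so the second scale
    # parameter should shrink toward zero.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((60, 2))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(60)
    K_list = [gaussian_kernel_1d(X[:, j]) for j in range(2)]
    alpha = np.linalg.solve(sum(K_list) + np.eye(60), y)   # ridge-type initial fit
    print(garrote_scales(y, K_list, alpha))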

    Joint Estimation of Camera Pose, Depth, Deblurring, and Super-Resolution from a Blurred Image Sequence

    Conventional methods for estimating camera poses and scene structure from severely blurred or low-resolution images often fail. Off-the-shelf deblurring or super-resolution methods may produce visually pleasing results, but applying each technique independently before matching is generally unprofitable because this naive series of procedures ignores the consistency between images. In this paper, we propose a unified framework that solves four problems simultaneously: dense depth reconstruction, camera pose estimation, super-resolution, and deblurring. By reflecting the physical imaging process, we formulate a cost minimization problem and solve it using an alternating optimization technique. Experimental results on both synthetic and real videos show high-quality depth maps derived from severely degraded images, in contrast to the failures of naive multi-view stereo methods. Our method also produces outstanding deblurred and super-resolved images, unlike the independent application or combination of conventional video deblurring and super-resolution methods. Comment: accepted to ICCV 201
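
    The cost referred to above can be understood through a forward imaging model: the latent sharp, high-resolution frame is warped according to camera pose and scene depth, blurred, and downsampled before being compared to each observed frame. The sketch below is a deliberately simplified stand-in (a global 2-D shift replaces the pose- and depth-dependent warp, and the blur is a fixed Gaussian); it is not the paper's actual formulation.

    import numpy as np
    from scipy.ndimage import gaussian_filter, shift

    def forward_model(hr_image, motion, blur_sigma=1.5, scale=2):
        # Schematic imaging process: warp the latent sharp high-resolution image by the
        # motion induced by camera pose and depth (reduced here to a 2-D shift),
        # blur it, then downsample to the observed low-resolution grid.
        warped = shift(hr_image, motion, order=1, mode="nearest")
        blurred = gaussian_filter(warped, blur_sigma)
        return blurred[::scale, ::scale]

    def data_cost(hr_image, observations, motions, blur_sigma=1.5, scale=2):
        # Sum of squared differences between each observed frame and its simulation.
        # Alternating optimization would update the pose/depth terms (here the motions),
        # the blur, and the latent image while holding the other unknowns fixed.
        return sum(np.sum((obs - forward_model(hr_image, m, blur_sigma, scale)) ** 2)
                   for obs, m in zip(observations, motions))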

    Scalable Sparse Cox's Regression for Large-Scale Survival Data via Broken Adaptive Ridge

    This paper develops a new scalable sparse Cox regression tool for sparse, high-dimensional, massive sample size (sHDMSS) survival data. The method is a local $L_0$-penalized Cox regression computed by repeatedly performing reweighted $L_2$-penalized Cox regression. We show that the resulting estimator enjoys the best of $L_0$- and $L_2$-penalized Cox regression while overcoming their limitations. Specifically, the estimator is selection consistent, oracle for parameter estimation, and possesses a grouping property for highly correlated covariates. Simulation results suggest that when the sample size is large, the proposed method with pre-specified tuning parameters has comparable or better performance than some popular penalized regression methods. More importantly, because the method naturally enables adaptation of efficient algorithms for massive $L_2$-penalized optimization and does not require costly data-driven tuning parameter selection, it has a significant computational advantage for sHDMSS data, offering an average 5-fold speedup over its closest competitor in empirical studies.
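
    The broken adaptive ridge idea is a sequence of ridge fits in which each coefficient's penalty weight is the inverse square of the previous iterate, so coefficients that keep shrinking are driven to zero. The sketch below illustrates it on ordinary least squares rather than the Cox partial likelihood; the tuning values, thresholds, and function name are assumptions.

    import numpy as np

    def broken_adaptive_ridge(X, y, lam=1.0, xi=1.0, n_iter=50, eps=1e-8):
        # Start from a plain ridge fit, then repeatedly refit a ridge whose j-th penalty
        # weight is 1 / beta_j^2 from the previous iterate; small coefficients end up
        # numerically zero, mimicking an L0-type selection.
        n, p = X.shape
        beta = np.linalg.solve(X.T @ X + xi * np.eye(p), X.T @ y)   # initial ridge fit
        for _ in range(n_iter):
            w = 1.0 / np.maximum(beta ** 2, eps)                    # adaptive ridge weights
            new = np.linalg.solve(X.T @ X + lam * np.diag(w), X.T @ y)
            if np.max(np.abs(new - beta)) < 1e-10:
                beta = new
                break
            beta = new
        return np.where(np.abs(beta) < 1e-6, 0.0, beta)             # snap tiny values to zero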

    Estimation of oblique structure via penalized likelihood factor analysis

    We consider the problem of sparse estimation via a lasso-type penalized likelihood procedure in a factor analysis model. Typically, the model is estimated under the assumption that the common factors are orthogonal (uncorrelated). However, when the common factors are correlated, a lasso-type penalization method based on the orthogonal model can often estimate a model completely different from the true factor structure. To overcome this problem, we propose to incorporate a factor correlation into the model and to estimate it, along with the parameters of the orthogonal model, by a penalized maximum likelihood procedure. An entire solution path is computed by the EM algorithm with coordinate descent, which permits the application of a wide variety of convex and nonconvex penalties. The proposed method can provide sufficiently sparse solutions and can be applied to data where the number of variables is larger than the number of observations. Monte Carlo simulations are conducted to investigate the effectiveness of our modeling strategies. The results show that lasso-type penalization based on the orthogonal model often fails to approximate the true factor structure, whereas our approach performs well in various situations. The usefulness of the proposed procedure is also illustrated through the analysis of real data. Comment: 19 pages. arXiv admin note: substantial text overlap with arXiv:1205.586
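
    The object being penalized is the maximum likelihood discrepancy of the oblique factor model Sigma = Lambda Phi Lambda' + Psi, with a lasso-type penalty on the loadings. The snippet below only evaluates that criterion (the EM-with-coordinate-descent machinery for the solution path is not shown), and the plain lasso penalty stands in for the wider menu of penalties the abstract mentions.

    import numpy as np

    def penalized_fa_objective(S, Lambda, Phi, Psi_diag, rho=0.1):
        # Penalized ML discrepancy for the oblique factor model
        #   Sigma = Lambda @ Phi @ Lambda.T + diag(Psi),
        # where Phi is the factor correlation matrix that the orthogonal model fixes at I;
        # S is the sample covariance matrix and rho the lasso level on the loadings.
        Sigma = Lambda @ Phi @ Lambda.T + np.diag(Psi_diag)
        _, logdet = np.linalg.slogdet(Sigma)
        return logdet + np.trace(S @ np.linalg.inv(Sigma)) + rho * np.sum(np.abs(Lambda))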