Non-Concave Penalized Likelihood with NP-Dimensionality
Penalized likelihood methods are fundamental to ultra-high dimensional
variable selection. How high dimensionality such methods can handle remains
largely unknown. In this paper, we show that in the context of generalized
linear models, such methods possess model selection consistency with oracle
properties even for dimensionality of Non-Polynomial (NP) order of sample size,
for a class of penalized likelihood approaches using folded-concave penalty
functions, which were introduced to ameliorate the bias problems of convex
penalty functions. This fills a long-standing gap in the literature where the
dimensionality is allowed to grow slowly with the sample size. Our results are
also applicable to penalized likelihood with the L1-penalty, which is a
convex function at the boundary of the class of folded-concave penalty
functions under consideration. Coordinate optimization is implemented to
compute the solution paths, and its performance is evaluated through simulation
examples and a real data analysis.
Comment: 37 pages, 2 figures
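Since the L1 penalty sits at the convex boundary of the folded-concave class discussed above, the coordinate optimization the abstract mentions can be illustrated with a minimal lasso coordinate descent sketch (a toy linear-model version for illustration, not the authors' implementation):

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator: the coordinate-wise minimizer under an L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1.

    Each coordinate update has the closed form S(X_j' r_j / n, lam) / (X_j' X_j / n),
    where r_j is the partial residual excluding coordinate j.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ beta + X[:, j] * beta[j]  # partial residual excluding j
            z = X[:, j] @ r_j / n
            beta[j] = soft_threshold(z, lam) / col_sq[j]
    return beta
```

Folded-concave penalties such as SCAD are typically handled by the same loop with the soft-threshold step replaced by the penalty's own thresholding rule.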
Coordinate-independent sparse sufficient dimension reduction and variable selection
Sufficient dimension reduction (SDR) in regression, which reduces the
dimension by replacing original predictors with a minimal set of their linear
combinations without loss of information, is very helpful when the number of
predictors is large. Standard SDR methods suffer, however, because the
estimated linear combinations usually involve all of the original predictors,
making the results difficult to interpret. In this paper, we propose a unified
method -
coordinate-independent sparse estimation (CISE) - that can simultaneously
achieve sparse sufficient dimension reduction and screen out irrelevant and
redundant variables efficiently. CISE is subspace oriented in the sense that it
incorporates a coordinate-independent penalty term with a broad series of
model-based and model-free SDR approaches. This results in a Grassmann manifold
optimization problem, for which a fast algorithm is proposed. Under mild
conditions, using manifold theory and techniques, we show that CISE performs
asymptotically as well as if the true irrelevant predictors were known, which
is referred to as the oracle property. Simulation studies and a real-data
example demonstrate the effectiveness and efficiency of the proposed approach.
Comment: Published at http://dx.doi.org/10.1214/10-AOS826 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
Group variable selection via convex Log-Exp-Sum penalty with application to a breast cancer survivor study
In many scientific and engineering applications, covariates are naturally
grouped. When group structure is available among covariates, one is usually
interested in identifying both important groups and important variables within
the selected groups. Among existing group variable selection methods, some
fail to conduct within-group selection; others can conduct both group and
within-group selection, but their objective functions are non-convex, which
may require extra numerical effort. In this paper, we propose a novel
Log-Exp-Sum (LES) penalty for group variable selection. The LES penalty is
strictly convex. It can identify important groups as well as select important
variables within the group. We develop an efficient group-level coordinate
descent algorithm to fit the model. We also derive non-asymptotic error bounds
and asymptotic group selection consistency for our method in the
high-dimensional setting where the number of covariates can be much larger than
the sample size. Numerical results demonstrate the good performance of our
method in both variable selection and prediction. We applied the proposed
method to an American Cancer Society breast cancer survivor dataset. The
findings are clinically meaningful and lead immediately to testable clinical
hypotheses.
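The abstract does not give the penalty's exact parameterization; one form consistent with the name Log-Exp-Sum, shown purely as a hypothetical illustration, applies a log-sum-exp of absolute coefficients within each group. Because log-sum-exp is convex and nondecreasing in each argument, this composition with |beta_j| is convex:

```python
import numpy as np

def les_penalty(beta, groups, lam=1.0, alpha=1.0):
    """Hypothetical Log-Exp-Sum group penalty (illustrative parameterization,
    not necessarily the paper's exact form):
        lam * sum_g log( sum_{j in g} exp(alpha * |beta_j|) ).
    Convex in beta, and sensitive to both group-level and within-group signal.
    """
    total = 0.0
    for g in groups:
        total += np.log(np.sum(np.exp(alpha * np.abs(beta[g]))))
    return lam * total
```

At beta = 0 each group contributes lam * log(|g|), so the penalty rewards zeroing entire coefficients within a group rather than spreading signal across it.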
Sparse and Functional Principal Components Analysis
Regularized variants of Principal Components Analysis, especially Sparse PCA
and Functional PCA, are among the most useful tools for the analysis of complex
high-dimensional data. Many massive datasets have both sparse and
functional (smooth) aspects and may benefit from a regularization scheme that
can capture both forms of structure. For example, in neuro-imaging data, the
brain's response to a stimulus may be restricted to a discrete region of
activation (spatial sparsity), while exhibiting a smooth response within that
region. We propose a unified approach to regularized PCA which can induce both
sparsity and smoothness in both the row and column principal components. Our
framework generalizes much of the previous literature, with sparse, functional,
two-way sparse, and two-way functional PCA all being special cases of our
approach. Our method permits flexible combinations of sparsity and smoothness
that lead to improvements in feature selection and signal recovery, as well as
more interpretable PCA factors. We demonstrate the efficacy of our method on
simulated data and a neuroimaging example on EEG data.
Comment: The published version of this paper incorrectly thanks "Luofeng Luo"
instead of "Luofeng Liao" in the Acknowledgements.
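The two-way regularized scheme described above generalizes a simple device: alternating power iterations in which the loadings are soft-thresholded. A minimal one-component sketch of that device (illustrative, not the authors' exact algorithm; smoothness penalties are omitted):

```python
import numpy as np

def sparse_pca_rank1(X, lam, n_iter=100):
    """One sparse principal component via alternating power iterations with
    soft-thresholded loadings. Sparsity in v models a discrete region of
    activation; a smoothing operator on u or v would add the functional part.
    """
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    v = vt[0]                                   # warm start: leading right singular vector
    for _ in range(n_iter):
        u = X @ v
        u /= np.linalg.norm(u)                  # left factor: plain power step
        v = X.T @ u
        v = np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)  # sparsify loadings
        nv = np.linalg.norm(v)
        if nv == 0:
            break
        v /= nv
    return u, v
```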
Concave Penalized Estimation of Sparse Gaussian Bayesian Networks
We develop a penalized likelihood estimation framework to estimate the
structure of Gaussian Bayesian networks from observational data. In contrast to
recent methods which accelerate the learning problem by restricting the search
space, our main contribution is a fast algorithm for score-based structure
learning which does not restrict the search space in any way and works on
high-dimensional datasets with thousands of variables. Our use of concave
regularization, as opposed to the more popular L0 (e.g. BIC) penalty, is
new. Moreover, we provide theoretical guarantees which generalize existing
asymptotic results when the underlying distribution is Gaussian. Most notably,
our framework does not require the existence of a so-called faithful DAG
representation, and as a result the theory must handle the inherent
nonidentifiability of the estimation problem in a novel way. Finally, as a
matter of independent interest, we provide a comprehensive comparison of our
approach to several standard structure learning methods using open-source
packages developed for the R language. Based on these experiments, we show that
our algorithm is significantly faster than other competing methods while
obtaining higher sensitivity with comparable false discovery rates for
high-dimensional data. In particular, the total runtime for our method to
generate a solution path of 20 estimates for DAGs with 8000 nodes is around
one hour.
Comment: 57 pages
A new scope of penalized empirical likelihood with high-dimensional estimating equations
Statistical methods with empirical likelihood (EL) are appealing and
effective especially in conjunction with estimating equations through which
useful data information can be adaptively and flexibly incorporated. It is also
known in the literature that EL approaches encounter difficulties when dealing
with problems having high-dimensional model parameters and estimating
equations. To overcome these challenges, we begin our study with a careful
investigation of high-dimensional EL from a new scope that targets estimating
a high-dimensional sparse model parameter. We show that the new scope provides
an opportunity to relax the stringent requirement on the dimensionality of
the model parameter. Motivated by the new scope, we then propose a new
penalized EL by applying two penalty functions respectively regularizing the
model parameters and the associated Lagrange multipliers in the optimizations
of EL. By penalizing the Lagrange multiplier to encourage its sparsity, we show
that drastic dimension reduction in the number of estimating equations can be
effectively achieved without compromising the validity and consistency of the
resulting estimators. Most attractively, such a reduction in dimensionality of
estimating equations is actually equivalent to a selection among those
high-dimensional estimating equations, resulting in a highly parsimonious and
effective device for high-dimensional sparse model parameters. Allowing the
dimensionalities of both the model parameters and the estimating equations to
grow exponentially with the sample size, our theory demonstrates that the
estimator from our new penalized EL is sparse and consistent, with
asymptotically normally distributed nonzero components. Numerical simulations
and a real data analysis show that the proposed penalized EL works promisingly.
Flexible Variable Selection for Recovering Sparsity in Nonadditive Nonparametric Models
Variable selection for recovering sparsity in nonadditive nonparametric
models has been challenging. This problem becomes even more difficult due to
complications in modeling unknown interaction terms among high dimensional
variables. There is currently no variable selection method to overcome these
limitations. Hence, in this paper we propose a variable selection approach that
is developed by connecting a kernel machine with the nonparametric multiple
regression model. The advantages of our approach are that it can: (1) recover
the sparsity, (2) automatically model unknown and complicated interactions, (3)
connect with several existing approaches, including the linear nonnegative
garrote, kernel learning and automatic relevance determination (ARD), and (4)
provide
flexibility for both additive and nonadditive nonparametric models. Our
approach may be viewed as a nonlinear version of a nonnegative garrote method.
We model the smoothing function by a least squares kernel machine and construct
the nonnegative garrote objective function as the function of the similarity
matrix. Since the multiple regression similarity matrix can be written as an
additive form of univariate similarity matrices corresponding to input
variables, applying a sparse scale parameter on each univariate similarity
matrix can reveal its relevance to the response variable. We also derive the
asymptotic properties of our approach and show that it provides a root-n
consistent estimator of the scale parameters. Furthermore, we prove that
sparsistency is satisfied with consistent initial kernel function coefficients
under certain conditions and give the necessary and sufficient conditions for
sparsistency. An efficient coordinate descent/backfitting algorithm is
developed. A resampling procedure for our variable selection methodology is
also proposed to improve power.
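The classical linear nonnegative garrote that the abstract generalizes can be sketched as follows (a penalized-form version solved by coordinate descent; a minimal sketch with illustrative names, not the paper's kernelized method):

```python
import numpy as np

def nonnegative_garrote(X, y, lam, n_iter=200):
    """Classical linear nonnegative garrote, penalized form:
        minimize (1/2n) ||y - sum_j c_j * X_j * b_ols_j||^2 + lam * sum_j c_j
        subject to c_j >= 0,
    solved by cyclic coordinate descent on the shrinkage factors c.
    """
    n, p = X.shape
    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # initial estimator to shrink
    Z = X * beta_ols                                  # column j is X_j * b_ols_j
    col_sq = np.maximum((Z ** 2).sum(axis=0) / n, 1e-12)
    c = np.ones(p)
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - Z @ c + Z[:, j] * c[j]          # partial residual excluding j
            z = Z[:, j] @ r_j / n
            c[j] = max(0.0, z - lam) / col_sq[j]      # nonnegative thresholded update
    return c * beta_ols
```

The nonlinear version in the abstract replaces the columns X_j * b_ols_j with univariate similarity matrices from a kernel machine and places the sparse scale parameter on each matrix.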
Joint Estimation of Camera Pose, Depth, Deblurring, and Super-Resolution from a Blurred Image Sequence
The conventional methods for estimating camera poses and scene structures
from severely blurry or low resolution images often result in failure. The
off-the-shelf deblurring or super-resolution methods may show visually pleasing
results. However, applying each technique independently before matching is
generally unprofitable because this naive series of procedures ignores the
consistency between images. In this paper, we propose a pioneering unified
framework that solves four problems simultaneously, namely, dense depth
reconstruction, camera pose estimation, super-resolution, and deblurring. By
reflecting a physical imaging process, we formulate a cost minimization problem
and solve it using an alternating optimization technique. The experimental
results on both synthetic and real videos show high-quality depth maps derived
from severely degraded images that contrast the failures of naive multi-view
stereo methods. Our proposed method also produces outstanding deblurred and
super-resolved images, unlike the independent application or combination of
conventional video deblurring and super-resolution methods.
Comment: accepted to ICCV 201
Scalable Sparse Cox's Regression for Large-Scale Survival Data via Broken Adaptive Ridge
This paper develops a new scalable sparse Cox regression tool for sparse
high-dimensional massive sample size (sHDMSS) survival data. The method is a
local L0-penalized Cox regression via repeatedly performing reweighted
L2-penalized Cox regression. We show that the resulting estimator enjoys the
best of L0- and L2-penalized Cox regressions while overcoming their
limitations. Specifically, the estimator is selection consistent, oracle for
parameter estimation, and possesses a grouping property for highly correlated
covariates. Simulation results suggest that when the sample size is large, the
proposed method with pre-specified tuning parameters has a comparable or better
performance than some popular penalized regression methods. More importantly,
because the method naturally enables adaptation of efficient algorithms for
massive L2-penalized optimization and does not require costly data-driven
tuning parameter selection, it has a significant computational advantage for
sHDMSS data, offering an average of 5-fold speedup over its closest competitor
in empirical studies.
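The reweighted-ridge mechanism behind broken adaptive ridge can be sketched in the simpler linear least-squares setting (the paper works with the Cox partial likelihood; this sketch only illustrates how reweighted L2 steps approximate an L0 penalty):

```python
import numpy as np

def broken_adaptive_ridge(X, y, lam=1.0, n_iter=50, eps=1e-8):
    """Reweighted-ridge (BAR-style) iteration for linear least squares.

    Each step solves a ridge problem whose coordinate-wise penalty weights are
    1 / beta_j^2 from the previous iterate: coefficients that stay small are
    shrunk harder each round, mimicking an L0 penalty in the limit.
    """
    n, p = X.shape
    beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)  # plain ridge start
    for _ in range(n_iter):
        w = 1.0 / np.maximum(beta ** 2, eps)
        beta = np.linalg.solve(X.T @ X + lam * np.diag(w), X.T @ y)
    beta[np.abs(beta) < 1e-4] = 0.0  # zero out numerically dead coordinates
    return beta
```

Because every step is an ordinary ridge solve, the method inherits the scalability of existing L2-penalized solvers, which is the computational point the abstract emphasizes.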
Estimation of oblique structure via penalized likelihood factor analysis
We consider the problem of sparse estimation via a lasso-type penalized
likelihood procedure in a factor analysis model. Typically, the model
estimation is done under the assumption that the common factors are orthogonal
(uncorrelated). However, the lasso-type penalization method based on the
orthogonal model can often estimate a completely different model from that with
the true factor structure when the common factors are correlated. In order to
overcome this problem, we propose to incorporate a factor correlation into the
model, and estimate the factor correlation along with the parameters of the
orthogonal model by a maximum penalized likelihood procedure. An entire
solution path is computed by the EM algorithm with coordinate descent, which
permits the application to a wide variety of convex and nonconvex penalties.
The proposed method can provide sufficiently sparse solutions, and be applied
to the data where the number of variables is larger than the number of
observations. Monte Carlo simulations are conducted to investigate the
effectiveness of our modeling strategies. The results show that the lasso-type
penalization based on the orthogonal model often cannot approximate the true
factor structure, whereas our approach performs well in various situations. The
usefulness of the proposed procedure is also illustrated through the analysis
of real data.
Comment: 19 pages. arXiv admin note: substantial text overlap with
arXiv:1205.586