Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection
A number of variable selection methods have been proposed involving nonconvex
penalty functions. These methods, which include the smoothly clipped absolute
deviation (SCAD) penalty and the minimax concave penalty (MCP), have been
demonstrated to have attractive theoretical properties, but model fitting is
not a straightforward task, and the resulting solutions may be unstable. Here,
we demonstrate the potential of coordinate descent algorithms for fitting these
models, establishing theoretical convergence properties and demonstrating that
they are significantly faster than competing approaches. In addition, we
demonstrate the utility of convexity diagnostics to determine regions of the
parameter space in which the objective function is locally convex, even though
the penalty is not. Our simulation study and data examples indicate that
nonconvex penalties like MCP and SCAD are worthwhile alternatives to the lasso
in many applications. In particular, our numerical results suggest that MCP is
the preferred approach among the three methods.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS388 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
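To make the coordinate descent idea in the abstract above concrete, the following is a minimal sketch of cyclic coordinate descent for MCP-penalized least squares. It assumes the columns of `X` are standardized so that each satisfies `x_j @ x_j / n == 1`, that `y` is centered, and that `gamma > 1`. The function names, default values, and stopping rule are illustrative choices, not taken from the paper or from any package.

```python
import numpy as np


def soft_threshold(z, lam):
    """Soft-thresholding operator: the lasso solution for a single coordinate."""
    return np.sign(z) * max(abs(z) - lam, 0.0)


def mcp_update(z, lam, gamma):
    """Single-coordinate MCP solution (assumes gamma > 1, standardized column)."""
    if abs(z) <= gamma * lam:
        # Inside the penalized region: rescaled soft-thresholding.
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    # Beyond gamma * lam the penalty is flat, so the estimate is unpenalized.
    return z


def mcp_coordinate_descent(X, y, lam, gamma=3.0, max_sweeps=200, tol=1e-8):
    """Cyclic coordinate descent; hypothetical helper, not a library API."""
    n, p = X.shape
    beta = np.zeros(p)
    residual = np.asarray(y, dtype=float).copy()
    for _ in range(max_sweeps):
        max_change = 0.0
        for j in range(p):
            # Univariate least-squares fit to the partial residual.
            z = X[:, j] @ residual / n + beta[j]
            new_value = mcp_update(z, lam, gamma)
            delta = new_value - beta[j]
            if delta != 0.0:
                residual -= delta * X[:, j]  # keep residual in sync
                beta[j] = new_value
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta
```

Each coordinate update has a closed form, which is what makes a full sweep cheap: the cost is one inner product and one residual update per variable.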
Learning Large-Scale Bayesian Networks with the sparsebn Package
Learning graphical models from data is an important problem with wide
applications, ranging from genomics to the social sciences. Nowadays, datasets
often have upwards of thousands (sometimes tens or hundreds of thousands) of
variables and far fewer samples. To meet this challenge, we have developed a
new R package called sparsebn for learning the structure of large, sparse
graphical models with a focus on Bayesian networks. While there are many
existing software packages for this task, this package focuses on the unique
setting of learning large networks from high-dimensional data, possibly with
interventions. As such, the methods provided place a premium on scalability and
consistency in a high-dimensional setting. Furthermore, in the presence of
interventions, the methods implemented here achieve the goal of learning a
causal network from data. Additionally, the sparsebn package is fully
compatible with existing software packages for network analysis.
Comment: To appear in the Journal of Statistical Software, 39 pages, 7 figures
Computational Methods for Sparse Solution of Linear Inverse Problems
The goal of the sparse approximation problem is to approximate a target signal using a linear combination of a few elementary signals drawn from a fixed collection. This paper surveys the major practical algorithms for sparse approximation. Specific attention is paid to computational issues, to the circumstances in which individual methods tend to perform well, and to the theoretical guarantees available. Many fundamental questions in electrical engineering, statistics, and applied mathematics can be posed as sparse approximation problems, making these algorithms versatile and relevant to a plethora of applications.
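As an illustration of the problem described above, here is a minimal sketch of orthogonal matching pursuit, one well-known greedy method for sparse approximation. It is offered as an example of the problem class, not as the method the survey recommends. The dictionary `A` (columns are the elementary signals), the target `y`, and the sparsity level `k` are assumed inputs, and the function name is hypothetical.

```python
import numpy as np


def orthogonal_matching_pursuit(A, y, k):
    """Greedily pick at most k columns of A, refitting by least squares each step."""
    y = np.asarray(y, dtype=float)
    x = np.zeros(A.shape[1])
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Choose the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick a column twice
        support.append(int(np.argmax(correlations)))
        # Refit all selected coefficients jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    if support:
        x[support] = coef
    return x
```

The joint least-squares refit at every step is what distinguishes this variant from plain matching pursuit, at the cost of solving a small linear system per iteration.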