On the adaptive elastic-net with a diverging number of parameters
We consider the problem of model selection and estimation in situations where
the number of parameters diverges with the sample size. When the dimension is
high, an ideal method should have the oracle property [J. Amer. Statist. Assoc.
96 (2001) 1348--1360] and [Ann. Statist. 32 (2004) 928--961] which ensures the
optimal large sample performance. Furthermore, the high-dimensionality often
induces the collinearity problem, which should be properly handled by the ideal
method. Many existing variable selection methods fail to achieve both goals
simultaneously. In this paper, we propose the adaptive elastic-net that
combines the strengths of the quadratic regularization and the adaptively
weighted lasso shrinkage. Under weak regularity conditions, we establish the
oracle property of the adaptive elastic-net. We show by simulations that the
adaptive elastic-net deals with the collinearity problem better than the other
oracle-like methods, thus enjoying much improved finite sample performance.
Comment: Published at http://dx.doi.org/10.1214/08-AOS625 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
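As a rough illustration of how the two penalties named in this abstract combine, the sketch below evaluates a least-squares objective with a quadratic (ridge) term plus an adaptively weighted L1 term. The function name, the parameter names, and the weight rule 1 / (|beta_init_j| + eps) ** gamma are assumptions made for illustration; they are not taken from the paper.

```python
def adaptive_enet_objective(X, y, beta, beta_init, lam1, lam2, gamma=1.0, eps=1e-8):
    """Return RSS + lam2 * ||beta||_2^2 + lam1 * sum_j w_j * |beta_j|.

    X is a list of rows, y a list of responses, beta the candidate
    coefficients, and beta_init an initial estimate used to build weights.
    The weights w_j = 1 / (|beta_init_j| + eps) ** gamma are an assumed rule;
    eps only guards against division by zero.
    """
    # Residual sum of squares of the linear fit.
    rss = 0.0
    for row, target in zip(X, y):
        pred = sum(x_ij * b_j for x_ij, b_j in zip(row, beta))
        rss += (target - pred) ** 2

    # Quadratic regularization, which helps with collinear predictors.
    ridge = lam2 * sum(b * b for b in beta)

    # Adaptively weighted lasso term: small initial estimates get large weights.
    weights = [1.0 / (abs(b0) + eps) ** gamma for b0 in beta_init]
    weighted_l1 = lam1 * sum(w * abs(b) for w, b in zip(weights, beta))

    return rss + ridge + weighted_l1
```

Setting lam2 to zero leaves only the weighted lasso term, and setting lam1 to zero leaves a plain ridge objective, so the sketch shows the two ingredients side by side.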
Component selection and smoothing in multivariate nonparametric regression
We propose a new method for model selection and model fitting in multivariate
nonparametric regression models, in the framework of smoothing spline ANOVA.
The ``COSSO'' is a method of regularization with the penalty functional being
the sum of component norms, instead of the squared norm employed in the
traditional smoothing spline method. The COSSO provides a unified framework for
several recent proposals for model selection in linear models and smoothing
spline ANOVA models. Theoretical properties, such as the existence and the rate
of convergence of the COSSO estimator, are studied. In the special case of a
tensor product design with periodic functions, a detailed analysis reveals that
the COSSO does model selection by applying a novel soft thresholding type
operation to the function components. We give an equivalent formulation of the
COSSO estimator which leads naturally to an iterative algorithm. We compare the
COSSO with MARS, a popular method that builds functional ANOVA models, in
simulations and real examples. The COSSO method can be extended to
classification problems and we compare its performance with those of a number
of machine learning algorithms on real datasets. The COSSO gives very
competitive performance in these studies.
Comment: Published at http://dx.doi.org/10.1214/009053606000000722 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Model Selection for High Dimensional Quadratic Regression via Regularization
Quadratic regression (QR) models naturally extend linear models by
considering interaction effects between the covariates. To conduct model
selection in QR, it is important to maintain the hierarchical model structure
between main effects and interaction effects. Existing regularization methods
generally achieve this goal by solving complex optimization problems, which
usually demand high computational cost and hence are not feasible for high
dimensional data. This paper focuses on scalable regularization methods for
model selection in high dimensional QR. We first consider two-stage
regularization methods and establish theoretical properties of the two-stage
LASSO. Then, a new regularization method, called Regularization Algorithm under
Marginality Principle (RAMP), is proposed to compute a hierarchy-preserving
regularization solution path efficiently. Both methods are further extended to
solve generalized QR models. Numerical results are also shown to demonstrate
performance of the methods.
Comment: 37 pages, 1 figure with supplementary materia
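The hierarchical structure between main effects and interaction effects mentioned in this abstract can be made concrete with a small check. The sketch below tests one common reading of the marginality principle (strong hierarchy: an interaction may enter only if both of its parent main effects are selected). The function name and the index-based representation are hypothetical, and this is not the RAMP algorithm itself.

```python
def respects_strong_hierarchy(main_effects, interactions):
    """Check whether a selected quadratic model keeps strong hierarchy.

    main_effects: iterable of selected covariate indices.
    interactions: iterable of (j, k) index pairs for selected terms x_j * x_k;
                  a squared term x_j ** 2 is written as (j, j).
    Returns True when every interaction has both parents among the main effects.
    """
    selected = set(main_effects)
    return all(j in selected and k in selected for j, k in interactions)
```

For example, a model with main effects 0 and 2 may include the pair (0, 2), while a model with only main effect 0 may not.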
Variable selection for the multicategory SVM via adaptive sup-norm regularization
The Support Vector Machine (SVM) is a popular classification paradigm in
machine learning and has achieved great success in real applications. However,
the standard SVM cannot select variables automatically and therefore its
solution typically utilizes all the input variables without discrimination.
This makes it difficult to identify important predictor variables, which is
often one of the primary goals in data analysis. In this paper, we propose two
novel types of regularization in the context of the multicategory SVM (MSVM)
for simultaneous classification and variable selection. The MSVM generally
requires estimation of multiple discriminating functions and applies the argmax
rule for prediction. For each individual variable, we propose to characterize
its importance by the supnorm of its coefficient vector associated with
different functions, and then minimize the MSVM hinge loss function subject to
a penalty on the sum of supnorms. To further improve the supnorm penalty, we
propose the adaptive regularization, which allows different weights imposed on
different variables according to their relative importance. Both types of
regularization automate variable selection in the process of building
classifiers, and lead to sparse multi-classifiers with enhanced
interpretability and improved accuracy, especially for high dimensional low
sample size data. One big advantage of the supnorm penalty is its easy
implementation via standard linear programming. Several simulated examples and
one real gene data analysis demonstrate the outstanding performance of the
adaptive supnorm penalty in various data settings.
Comment: Published at http://dx.doi.org/10.1214/08-EJS122 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org
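The penalty described in this abstract (the sum over variables of the supnorm of each variable's coefficient vector across the discriminating functions) can be sketched as follows. The function name, the row-per-variable layout, and the optional weights argument standing in for the adaptive version are assumptions for illustration; the hinge loss and the linear-programming solver are not shown.

```python
def supnorm_penalty(coefs, weights=None):
    """Sum over variables of w_j * max_k |coefs[j][k]|.

    coefs[j] holds the coefficients of variable j in each of the
    discriminating functions (one entry per class); rows must be non-empty.
    weights, if given, holds one non-negative weight per variable, as in an
    adaptive penalty; by default every variable gets weight 1.
    """
    if weights is None:
        weights = [1.0] * len(coefs)
    return sum(w * max(abs(c) for c in row) for w, row in zip(weights, coefs))
```

A variable whose coefficients are zero in every function contributes nothing to the penalty, which is why shrinking a row's supnorm to zero removes that variable from all classifiers at once.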
Modeling and analysis of a reverse supply chain network for lead-acid battery manufacturing.
The North American lead-acid battery industry gains its environmental edge from its use of closed-loop life cycle production. Nowadays, the typical new lead-acid battery contains 60 to 80 percent recycled lead and plastics. This thesis investigates the closed-loop supply chain of a lead-acid battery manufacturing process, which extends the traditional supply chain to the entire product life cycle. A new tactical planning model has been developed for the entire closed-loop manufacturing process, including purchasing, production, and end-of-life product return and recycling. The model is a multi-objective, multi-echelon mixed integer linear programming model, which minimizes the total costs and the total transportation pollution emissions, subject to structural and functional constraints. Decisions are made regarding material procurement, production, recycling and inventory levels, and the transportation modes between the echelons. Sensitivity analysis has been performed to evaluate the integration with third party outsourcing, changes in parameters and design options.*
*This dissertation is a compound document (contains both a paper copy and a CD as part of the dissertation).
Dept. of Industrial and Manufacturing Systems Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2006 .Z435. Source: Masters Abstracts International, Volume: 45-01, page: 0440. Thesis (M.A.Sc.)--University of Windsor (Canada), 2006