Feature Grouping Using Weighted L1 Norm For High-Dimensional Data

Author: Padthe Karthik Kumar
Publication venue: DigitalCommons@WayneState
Publication date: 01/01/2016

Building effective prediction models from high-dimensional data is an important problem in several domains, such as bioinformatics, healthcare analytics and general regression analysis. Extracting feature groups automatically from such data, which typically contain several correlated features, is necessary in order to use regularizers such as the group lasso, which can exploit the recovered grouping structure to build effective prediction models. Elastic net, fused lasso and the Octagonal Shrinkage and Clustering Algorithm for Regression (OSCAR) are some of the popular feature grouping methods proposed in the literature that recover both sparsity and feature groups from the data. However, their predictive ability is adversely affected when the regression coefficients of adjacent feature groups are similar but not exactly equal. This happens because these methods erroneously merge such adjacent feature groups, which is widely known as the misfusion problem. To solve this problem, in this thesis we propose a weighted L1 norm-based approach that is effective at recovering feature groups despite the proximity of the coefficients of adjacent feature groups, and that builds accurate prediction models. The resulting convex optimization problem is solved using the fast iterative soft-thresholding algorithm (FISTA). We show that our approach is more successful than competing feature grouping methods such as the elastic net, fused lasso and OSCAR at solving the misfusion problem on synthetic datasets. We also compare the goodness of prediction of our algorithm against state-of-the-art non-convex feature grouping methods when applied to a real-world breast cancer dataset, the 20-Newsgroups dataset and synthetic datasets.
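
As a rough illustration of the FISTA step mentioned in this abstract, the sketch below minimizes a least-squares loss plus a per-coefficient weighted L1 penalty. This is an assumption made for illustration: the abstract does not state the thesis's exact penalty or weights, and the names `fista_weighted_l1`, `weights` and `lam` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the (weighted) L1 norm, applied elementwise."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_weighted_l1(X, y, weights, lam, n_iter=500):
    """Minimize 0.5 * ||X b - y||^2 + lam * sum_j weights[j] * |b[j]|.

    Illustrative only: a plain weighted-L1 objective, not necessarily
    the formulation used in the thesis.
    """
    p = X.shape[1]
    # Lipschitz constant of the gradient of the smooth part
    # (squared spectral norm of X).
    L = np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    z = beta.copy()  # extrapolated point
    t = 1.0          # momentum parameter
    for _ in range(n_iter):
        grad = X.T @ (X @ z - y)
        # Gradient step followed by weighted soft-thresholding.
        beta_next = soft_threshold(z - grad / L, lam * weights / L)
        # Standard FISTA momentum update.
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = beta_next + ((t - 1.0) / t_next) * (beta_next - beta)
        beta, t = beta_next, t_next
    return beta
```

With an identity design matrix the problem separates by coordinate, so the solution is the weighted soft-thresholding of `y`, which makes a convenient sanity check.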

Digital Commons@Wayne State University

Feature Grouping Using Weighted L1 Norm For High-Dimensional Data

Author: Curi Edda
Maciel Maria Delourdes
Pereira Carlos Luis
Publication venue: DigitalCommons@WayneState
Publication date: 20/01/2014

Digital Commons@Wayne State University

Red de Bibliotecas Virtuales de Ciencias Sociales de América Latina y El Caribe

Learning Credible Models

Author: Lipton Zachary C
Silva Ikaro
Sun Jimeng
Velikova Marina
Wiens Jenna
Zhao Peng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/06/2018

In many settings, it is important that a model be capable of providing reasons for its predictions (i.e., the model must be interpretable). However, the model's reasoning may not conform with well-established knowledge. In such cases, while interpretable, the model lacks credibility. In this work, we formally define credibility in the linear setting and focus on techniques for learning models that are both accurate and credible. In particular, we propose a regularization penalty, expert yielded estimates (EYE), that incorporates expert knowledge about well-known relationships among covariates and the outcome of interest. We give both theoretical and empirical results comparing our proposed method to several other regularization techniques. Across a range of settings, experiments on both synthetic and real data show that models learned using the EYE penalty are significantly more credible than those learned using other penalties. Applied to a large-scale patient risk stratification task, our proposed technique results in a model whose top features overlap significantly with known clinical risk factors, while still achieving good predictive performance.

arXiv.org e-Print Archive

Crossref

Manifold Elastic Net: A Unified Framework for Sparse Dimension Reduction

Author: A D’aspremont
B Efron
C Fyfe
CHQ Ding
CM Bishop
D Cai
D Tao
Dacheng Tao
DB Graham
DL Donoho
E Candes
EJ Candes
GH Golub
GM James
H Hotelling
H Zou
HP Kriegel
J Fan
J Lv
JB Tenenbaum
M Belkin
PJ Phillips
PN Belhumeur
R Tibshirani
RA Fisher
S Yan
ST Roweis
T Li
T Zhang
Tianyi Zhou
X He
X Li
Xindong Wu
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/07/2010

It is difficult to find the optimal sparse solution of a manifold learning based dimensionality reduction algorithm. Lasso or elastic net penalized manifold learning based dimensionality reduction is not directly a lasso penalized least squares problem, and thus least angle regression (LARS) (Efron et al.), one of the most popular algorithms in sparse learning, cannot be applied. Therefore, most current approaches take indirect routes or impose strict settings, which can be inconvenient for applications. In this paper, we propose the manifold elastic net, or MEN for short. MEN incorporates the merits of both manifold learning based and sparse learning based dimensionality reduction. By using a series of equivalent transformations, we show that MEN is equivalent to a lasso penalized least squares problem, and thus LARS can be adopted to obtain the optimal sparse solution of MEN. In particular, MEN has the following advantages for subsequent classification: 1) the local geometry of samples is well preserved in the low-dimensional data representation, 2) both margin maximization and classification error minimization are considered in the sparse projection calculation, 3) the projection matrix of MEN improves parsimony in computation, 4) the elastic net penalty reduces over-fitting, and 5) the projection matrix of MEN can be interpreted psychologically and physiologically. Experimental evidence on face recognition over various popular datasets suggests that MEN is superior to top-level dimensionality reduction algorithms. Comment: 33 pages, 12 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

OPUS - University of Technology Sydney

Combining Quadratic Penalization and Variable Selection via Forward Boosting

Author: Tutz Gerhard
Ulbricht Jan
Publication date: 01/01/2011

Quadratic penalties can be used to incorporate external knowledge about the association structure among regressors. Unfortunately, they do not force individual estimated regression coefficients to be exactly zero. In this paper we propose a new approach that combines quadratic penalization and variable selection within the framework of generalized linear models. The new method is called Forward Boosting and is related to componentwise boosting techniques. We demonstrate in simulation studies and a real-world data example that the new approach competes well with existing alternatives, especially when the focus is on interpretable structuring of predictors.
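
The observation that quadratic penalties do not produce exact zeros can be seen from the closed-form solution of a quadratically penalized least-squares fit. The sketch below is only an illustration of that point under assumed notation (a symmetric penalty matrix `omega` and strength `lam`, both hypothetical names); it is not the Forward Boosting method itself.

```python
import numpy as np

def quadratic_penalized_ls(X, y, omega, lam):
    """Minimize 0.5 * ||y - X b||^2 + 0.5 * lam * b^T omega b.

    For a symmetric positive semidefinite omega, setting the gradient
    to zero gives the linear system (X^T X + lam * omega) b = X^T y.
    """
    return np.linalg.solve(X.T @ X + lam * omega, X.T @ y)

# Toy example: identity design and identity penalty (ridge regression).
X = np.eye(3)
y = np.array([3.0, 0.5, -2.0])
b = quadratic_penalized_ls(X, y, omega=np.eye(3), lam=1.0)
# Each coefficient is shrunk toward zero (here halved) but none is
# set exactly to zero, so no variable is removed from the model.
```

In this toy case even the small coefficient 0.5 is only shrunk, which is why a separate selection mechanism is needed when sparsity is desired.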

Open Access LMU