1,379 research outputs found

Regularization Paths for Generalized Linear Models via Coordinate Descent

Authors: Jerome H. Friedman; Rob Tibshirani; Trevor Hastie

We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinomial regression problems, while the penalties include $\ell_1$ (the lasso), $\ell_2$ (ridge regression), and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent, computed along a regularization path. The methods can handle large problems and can also deal efficiently with sparse features. In comparative timings we find that the new algorithms are considerably faster than competing methods.
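The cyclical coordinate descent idea in this abstract can be sketched for the simplest case, linear regression with an elastic-net penalty. This is a minimal illustration, not the authors' implementation: the function names, the objective scaling $\frac{1}{2n}\lVert y - X\beta\rVert^2 + \lambda\,[\alpha\lVert\beta\rVert_1 + \tfrac{1-\alpha}{2}\lVert\beta\rVert_2^2]$, and the stopping rule are all assumptions made for the example, and an intercept and feature standardization are omitted.

```python
import numpy as np


def soft_threshold(z, gamma):
    """Soft-thresholding operator S(z, gamma) = sign(z) * max(|z| - gamma, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)


def elastic_net_cd(X, y, lam, alpha=1.0, n_sweeps=200, tol=1e-8, beta0=None):
    """Cyclical coordinate descent for least squares with an elastic-net penalty.

    Hypothetical helper: minimizes
        (1/2n)||y - X b||^2 + lam * (alpha*||b||_1 + (1-alpha)/2 * ||b||_2^2).
    """
    n, p = X.shape
    beta = np.zeros(p) if beta0 is None else np.array(beta0, dtype=float)
    resid = y - X @ beta                # full residual, updated in place
    col_sq = (X ** 2).sum(axis=0) / n   # (1/n) * sum_i x_ij^2 for each column
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                # an all-zero column carries no signal
            old = beta[j]
            # Correlation of column j with the partial residual (j excluded).
            rho = X[:, j] @ resid / n + col_sq[j] * old
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            if new != old:
                resid -= X[:, j] * (new - old)
                beta[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:            # a full sweep changed nothing material
            break
    return beta


def regularization_path(X, y, alpha=1.0, n_lambdas=20, eps=1e-3):
    """Solve on a decreasing lambda grid, warm-starting from the previous fit.

    Assumes alpha > 0 so that the largest useful lambda is finite.
    """
    n = X.shape[0]
    lam_max = np.max(np.abs(X.T @ y)) / (n * alpha)  # smallest lam with beta = 0
    lams = lam_max * np.logspace(0.0, np.log10(eps), n_lambdas)
    beta = np.zeros(X.shape[1])
    path = []
    for lam in lams:
        beta = elastic_net_cd(X, y, lam, alpha, beta0=beta)
        path.append(beta.copy())
    return lams, np.array(path)
```

Calling `regularization_path(X, y, alpha=0.5)` returns the lambda grid and one coefficient vector per grid point. The warm start is what makes computing the whole path cheap: each solve begins from the solution at the previous, slightly larger lambda.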

Research Papers in Economics

Linear Bandits with Feature Feedback

Authors: Bhargava, Aniruddha; Nowak, Robert; Oswal, Urvashi
Publication date: 11/03/2019

This paper explores a new form of the linear bandit problem in which the algorithm receives the usual stochastic rewards as well as stochastic feedback about which features are relevant to the rewards, the latter feedback being the novel aspect. The focus of this paper is the development of new theory and algorithms for linear bandits with feature feedback. We show that linear bandits with feature feedback can achieve regret over time horizon $T$ that scales like $k\sqrt{T}$, without prior knowledge of which features are relevant nor the number $k$ of relevant features. In comparison, the regret of traditional linear bandits is $d\sqrt{T}$, where $d$ is the total number of (relevant and irrelevant) features, so the improvement can be dramatic if $k \ll d$. The computational complexity of the new algorithm is proportional to $k$ rather than $d$, making it much more suitable for real-world applications compared to traditional linear bandits. We demonstrate the performance of the new algorithm with synthetic and real human-labeled data.
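To give a sense of scale for the two regret rates, here is a worked comparison with illustrative values. The numbers $d = 1000$, $k = 10$, and $T = 10^4$ are assumptions chosen for the example, not figures from the paper, and constants and logarithmic factors are ignored.

```latex
\[
d = 1000,\quad k = 10,\quad T = 10^{4}
\;\Longrightarrow\;
d\sqrt{T} = 1000 \cdot 100 = 10^{5},
\qquad
k\sqrt{T} = 10 \cdot 100 = 10^{3},
\qquad
\frac{d\sqrt{T}}{k\sqrt{T}} = \frac{d}{k} = 100 .
\]
```

Under these assumed values the gap between the two rates is the factor $d/k$, independent of the horizon $T$.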

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Scalable Sparse Cox's Regression for Large-Scale Survival Data via Broken Adaptive Ridge

Authors: Kawaguchi, Eric S.; Li, Gang; Liu, Zhenqiu; Suchard, Marc A.
Publication venue: 'Wiley'
Publication date: 25/07/2018

This paper develops a new scalable sparse Cox regression tool for sparse high-dimensional massive sample size (sHDMSS) survival data. The method is a local $L_0$-penalized Cox regression via repeatedly performing reweighted $L_2$-penalized Cox regression. We show that the resulting estimator enjoys the best of $L_0$- and $L_2$-penalized Cox regressions while overcoming their limitations. Specifically, the estimator is selection consistent, oracle for parameter estimation, and possesses a grouping property for highly correlated covariates. Simulation results suggest that when the sample size is large, the proposed method with pre-specified tuning parameters has a comparable or better performance than some popular penalized regression methods. More importantly, because the method naturally enables adaptation of efficient algorithms for massive $L_2$-penalized optimization and does not require costly data-driven tuning parameter selection, it has a significant computational advantage for sHDMSS data, offering an average of 5-fold speedup over its closest competitor in empirical studies.
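The "repeatedly reweighted $L_2$" idea can be illustrated on ordinary least squares instead of the Cox partial likelihood. That substitution is deliberate, to keep the sketch short, so this is not the paper's method. The function name, the defaults for `lam` and `xi`, and the stopping rule are assumptions. Each iteration solves a ridge problem whose penalty on coefficient $j$ is weighted by $1/\beta_j^2$ from the previous iterate, which drives small coefficients toward zero.

```python
import numpy as np


def broken_adaptive_ridge(X, y, lam=1.0, xi=1.0, n_iter=100, tol=1e-10):
    """Iteratively reweighted ridge for least squares (illustrative only).

    Hypothetical helper. Each step solves
        argmin_b ||y - X b||^2 + lam * sum_j b_j^2 / beta_prev_j^2
    using the equivalent form  b = G (G X'X G + lam I)^{-1} G X'y  with
    G = diag(beta_prev), which stays well defined when entries reach zero.
    """
    p = X.shape[1]
    gram = X.T @ X
    xty = X.T @ y
    # Start from an ordinary ridge fit with penalty xi.
    beta = np.linalg.solve(gram + xi * np.eye(p), xty)
    for _ in range(n_iter):
        scaled_gram = beta[:, None] * gram * beta[None, :]   # G X'X G
        new = beta * np.linalg.solve(scaled_gram + lam * np.eye(p), beta * xty)
        converged = np.max(np.abs(new - beta)) < tol
        beta = new
        if converged:
            break
    return beta
```

On data with a few strong signals and many irrelevant features, the irrelevant coefficients shrink rapidly toward zero across iterations while the strong ones stay close to their least-squares values. This is the sense in which a sequence of $L_2$ problems approximates an $L_0$ penalty.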

arXiv.org e-Print Archive

eScholarship - University of California

HIPAD - A Hybrid Interior-Point Alternating Direction algorithm for knowledge-based SVM and feature selection

Authors: D Gabay; D Lewis; E Gertz; F Lauer; H Zou; J Eckstein; J Shi; K Koh; L Wang; P Combettes; PM Pardalos; R Rockafellar; R Tibshirani; RL Iman; S Boyd; S Mizuno; S Ryali; S Sra; V Vapnik
Publication date: 16/11/2014

We consider classification tasks in the regime of scarce labeled training data in high dimensional feature space, where specific expert knowledge is also available. We propose a new hybrid optimization algorithm that solves the elastic-net support vector machine (SVM) through an alternating direction method of multipliers in the first phase, followed by an interior-point method for the classical SVM in the second phase. Both SVM formulations are adapted to knowledge incorporation. Our proposed algorithm addresses the challenges of automatic feature selection, high optimization accuracy, and algorithmic flexibility for taking advantage of prior knowledge. We demonstrate the effectiveness and efficiency of our algorithm and compare it with existing methods on a collection of synthetic and real-world data. Comment: Proceedings of 8th Learning and Intelligent OptimizatioN (LION8) Conference, 201

arXiv.org e-Print Archive

Crossref

Feature selection guided by structural information

Authors: Castell, Wolfgang zu; Slawski, Martin; Tutz, Gerhard
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 09/03/2009

In generalized linear regression problems with an abundant number of features, lasso-type regularization which imposes an $\ell^1$-constraint on the regression coefficients has become a widely established technique. Deficiencies of the lasso in certain scenarios, notably strongly correlated design, were unmasked when Zou and Hastie [J. Roy. Statist. Soc. Ser. B 67 (2005) 301--320] introduced the elastic net. In this paper we propose to extend the elastic net by admitting general nonnegative quadratic constraints as a second form of regularization. The generalized ridge-type constraint will typically make use of the known association structure of features, for example, by using temporal or spatial closeness. We study properties of the resulting "structured elastic net" regression estimation procedure, including basic asymptotics and the issue of model selection consistency. In this vein, we provide an analog to the so-called "irrepresentable condition" which holds for the lasso. Moreover, we outline algorithmic solutions for the structured elastic net within the generalized linear model family. The rationale and the performance of our approach are illustrated by means of simulated and real world data, with a focus on signal regression. Comment: Published in at http://dx.doi.org/10.1214/09-AOAS302 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
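One way to write down the estimator described in this abstract, in penalized (Lagrangian) form, is sketched below. The notation is assumed for illustration and may differ from the paper's: $L(\beta)$ stands for the negative log-likelihood of the generalized linear model, and $\Lambda$ is a symmetric positive semidefinite matrix encoding the association structure of the features.

```latex
\[
\hat{\beta} \in \arg\min_{\beta \in \mathbb{R}^{p}}
\; L(\beta) + \lambda_1 \lVert \beta \rVert_1 + \lambda_2\, \beta^{\top} \Lambda \beta,
\qquad \lambda_1, \lambda_2 \ge 0,\quad \Lambda \succeq 0 .
\]
% Example of a structure matrix for temporally ordered features:
% D is the (p-1) x p first-difference matrix, so the quadratic term
% penalizes differences between neighbouring coefficients.
\[
\Lambda = D^{\top} D
\quad\Longrightarrow\quad
\beta^{\top} \Lambda \beta = \sum_{j=2}^{p} (\beta_j - \beta_{j-1})^{2} .
\]
```

Setting $\Lambda = I$ recovers the ordinary elastic net penalty, so under this reading the structured version generalizes only the ridge part.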

arXiv.org e-Print Archive

Crossref

Open Access LMU

PuSH