10,677 research outputs found

    Variable selection via Lasso with high-dimensional proteomic data

    Multiclass classification with high-dimensional data is an applied topic in both statistics and machine learning, and the classification can be carried out in various ways. In this thesis, we review the theory of the Lasso procedure, which provides a parameter estimator while simultaneously achieving dimension reduction thanks to a property of the L1 norm. The Lasso with an elastic net penalty and the sparse group lasso are also reviewed. Our data are high-dimensional proteomic data (iTRAQ ratios) from breast cancer patients with four subtypes of breast cancer. We use multinomial logistic regression to train our classifier and use the misclassification rates obtained from cross-validation to compare models.
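The selection mechanism the abstract describes can be sketched in a few lines: an L1 penalty on a multinomial logistic model zeroes out irrelevant coefficients while the classifier is fitted. The sketch below uses proximal gradient descent (ISTA) on synthetic data; the dimensions, tuning values and data-generating process are illustrative assumptions, not the thesis's iTRAQ dataset or its exact procedure.

```python
import numpy as np

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def soft_threshold(B, t):
    # proximal operator of t * ||B||_1, applied elementwise
    return np.sign(B) * np.maximum(np.abs(B) - t, 0.0)

def lasso_multinomial(X, y, lam=0.1, lr=0.1, n_iter=500):
    """L1-penalised multinomial logistic regression fitted by proximal
    gradient descent (ISTA). Returns a (p, K) coefficient matrix; the
    L1 penalty drives coefficients of irrelevant features to exactly zero."""
    n, p = X.shape
    K = y.max() + 1
    Y = np.eye(K)[y]                          # one-hot targets
    B = np.zeros((p, K))
    for _ in range(n_iter):
        G = X.T @ (softmax(X @ B) - Y) / n    # gradient of the smooth loss
        B = soft_threshold(B - lr * G, lr * lam)
    return B

# synthetic stand-in: 4 classes driven by only 3 of 50 features
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.normal(size=(n, p))
M = np.array([[1., -1, 0, 0], [0, 1, -1, 0], [0, 0, 1, -1]])
y = (X[:, :3] @ M).argmax(axis=1)

B = lasso_multinomial(X, y, lam=0.1)
active = np.where(np.abs(B).sum(axis=1) > 1e-8)[0]
print("selected features:", active)           # a small subset of the 50 columns
```

The same simultaneous estimation-and-selection is what makes the Lasso attractive for proteomic data, where the number of iTRAQ ratios far exceeds the number of patients.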

    Adaptive sparse group LASSO in quantile regression

    [EN] This paper studies the introduction of the sparse group LASSO (SGL) to the quantile regression framework. Additionally, a more flexible version, an adaptive SGL, is proposed based on the adaptive idea, that is, the use of adaptive weights in the penalization. Adaptive estimators usually focus on the study of the oracle property under asymptotic and double-asymptotic frameworks. A key step in the demonstration of this property is to consider adaptive weights based on an initial root-n-consistent estimator. In practice this implies the use of a non-penalized estimator, which limits the adaptive solutions to low-dimensional scenarios. In this work, several solutions based on the dimension reduction techniques PCA and PLS are studied for the calculation of these weights in high-dimensional frameworks. The benefits of this proposal are studied on both synthetic and real datasets.
    Mendez-Civieta, A.; Aguilera-Morillo, MC.; Lillo, RE. (2021). Adaptive sparse group LASSO in quantile regression. Advances in Data Analysis and Classification 15:547-573. https://doi.org/10.1007/s11634-020-00413-8
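The key step of the proposal above — building adaptive weights when an unpenalized initial estimator is infeasible because p > n — can be sketched via the PLS route: fit a few partial least squares components, map the low-dimensional coefficients back to the original variables, and invert their magnitudes. The function name, component count and synthetic data below are illustrative assumptions; the paper's estimator is the adaptive SGL quantile regression built on top of such weights.

```python
import numpy as np

def pls_adaptive_weights(X, y, n_comp=3, gamma=1.0, eps=1e-8):
    """Adaptive-weight sketch for high-dimensional settings: replace the
    unpenalised initial fit by a PLS1 regression on a few components,
    recover coefficients beta0 in the original coordinates, and set
    w_j = 1 / (|beta0_j| + eps)**gamma."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    Xr, yr = Xc.copy(), yc.copy()
    W, P, q = [], [], []
    for _ in range(n_comp):
        w_dir = Xr.T @ yr                     # direction of max covariance with y
        w_dir = w_dir / np.linalg.norm(w_dir)
        t = Xr @ w_dir                        # component scores
        p_load = Xr.T @ t / (t @ t)           # X loadings
        q_c = (yr @ t) / (t @ t)              # y loading
        Xr = Xr - np.outer(t, p_load)         # deflate
        yr = yr - q_c * t
        W.append(w_dir); P.append(p_load); q.append(q_c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta0 = W @ np.linalg.solve(P.T @ W, q)   # standard PLS1 coefficients
    return 1.0 / (np.abs(beta0) + eps) ** gamma

# p > n, so an ordinary non-penalized initial fit would not exist
rng = np.random.default_rng(1)
n, p = 50, 200
X = rng.normal(size=(n, p))
y = 2 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=0.5, size=n)
w = pls_adaptive_weights(X, y)
print(w[0], w[1], np.median(w))
```

Because PLS is supervised, the truly relevant variables receive comparatively small weights, so the subsequent adaptive penalty shrinks them less — the behaviour the oracle property asks of the weights.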

    Sparse Linear Models applied to Power Quality Disturbance Classification

    Power quality (PQ) analysis describes the non-pure electric signals that are usually present in electric power systems. The automatic recognition of PQ disturbances can be seen as a pattern recognition problem in which different types of waveform distortion are differentiated based on their features. Like other quasi-stationary signals, PQ disturbances can be decomposed into time-frequency-dependent components using time-frequency or time-scale transforms, also known as dictionaries. These dictionaries are used in the feature-extraction step of pattern recognition systems. The short-time Fourier, wavelet and Stockwell transforms are among the most common dictionaries used in the PQ community to achieve a better signal representation. To the best of our knowledge, previous work on PQ disturbance classification has been restricted to a single dictionary among the several available. Taking advantage of the theory behind sparse linear models (SLM), we introduce a sparse method for PQ representation that starts from overcomplete dictionaries; in particular, we apply the Group Lasso. We employ different types of time-frequency (or time-scale) dictionaries to characterize the PQ disturbances and evaluate their performance under different pattern recognition algorithms. We show that SLM reduce the complexity of PQ classification by promoting sparse basis selection, while improving classification accuracy.
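A minimal sketch of the dictionary-selection idea: stack two sub-dictionaries into one overcomplete design, penalize each sub-dictionary's coefficients as a group, and let the Group Lasso switch off the dictionary that does not help represent the signal. The random "dictionaries", sizes and tuning values below are illustrative assumptions, not the actual transforms or classifiers evaluated in the paper.

```python
import numpy as np

def group_soft_threshold(b, t):
    """Proximal operator of t * ||b||_2: shrink the whole group toward
    zero, and zero it out entirely when its norm is at most t."""
    nrm = np.linalg.norm(b)
    return np.zeros_like(b) if nrm <= t else (1 - t / nrm) * b

def group_lasso(D, y, groups, lam=0.05, n_iter=500):
    """min_b 0.5/n * ||y - D b||^2 + lam * sum_g sqrt(p_g) * ||b_g||_2,
    solved by proximal gradient descent. `groups` maps each column of the
    stacked dictionary D to a group id, so whole sub-dictionaries can be
    discarded at once."""
    n, p = D.shape
    b = np.zeros(p)
    L = np.linalg.norm(D, 2) ** 2 / n          # Lipschitz constant of the gradient
    for _ in range(n_iter):
        g = D.T @ (D @ b - y) / n
        z = b - g / L                          # gradient step
        for gid in np.unique(groups):          # groupwise proximal step
            idx = groups == gid
            b[idx] = group_soft_threshold(z[idx], lam * np.sqrt(idx.sum()) / L)
    return b

rng = np.random.default_rng(2)
n = 100
D1 = rng.normal(size=(n, 10))                  # stand-in for, e.g., wavelet atoms
D2 = rng.normal(size=(n, 10))                  # stand-in for, e.g., Fourier atoms
D = np.hstack([D1, D2])
groups = np.repeat([0, 1], 10)
y = D1 @ rng.normal(size=10)                   # signal representable by dictionary 1 only
b = group_lasso(D, y, groups)
print(np.linalg.norm(b[:10]), np.linalg.norm(b[10:]))
```

Zeroing an entire group corresponds to dropping that transform from the representation, which is the reduction in classification complexity the abstract refers to.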