Penalized Estimation Methods and Their Applications in Genomics and Beyond

Abstract

Various forms of penalty functions have been developed for regularized estimation. The tuning parameters of a penalty function play a key role in shrinking the noise coefficients to zero while obtaining unbiased estimates of the true signals. For penalty functions with more than one tuning parameter, previous studies have not emphasized the joint effect of all the tuning parameters. In the first topic, we conduct a theoretical analysis that relates the ranges of the tuning parameters of penalty functions to the dimensionality of the problem and the minimum effect size. We illustrate our theoretical results with several well-known penalty functions. The results suggest that a class of penalty functions that bridges the L_0 and L_1 penalties requires less restrictive conditions for variable selection consistency. Simulation and real data analyses support these theoretical results. For the second topic, we consider the problem of identifying genomic features that predict cancer drug sensitivity. Several drugs that share a molecular target may also have some predictive features in common, so it is desirable to analyze these drugs as a group when identifying the associated genomic features. Motivated by this problem, we develop a new method for high-dimensional feature selection using a group of responses that may share a common set of predictors in addition to their individual predictors. Simulation results show that our method performs better than existing methods. Between-study validation in real data shows that the genomic features selected for a drug target can form good predictors for other drugs designed for the same target. For the third topic, we address an estimation problem in which certain parameter values, such as 0, cause an identifiability issue. In the maximum likelihood estimation framework, because of the unidentifiable parameter, the maximum likelihood estimator has regular properties only if the likelihood function is specified correctly with respect to the parameter values. We propose a penalized estimation procedure using the adaptive Lasso penalty to address the potential identifiability issue. We study the asymptotic properties of the proposed estimator and evaluate our method in extensive simulations and real data analysis.

Doctor of Philosophy
