161 research outputs found

    Relax and penalize: a new bilevel approach to mixed-binary hyperparameter optimization

    Full text link
    In recent years, bilevel approaches have become very popular to efficiently estimate high-dimensional hyperparameters of machine learning models. However, to date, binary parameters are handled by continuous relaxation and rounding strategies, which could lead to inconsistent solutions. In this context, we tackle the challenging optimization of mixed-binary hyperparameters by resorting to an equivalent continuous bilevel reformulation based on an appropriate penalty term. We propose an algorithmic framework that, under suitable assumptions, is guaranteed to provide mixed-binary solutions. Moreover, the generality of the method allows to safely use existing continuous bilevel solvers within the proposed framework. We evaluate the performance of our approach for a specific machine learning problem, i.e., the estimation of the group-sparsity structure in regression problems. Reported results clearly show that our method outperforms state-of-the-art approaches based on relaxation and roundin

    Incorporating Pathway Information into Feature Selection Towards Better Performed Gene Signatures

    Get PDF
    To analyze gene expression data with sophisticated grouping structures and to extract hidden patterns from such data, feature selection is of critical importance. It is well known that genes do not function in isolation but rather work together within various metabolic, regulatory, and signaling pathways. If the biological knowledge contained within these pathways is taken into account, the resulting method is a pathway-based algorithm. Studies have demonstrated that a pathway-based method usually outperforms its gene-based counterpart in which no biological knowledge is considered. In this article, a pathway-based feature selection is firstly divided into three major categories, namely, pathway-level selection, bilevel selection, and pathway-guided gene selection. With bilevel selection methods being regarded as a special case of pathway-guided gene selection process, we discuss pathway-guided gene selection methods in detail and the importance of penalization in such methods. Last, we point out the potential utilizations of pathway-guided gene selection in one active research avenue, namely, to analyze longitudinal gene expression data. We believe this article provides valuable insights for computational biologists and biostatisticians so that they can make biology more computable

    Prediction of cancer drug sensitivity using high-dimensional omic features

    Get PDF
    A large number of cancer drugs have been developed to target particular genes/pathways that are crucial for cancer growth. Drugs that share a molecular target may also have some common predictive omic features, e.g., somatic mutations or gene expression. Therefore, it is desirable to analyze these drugs as a group to identify the associated omic features, which may provide biological insights into the underlying drug response. Furthermore, these omic features may be robust predictors for any drug sharing the same target. The high dimensionality and the strong correlations among the omic features are the main challenges of this task. Motivated by this problem, we develop a new method for high-dimensional bilevel feature selection using a group of response variables that may share a common set of predictors in addition to their individual predictors. Simulation results show that our method has a substantially higher sensitivity and specificity than existing methods. We apply our method to two large-scale drug sensitivity studies in cancer cell lines. Both within-study and between-study validation demonstrate the good efficacy of our method

    Detection of rare functional variants using group ISIS

    Get PDF
    Genome-wide association studies have been firmly established in investigations of the associations between common genetic variants and complex traits or diseases. However, a large portion of complex traits and diseases cannot be explained well by common variants. Detecting rare functional variants becomes a trend and a necessity. Because rare variants have such a small minor allele frequency (e.g., <0.05), detecting functional rare variants is challenging. Group iterative sure independence screening (ISIS), a fast group selection tool, was developed to select important genes and the single-nucleotide polymorphisms within. The performance of the group ISIS and group penalization methods is compared for detecting important genes in the Genetic Analysis Workshop 17 data. The results suggest that the group ISIS is an efficient tool to discover genes and single-nucleotide polymorphisms associated to phenotypes

    Hyperparameter optimization with approximate gradient

    Full text link
    Most models in machine learning contain at least one hyperparameter to control for model complexity. Choosing an appropriate set of hyperparameters is both crucial in terms of model accuracy and computationally challenging. In this work we propose an algorithm for the optimization of continuous hyperparameters using inexact gradient information. An advantage of this method is that hyperparameters can be updated before model parameters have fully converged. We also give sufficient conditions for the global convergence of this method, based on regularity conditions of the involved functions and summability of errors. Finally, we validate the empirical performance of this method on the estimation of regularization constants of L2-regularized logistic regression and kernel Ridge regression. Empirical benchmarks indicate that our approach is highly competitive with respect to state of the art methods.Comment: Proceedings of the International conference on Machine Learning (ICML
    corecore