
    Pathway-Based Genomics Prediction using Generalized Elastic Net.

    We present a novel regularization scheme called the Generalized Elastic Net (GELnet) that incorporates gene pathway information into feature selection. The proposed formulation is applicable to a wide variety of problems in which the interpretation of predictive features using known molecular interactions is desired. The method naturally steers solutions toward sets of mechanistically interlinked genes. Using experiments on synthetic data, we demonstrate that pathway-guided results maintain, and often improve, the accuracy of predictors even in cases where the full gene network is unknown. We apply the method to predict the drug response of breast cancer cell lines. GELnet is able to reveal genetic determinants of sensitivity and resistance for several compounds. In particular, for an EGFR/HER2 inhibitor, it finds a possible trans-differentiation resistance mechanism missed by the corresponding pathway-agnostic approach.

    Effective Discriminative Feature Selection with Non-trivial Solutions

    Feature selection and feature transformation, the two main ways to reduce dimensionality, are often presented separately. In this paper, a feature selection method is proposed by combining the popular transformation-based dimensionality reduction method Linear Discriminant Analysis (LDA) with sparsity regularization. We impose row sparsity on the transformation matrix of LDA through $\ell_{2,1}$-norm regularization to achieve feature selection, and the resultant formulation simultaneously selects the most discriminative features and removes the redundant ones. The formulation is extended to the $\ell_{2,p}$-norm regularized case, which is more likely to offer better sparsity when $0 < p < 1$. Thus the formulation is a better approximation to the feature selection problem. An efficient algorithm is developed to solve the $\ell_{2,p}$-norm based optimization problem, and it is proved that the algorithm converges when $0 < p \le 2$. Systematic experiments are conducted to understand the behavior of the proposed method. Promising experimental results on various types of real-world data sets demonstrate the effectiveness of our algorithm.
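
As an illustration of the row-sparsity idea in the abstract above, the following is a minimal sketch of the standard $\ell_{2,p}$ (quasi-)norm, $\|W\|_{2,p} = \left(\sum_i \|w_i\|_2^p\right)^{1/p}$, where $w_i$ is the $i$-th row of the transformation matrix. It is not the paper's algorithm: the function names `l2p_norm` and `selected_features`, the tolerance `tol`, and the toy matrix are assumptions made for this example, and the abstract does not specify how rows are thresholded.

```python
import math

def l2p_norm(W, p):
    """l_{2,p} (quasi-)norm of a matrix given as a list of rows.

    Each row corresponds to one input feature; p = 1 gives the l_{2,1} norm.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    row_norms = [math.sqrt(sum(x * x for x in row)) for row in W]
    return sum(r ** p for r in row_norms) ** (1.0 / p)

def selected_features(W, tol=1e-8):
    """Indices of rows whose l2 norm exceeds tol (hypothetical selection rule)."""
    return [i for i, row in enumerate(W)
            if math.sqrt(sum(x * x for x in row)) > tol]

# Toy transformation matrix: 3 features, 2 projection directions.
# The all-zero second row means feature 1 contributes to no direction.
W = [[3.0, 4.0],
     [0.0, 0.0],
     [0.0, 1.0]]

print(l2p_norm(W, 1))        # sum of row norms
print(l2p_norm(W, 0.5))      # p < 1 penalizes nonzero rows more aggressively
print(selected_features(W))  # rows that survive, i.e. the kept features
```

Because the penalty acts on whole rows, driving a row to zero discards that feature from every projection direction at once, which is what turns a transformation method into a feature selector.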

    Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs

    Laplacian mixture models identify overlapping regions of influence in unlabeled graph and network data in a scalable and computationally efficient way, yielding useful low-dimensional representations. By combining Laplacian eigenspace and finite mixture modeling methods, they provide probabilistic or fuzzy dimensionality reductions or domain decompositions for a variety of input data types, including mixture distributions, feature vectors, and graphs or networks. Provable optimal recovery using the algorithm is analytically shown for a nontrivial class of cluster graphs. Heuristic approximations for scalable high-performance implementations are described and empirically tested. Connections to PageRank and community detection in network analysis demonstrate the wide applicability of this approach. The origins of fuzzy spectral methods, beginning with generalized heat or diffusion equations in physics, are reviewed and summarized. Comparisons to other dimensionality reduction and clustering methods for challenging unsupervised machine learning problems are also discussed. Comment: 13 figures, 35 references

    Contribution to Graph-based Manifold Learning with Application to Image Categorization.

    122 p. Graph-based manifold learning algorithms have proven to be powerful tools for feature extraction and dimensionality reduction in the fields of pattern recognition, computer vision, and machine learning. These algorithms use information based on the similarities between pairs of samples, and the resulting weighted graph, to reveal the intrinsic geometric structure of the manifold.
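
To make the pairwise-similarity graph mentioned in the abstract above concrete, here is a minimal sketch that builds a weighted graph from samples and forms its unnormalized Laplacian $L = D - W$. The Gaussian kernel, the bandwidth `sigma`, the function names, and the toy points are all assumptions chosen for illustration; the abstract does not say which similarity measure or graph construction the thesis uses.

```python
import math

def gaussian_weights(X, sigma=1.0):
    """Symmetric weight matrix with W[i][j] = exp(-||x_i - x_j||^2 / (2 sigma^2)).

    The Gaussian kernel is an assumed similarity; the diagonal is left at zero.
    """
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W

def laplacian(W):
    """Unnormalized graph Laplacian L = D - W, with D the diagonal degree matrix."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

# Three toy samples: the first two are close together, the third is far away.
X = [[0.0, 0.0], [0.1, 0.0], [3.0, 0.0]]
W = gaussian_weights(X)
L = laplacian(W)

for row in L:
    print(row)
```

Nearby samples receive weights close to 1 and distant ones weights close to 0, so the graph encodes local neighborhood structure. Spectral embedding methods typically work with the eigenvectors of a Laplacian like `L`.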