Pathway-Based Genomics Prediction using Generalized Elastic Net.
We present a novel regularization scheme called the Generalized Elastic Net (GELnet) that incorporates gene pathway information into feature selection. The proposed formulation is applicable to a wide variety of problems in which the interpretation of predictive features using known molecular interactions is desired. The method naturally steers solutions toward sets of mechanistically interlinked genes. Using experiments on synthetic data, we demonstrate that pathway-guided results maintain, and often improve, the accuracy of predictors even in cases where the full gene network is unknown. We apply the method to predict the drug response of breast cancer cell lines. GELnet is able to reveal genetic determinants of sensitivity and resistance for several compounds. In particular, for an EGFR/HER2 inhibitor, it finds a possible trans-differentiation resistance mechanism missed by the corresponding pathway-agnostic approach.
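The abstract does not state the penalty itself. As a rough illustration only, assuming a network-regularized elastic net of the form λ1 Σ_j d_j |w_j| + (λ2/2) wᵀPw, with P taken here to be the graph Laplacian of a toy pathway adjacency matrix, the Python sketch below shows how such a penalty can favor weight vectors that agree on linked genes. The function name, the per-feature weights `d`, and the toy graph are hypothetical and are not taken from the paper.

```python
import numpy as np

def network_penalty(w, A, lam1=1.0, lam2=1.0, d=None):
    """Assumed penalty: lam1 * sum_j d_j |w_j| + (lam2 / 2) * w^T L w,
    where L = D - A is the graph Laplacian of adjacency matrix A."""
    w = np.asarray(w, dtype=float)
    A = np.asarray(A, dtype=float)
    d = np.ones_like(w) if d is None else np.asarray(d, dtype=float)
    L = np.diag(A.sum(axis=1)) - A              # unnormalized graph Laplacian
    l1_term = lam1 * np.sum(d * np.abs(w))      # sparsity-inducing part
    quad_term = 0.5 * lam2 * (w @ L @ w)        # penalizes disagreement across edges
    return float(l1_term + quad_term)

# Toy network: genes 0 and 1 interact, gene 2 is isolated.
A = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]])

linked = network_penalty([1, 1, 0], A)    # weight placed on the two linked genes
unlinked = network_penalty([1, 0, 1], A)  # same L1 mass, split across unlinked genes
```

Both weight vectors have the same L1 cost, so any difference between `linked` and `unlinked` comes from the quadratic network term, which is zero when linked genes carry equal weights.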
Effective Discriminative Feature Selection with Non-trivial Solutions
Feature selection and feature transformation, the two main ways to reduce
dimensionality, are often presented separately. In this paper, a feature
selection method is proposed by combining the popular transformation based
dimensionality reduction method Linear Discriminant Analysis (LDA) and sparsity
regularization. We impose row sparsity on the transformation matrix of LDA
through $\ell_{2,1}$-norm regularization to achieve feature selection, and
the resultant formulation optimizes for selecting the most discriminative
features and removing the redundant ones simultaneously. The formulation is
extended to the $\ell_{2,p}$-norm regularized case, which is more likely to
offer better sparsity when $0 < p < 1$. Thus the formulation is a better
approximation to the feature selection problem. An efficient algorithm is
developed to solve the $\ell_{2,p}$-norm based optimization problem and it is
proved that the algorithm converges when $0 < p \le 2$. Systematic experiments
are conducted to understand how the proposed method works. Promising
experimental results on various types of real-world data sets demonstrate the
effectiveness of our algorithm.
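To make the row-sparsity idea concrete, the sketch below computes a row-wise mixed norm of a transformation matrix, assuming the common definition (Σ_i ‖w_i‖_2^p)^(1/p) over rows w_i, and ranks features by the l2 norm of their rows. It illustrates the regularizer and the selection step only; it is not the paper's optimization algorithm, and the function names and toy matrix are hypothetical.

```python
import numpy as np

def l2p_norm(W, p=1.0):
    """Row-wise mixed norm: (sum_i ||w_i||_2^p)^(1/p), with w_i the i-th row of W."""
    row_norms = np.linalg.norm(W, axis=1)
    return float(np.sum(row_norms ** p) ** (1.0 / p))

def select_features(W, k):
    """Return indices of the k features whose rows in W have the largest l2 norm."""
    row_norms = np.linalg.norm(W, axis=1)
    return np.argsort(-row_norms)[:k]

# Toy transformation matrix: 3 features (rows) projected to 2 dimensions.
W = np.array([[3.0, 4.0],   # row norm 5
              [0.0, 0.0],   # row norm 0: this feature is effectively discarded
              [1.0, 0.0]])  # row norm 1

norm_21 = l2p_norm(W, p=1.0)     # p = 1 gives the sum of row norms
top2 = select_features(W, k=2)   # features kept after ranking
```

A row that is entirely zero contributes nothing to the projection, which is why driving whole rows to zero amounts to dropping the corresponding features.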
Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs
Laplacian mixture models identify overlapping regions of influence in
unlabeled graph and network data in a scalable and computationally efficient
way, yielding useful low-dimensional representations. By combining Laplacian
eigenspace and finite mixture modeling methods, they provide probabilistic or
fuzzy dimensionality reductions or domain decompositions for a variety of input
data types, including mixture distributions, feature vectors, and graphs or
networks. Provable optimal recovery using the algorithm is analytically shown
for a nontrivial class of cluster graphs. Heuristic approximations for scalable
high-performance implementations are described and empirically tested.
Connections to PageRank and community detection in network analysis demonstrate
the wide applicability of this approach. The origins of fuzzy spectral methods,
beginning with generalized heat or diffusion equations in physics, are reviewed
and summarized. Comparisons to other dimensionality reduction and clustering
methods for challenging unsupervised machine learning problems are also
discussed. Comment: 13 figures, 35 references
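As a minimal sketch of the Laplacian eigenspace ingredient mentioned above, the code below builds the unnormalized Laplacian of a small toy graph and uses the eigenvector of its second-smallest eigenvalue to split the nodes into two groups. The mixture-modeling step described in the abstract is not reproduced; the hard sign split stands in for it purely for illustration, and the graph and function name are hypothetical.

```python
import numpy as np

def laplacian_embedding(A, dim=1):
    """Return the `dim` eigenvectors of L = D - A that follow the constant one."""
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return eigvecs[:, 1:1 + dim]           # skip the eigenvector for eigenvalue 0

# Two triangles (nodes 0-2 and 3-5) joined by a single edge between nodes 2 and 3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

embedding = laplacian_embedding(A, dim=1)[:, 0]
labels = embedding > 0                     # hard split by sign, for illustration only
```

In a soft or fuzzy variant, the embedding coordinates would feed a mixture model that assigns each node a probability of belonging to each region rather than a single label.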
Contribution to Graph-based Manifold Learning with Application to Image Categorization.
122 p. Graph-based manifold learning algorithms are techniques that have proven to be powerful tools for feature extraction and dimensionality reduction in the fields of pattern recognition, computer vision, and machine learning. These algorithms use information based on the pairwise similarities of samples and the resulting weighted graph to reveal the intrinsic geometric structure of the manifold
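One common way to obtain such a weighted graph from pairwise similarities is a Gaussian (heat kernel) affinity. The sketch below assumes that choice; the thesis may use a different construction, and the function name and `sigma` parameter are hypothetical.

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Weighted adjacency matrix with W[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared pairwise distances
    d2 = np.maximum(d2, 0.0)                          # clip tiny negative round-off
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                          # no self-loops
    return W

# Three samples: the first two are close, the third is far away.
X = np.array([[0.0, 0.0],
              [0.0, 1.0],
              [5.0, 5.0]])
W = gaussian_affinity(X, sigma=1.0)
```

Nearby samples receive weights close to 1 and distant samples weights close to 0, so the graph encodes local neighborhood structure.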
- …