An introduction to spectral distances in networks (extended version)
Many functions have been recently defined to assess the similarity among
networks as tools for quantitative comparison. They stem from very different
frameworks - and they are tuned for dealing with different situations. Here we
show an overview of the spectral distances, highlighting their behavior in some
basic cases of static and dynamic synthetic and real networks.
Metric learning pairwise kernel for graph inference
Much recent work in bioinformatics has focused on the inference of various
types of biological networks, representing gene regulation, metabolic
processes, protein-protein interactions, etc. A common setting involves
inferring network edges in a supervised fashion from a set of high-confidence
edges, possibly characterized by multiple, heterogeneous data sets (protein
sequence, gene expression, etc.). Here, we distinguish between two modes of
inference in this setting: direct inference based upon similarities between
nodes joined by an edge, and indirect inference based upon similarities between
one pair of nodes and another pair of nodes. We propose a supervised approach
for the direct case by translating it into a distance metric learning problem.
A relaxation of the resulting convex optimization problem leads to the support
vector machine (SVM) algorithm with a particular kernel for pairs, which we
call the metric learning pairwise kernel (MLPK). We demonstrate, using several
real biological networks, that this direct approach often improves upon the
state-of-the-art SVM for indirect inference with the tensor product pairwise
kernel.
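The abstract above names two pairwise kernels but does not give their formulas. The sketch below uses the forms commonly cited for them and should be read as an assumption, not as the authors' implementation: MLPK((a,b),(c,d)) = (K(a,c) - K(a,d) - K(b,c) + K(b,d))^2 and TPPK((a,b),(c,d)) = K(a,c)K(b,d) + K(a,d)K(b,c). The function names and the toy feature vectors are hypothetical.

```python
def linear_kernel(x, y):
    """Base kernel on single nodes: plain dot product of feature vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def mlpk(pair1, pair2, base=linear_kernel):
    """Assumed form of the metric learning pairwise kernel (direct inference).

    With a linear base kernel this equals ((a - b) . (c - d)) ** 2, i.e. it
    compares the difference vectors of the two node pairs.
    """
    a, b = pair1
    c, d = pair2
    return (base(a, c) - base(a, d) - base(b, c) + base(b, d)) ** 2


def tppk(pair1, pair2, base=linear_kernel):
    """Assumed form of the tensor product pairwise kernel (indirect inference)."""
    a, b = pair1
    c, d = pair2
    return base(a, c) * base(b, d) + base(a, d) * base(b, c)


# Toy feature vectors for four nodes (illustrative values only).
a, b = (1.0, 0.0), (0.0, 1.0)
c, d = (2.0, 0.0), (0.0, 0.0)

mlpk_value = mlpk((a, b), (c, d))
tppk_value = tppk((a, b), (c, d))
```

Both functions are symmetric under swapping the two nodes within a pair, which is what an undirected edge requires. Either one could be evaluated over all training pairs to build a precomputed Gram matrix for an SVM.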
Benchmarking network propagation methods for disease gene identification
In-silico identification of potential target genes for disease is an essential aspect of drug target discovery. Recent studies suggest that successful targets can be found by leveraging genetic, genomic and protein interaction information. Here, we systematically tested the ability of 12 varied algorithms, based on network propagation, to identify genes that have been targeted by any drug, on gene-disease data from 22 common non-cancerous diseases in OpenTargets. We considered two biological networks and six performance metrics, and compared two types of input gene-disease association scores. The impact of the design factors on performance was quantified through additive explanatory models. Standard cross-validation led to over-optimistic performance estimates due to the presence of protein complexes. In order to obtain realistic estimates, we introduced two novel protein complex-aware cross-validation schemes. When seeding biological networks with known drug targets, machine learning and diffusion-based methods found around 2-4 true targets within the top 20 suggestions. Seeding the networks with genes associated with disease by genetics decreased performance below 1 true hit on average. The use of a larger network, although noisier, improved overall performance. We conclude that diffusion-based prioritisers and machine learning applied to diffusion-based features are suited for drug discovery in practice and improve over simpler neighbour-voting methods. We also demonstrate the large impact of choosing an adequate validation strategy and the definition of seed disease genes.
Peer Reviewed. Postprint (published version)
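The abstract above does not name its 12 algorithms. As an illustration of what "network propagation" from seed genes means, here is a minimal sketch of one widely used diffusion scheme, random walk with restart, in pure Python. The choice of method, the function name, the restart probability and the toy network are all assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Propagate seed labels over an undirected network by power iteration.

    adjacency: dict mapping each node to a list of its neighbours.
    seeds: nodes where the walker restarts (assumed to be keys of adjacency).
    Returns a dict of stationary visiting probabilities, used as gene scores.
    """
    nodes = list(adjacency)
    seed_set = set(seeds)
    if not seed_set:
        raise ValueError("at least one seed node is required")
    start = {u: (1.0 / len(seed_set) if u in seed_set else 0.0) for u in nodes}
    scores = dict(start)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        updated = {u: restart * start[u] for u in nodes}
        for u in nodes:
            walk_mass = (1.0 - restart) * scores[u]
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: keep its mass in place so the total stays 1.
                updated[u] += walk_mass
                continue
            share = walk_mass / len(neighbours)
            for v in neighbours:
                updated[v] += share
        change = sum(abs(updated[u] - scores[u]) for u in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Toy path network A - B - C - D with a single seed gene A (hypothetical).
network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
seeds = ["A"]
scores = random_walk_with_restart(network, seeds)

# Rank the non-seed genes by propagated score, highest first.
ranking = sorted((u for u in network if u not in seeds), key=scores.get, reverse=True)
```

Genes closer to the seed receive higher scores, so the top of `ranking` plays the role of the "top suggestions" that a benchmark of this kind would evaluate.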
Pathway-Based Genomics Prediction using Generalized Elastic Net.
We present a novel regularization scheme called the Generalized Elastic Net (GELnet) that incorporates gene pathway information into feature selection. The proposed formulation is applicable to a wide variety of problems in which the interpretation of predictive features using known molecular interactions is desired. The method naturally steers solutions toward sets of mechanistically interlinked genes. Using experiments on synthetic data, we demonstrate that pathway-guided results maintain, and often improve, the accuracy of predictors even in cases where the full gene network is unknown. We apply the method to predict the drug response of breast cancer cell lines. GELnet is able to reveal genetic determinants of sensitivity and resistance for several compounds. In particular, for an EGFR/HER2 inhibitor, it finds a possible trans-differentiation resistance mechanism missed by the corresponding pathway-agnostic approach.
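The abstract above does not state the penalty itself. One way a network-aware generalization of the elastic net can be written, assumed here purely for illustration, is lambda1 * sum_j d_j |w_j| + (lambda2 / 2) * w^T P w, where d holds per-gene weights and P is a gene-gene penalty matrix such as a graph Laplacian. The sketch below evaluates that assumed penalty on a toy three-gene pathway; all names and values are hypothetical.

```python
def network_elastic_net_penalty(w, d, P, lambda1, lambda2):
    """Assumed penalty: lambda1 * sum_j d[j]*|w[j]| + (lambda2/2) * w^T P w."""
    n = len(w)
    l1_term = sum(d[j] * abs(w[j]) for j in range(n))
    quadratic_term = sum(w[i] * P[i][j] * w[j] for i in range(n) for j in range(n))
    return lambda1 * l1_term + 0.5 * lambda2 * quadratic_term


# Graph Laplacian of a toy pathway g0 - g1 - g2 (degree matrix minus adjacency).
laplacian = [
    [1.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 1.0],
]
gene_weights = [1.0, 1.0, 1.0]  # d: uniform per-gene L1 weights

# Two coefficient vectors with the same L1 norm but different smoothness.
linked = [1.0, 1.0, 0.0]     # nonzero coefficients on adjacent genes g0, g1
scattered = [1.0, 0.0, 1.0]  # nonzero coefficients on non-adjacent genes g0, g2

penalty_linked = network_elastic_net_penalty(linked, gene_weights, laplacian, 0.1, 1.0)
penalty_scattered = network_elastic_net_penalty(scattered, gene_weights, laplacian, 0.1, 1.0)
```

With a Laplacian, the quadratic term equals the sum over edges of squared coefficient differences, so coefficients placed on interlinked genes are penalised less than the same coefficients scattered across the network.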
Inferring a Transcriptional Regulatory Network from Gene Expression Data Using Nonlinear Manifold Embedding
Transcriptional networks consist of multiple regulatory layers corresponding to the activity of global regulators, specialized repressors and activators of transcription, as well as proteins and enzymes shaping the DNA template. Such intrinsic multi-dimensionality makes uncovering connectivity patterns difficult and unreliable, and it calls for the adoption of methodologies commensurate with the underlying organization of the data source. Here we present a new computational method that predicts interactions between transcription factors and target genes using a compendium of microarray gene expression data and known interactions between genes and transcription factors. The proposed method, called Kernel Embedding of REgulatory Networks (KEREN), is based on the concept of gene-regulon association and captures hidden geometric patterns of the network via manifold embedding. We applied KEREN to reconstruct gene regulatory interactions in the model bacterium E. coli on a genome-wide scale. Our method not only yields accurate predictions of verifiable interactions, outperforming comparable methodologies on certain metrics, but also demonstrates the utility of a geometric approach to the analysis of high-dimensional biological data. We also describe the general application of kernel embedding techniques to some other function and network discovery algorithms.
Revisiting Date and Party Hubs: Novel Approaches to Role Assignment in Protein Interaction Networks
The idea of 'date' and 'party' hubs has been influential in the study of
protein-protein interaction networks. Date hubs display low co-expression with
their partners, whilst party hubs have high co-expression. It was proposed that
party hubs are local coordinators whereas date hubs are global connectors. Here
we show that the reported importance of date hubs to network connectivity can
in fact be attributed to a tiny subset of them. Crucially, these few, extremely
central, hubs do not display particularly low expression correlation,
undermining the idea of a link between this quantity and hub function. The
date/party distinction was originally motivated by an approximately bimodal
distribution of hub co-expression; we show that this feature is not always
robust to methodological changes. Additionally, topological properties of hubs
do not in general correlate with co-expression. Thus, we suggest that a
date/party dichotomy is not meaningful and it might be more useful to conceive
of roles for protein-protein interactions rather than individual proteins. We
find significant correlations between interaction centrality and the functional
similarity of the interacting proteins.
Comment: 27 pages, 5 main figures, 4 supplementary figures
Extracting Biomolecular Interactions Using Semantic Parsing of Biomedical Text
We advance the state of the art in biomolecular interaction extraction with
three contributions: (i) We show that deep, Abstract Meaning Representations
(AMR) significantly improve the accuracy of a biomolecular interaction
extraction system when compared to a baseline that relies solely on surface-
and syntax-based features; (ii) In contrast with previous approaches that infer
relations on a sentence-by-sentence basis, we expand our framework to enable
consistent predictions over sets of sentences (documents); (iii) We further
modify and expand a graph kernel learning framework to enable concurrent
exploitation of automatically induced AMR (semantic) and dependency structure
(syntactic) representations. Our experiments show that our approach yields
interaction extraction systems that are more robust in environments where there
is a significant mismatch between training and test conditions.
Comment: Appearing in Proceedings of the Thirtieth AAAI Conference on
Artificial Intelligence (AAAI-16)