Incorporating diverse data to improve genetic network alignment with IsoRank

Author: Eisner Eric David
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2011

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 26). To more accurately predict which genes from different species have the same function (orthologs), I extend the network-alignment algorithm IsoRank to simultaneously align multiple unrelated networks over the same set of nodes. In addition to the original protein-interaction networks, I align genetic-interaction networks, gene-expression correlations, and chromosome localization data to improve the functional similarity of aligned genes. Alignments are evaluated with consistency measurements of protein function within ortholog clusters, and with an information-retrieval statistic from a small set of known orthologs. Integrating these additional types of data is shown to improve IsoRank's predictions of classes of genes that have sparse coverage in the original protein-interaction networks. By Eric David Eisner. M.Eng.

DSpace@MIT

Functionally guided alignment of protein interaction networks for module detection

Author: Ali Waqar
Deane Charlotte M.
Publication venue: Oxford University Press
Publication date: 01/01/2009

Motivation: Functional module detection within protein interaction networks is a challenging problem due to the sparsity of data and presence of errors. Computational techniques for this task range from purely graph theoretical approaches involving single networks to alignment of multiple networks from several species. Current network alignment methods all rely on protein sequence similarity to map proteins across species.

PubMed Central

Oxford University Research Archive

IsoBase: a database of functionally related proteins across PPI networks

Author: B. Berger
C.-S. Liao
Chen
D. Park
Koonin
M. Baym
O'Brien
R. Singh
Salwinski
Sharan
Tatusov
Publication venue: Oxford University Press (OUP)
Publication date: 01/10/2010

We describe IsoBase, a database identifying functionally related proteins, across five major eukaryotic model organisms: Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Mus musculus and Homo sapiens. Nearly all existing algorithms for orthology detection are based on sequence comparison. Although these have been successful in orthology prediction to some extent, we seek to go beyond these methods by the integration of sequence data and protein–protein interaction (PPI) networks to help in identifying true functionally related proteins. With that motivation, we introduce IsoBase, the first publicly available ortholog database that focuses on functionally related proteins. The groupings were computed using the IsoRankN algorithm that uses spectral methods to combine sequence and PPI data and produce clusters of functionally related proteins. These clusters compare favorably with those from existing approaches: proteins within an IsoBase cluster are more likely to share similar Gene Ontology (GO) annotation. A total of 48 120 proteins were clustered into 12 693 functionally related groups. The IsoBase database may be browsed for functionally related proteins across two or more species and may also be queried by accession numbers, species-specific identifiers, gene name or keyword. The database is freely available for download at http://isobase.csail.mit.edu/. National Institute of General Medical Sciences (U.S.) (Grant Number 1R01GM081871); Fannie and John Hertz Foundation; National Science Foundation (U.S.) (NSF MSPRF); National Science Council of Taiwan (NSC99-2218-E-007-010); National Institutes of Health (U.S.) (1R01GM081871

A multi-species functional embedding integrating sequence and network structure

Author: Cannistra Anthony
Crovella Mark
Fan Jason
Fried Inbar
Hescott Benjamin
Leiserson Mark D. M.
Lim Tim
Schaffner Thomas
Publication venue: Cold Spring Harbor Laboratory
Publication date: 01/04/2018

A key challenge to transferring knowledge between species is that different species have fundamentally different genetic architectures. Initial computational approaches to transfer knowledge across species have relied on measures of heredity such as genetic homology, but these approaches suffer from limitations. First, only a small subset of genes have homologs, limiting the amount of knowledge that can be transferred, and second, genes change or repurpose functions, complicating the transfer of knowledge. Many approaches address this problem by expanding the notion of homology by leveraging high-throughput genomic and proteomic measurements, such as through network alignment. In this work, we take a new approach to transferring knowledge across species by expanding the notion of homology through explicit measures of functional similarity between proteins in different species. Specifically, our kernel-based method, HANDL (Homology Assessment across Networks using Diffusion and Landmarks), integrates sequence and network structure to create a functional embedding in which proteins from different species are embedded in the same vector space. We show that inner products in this space and the vectors themselves capture functional similarity across species, and are useful for a variety of functional tasks. We perform the first whole-genome method for predicting phenologs, generating many that were previously identified, but also predicting new phenologs supported by the biological literature. We also demonstrate that the HANDL embedding captures pairwise gene function, in that gene pairs with synthetic lethal interactions are significantly separated in HANDL space, and the direction of separation is conserved across species. Software for the HANDL algorithm is available at http://bit.ly/lrgr-handl. Published version.

Boston University Institutional Repository (OpenBU)

OrthoClust: An Orthology-Based Network Framework for Clustering Data Across Multiple Species

Author: Cheng Chao
Gerstein Mark
Rozowsky Joel
Wang Daifeng
Yan Koon-Kiu
Zheng Henry
Publication venue: Dartmouth Digital Commons
Publication date: 01/01/2014

Increasingly, high-dimensional genomics data are becoming available for many organisms. Here, we develop OrthoClust for simultaneously clustering data across multiple species. OrthoClust is a computational framework that integrates the co-association networks of individual species by utilizing the orthology relationships of genes between species. It outputs optimized modules that are fundamentally cross-species, which can either be conserved or species-specific. We demonstrate the application of OrthoClust using the RNA-Seq expression profiles of Caenorhabditis elegans and Drosophila melanogaster from the modENCODE consortium. A potential application of cross-species modules is to infer putative analogous functions of uncharacterized elements like non-coding RNAs based on guilt-by-association.

Springer - Publisher Connector

PubMed Central

Dartmouth Digital Commons (Dartmouth College)

OrthoClust: An Orthology-Based Network Framework for Clustering Data Across Multiple Species

Author: Cheng Chao
Gerstein Mark
Rozowsky Joel
Wang Daifeng
Yan Koon-Kiu
Zheng Henry
Publication venue: Dartmouth Digital Commons
Publication date: 28/08/2014

Dartmouth Digital Commons (Dartmouth College)