34 research outputs found

Mining substructures in protein data

Author: Chang Elizabeth
Dillon Tharam S.
Hadzic Fedja
Sidhu Amandeep
Tan H.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

In this paper we consider the 'Prions' database that describes protein instances stored for Human Prion Proteins. The Prions database can be viewed as a database of rooted ordered labeled subtrees. Mining frequent substructures from tree databases is an important task and it has gained a considerable amount of interest in areas such as XML mining, Bioinformatics, Web mining, etc. This has given rise to the development of many tree mining algorithms which can aid in structural comparisons, association rule discovery and, in general, mining of tree structured knowledge representations. Previously we have developed the MB3 tree mining algorithm, which, given a minimum support threshold, efficiently discovers all frequent embedded subtrees from a database of rooted ordered labeled subtrees. In this work we apply the algorithm to the Prions database in order to extract the frequently occurring patterns, which in this case are of induced subtree type. Obtaining the set of frequent induced subtrees from the Prions database can potentially reveal some useful knowledge. This aspect will be demonstrated by providing an analysis of the extracted frequent subtrees with respect to discovering interesting protein information. Furthermore, the minimum support threshold can be used as the controlling factor for answering specific queries posed on the Prions dataset. This approach is shown to be a viable technique for mining protein data.

CiteSeerX

espace@Curtin

Algebraic comparison of metabolic networks, phylogenetic inference, and metabolic innovation

Author: Flamm Christoph
Forst Christian V.
Hofacker Ivo L.
Stadler Peter F.
Publication date: 14/12/2018

Metabolic networks are naturally represented as directed hypergraphs in such a way that metabolites are nodes and enzyme-catalyzed reactions form (hyper)edges. The familiar operations from set algebra (union, intersection, and difference) form a natural basis for both the pairwise comparison of networks and identification of distinct metabolic features of a set of organisms. We report here on an implementation of this approach and its application to the procaryotes. We demonstrate that metabolic networks contain valuable phylogenetic information by comparing phylogenies obtained from network comparisons with 16S RNA phylogenies. We then used the same software to study metabolic innovations in two sets of organisms, free living microbes and Pyrococci, as well as obligate intracellular pathogens.

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Qucosa - Publikationsserver der Universität Leipzig

Topological network alignment uncovers biological function and phylogeny

Author: Cook S.
Flannick J.
Kuchaiev O.
Memišević V.
Nataša Pržulj
Oleksii Kuchaiev
Pržulj N.
Singh R.
Snijders T. A.
Tijana Milenković
Vesna Memišević
Wayne Hayes
Wentz-Hunter K.
Zhang Y.
Publication date: 07/10/2009

Sequence comparison and alignment has had an enormous impact on our understanding of evolution, biology, and disease. Comparison and alignment of biological networks will likely have a similar impact. Existing network alignments use information external to the networks, such as sequence, because no good algorithm for purely topological alignment has yet been devised. In this paper, we present a novel algorithm based solely on network topology that can be used to align any two networks. We apply it to biological networks to produce by far the most complete topological alignments of biological networks to date. We demonstrate that both species phylogeny and detailed biological function of individual proteins can be extracted from our alignments. Topology-based alignments have the potential to provide a completely new, independent source of phylogenetic information. Our alignment of the protein-protein interaction networks of two very different species--yeast and human--indicates that even distant species share a surprising amount of network topology with each other, suggesting broad similarities in internal cellular wiring across all life on Earth.

arXiv.org e-Print Archive

Crossref

PubMed Central

UCL Discovery

Algebraic comparison of metabolic networks, phylogenetic inference, and metabolic innovation

Author: Flamm Christoph
Forst Christian V
Hofacker Ivo L
Stadler Peter F
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Comparison of metabolic networks is typically performed based on the organisms' enzyme contents. This approach disregards functional replacements as well as orthologies that are misannotated. Direct comparison of the structure of metabolic networks can circumvent these problems. RESULTS: Metabolic networks are naturally represented as directed hypergraphs in such a way that metabolites are nodes and enzyme-catalyzed reactions form (hyper)edges. The familiar operations from set algebra (union, intersection, and difference) form a natural basis for both the pairwise comparison of networks and identification of distinct metabolic features of a set of organisms. We report here on an implementation of this approach and its application to the procaryotes. CONCLUSION: We demonstrate that metabolic networks contain valuable phylogenetic information by comparing phylogenies obtained from network comparisons with 16S RNA phylogenies. The algebraic approach to metabolic networks is suitable to study metabolic innovations in two sets of organisms, free living microbes and Pyrococci, as well as obligate intracellular pathogens.
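
The set-algebra view described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `reaction` helper, the two toy networks, and the metabolite names are hypothetical, chosen only to show how a directed hyperedge can be stored as a pair of metabolite sets so that ordinary set operations compare whole networks.

```python
def reaction(substrates, products):
    """Model one reaction as a directed hyperedge: (substrate set, product set)."""
    return (frozenset(substrates), frozenset(products))


def metabolites(network):
    """Collect every metabolite (node) touched by any hyperedge in the network."""
    nodes = set()
    for substrates, products in network:
        nodes |= substrates | products
    return nodes


# Two hypothetical toy networks; each network is a set of hyperedges.
net_a = {
    reaction(["glucose", "ATP"], ["G6P", "ADP"]),
    reaction(["G6P"], ["F6P"]),
}
net_b = {
    reaction(["G6P"], ["F6P"]),
    reaction(["F6P", "ATP"], ["FBP", "ADP"]),
}

shared = net_a & net_b    # intersection: reactions present in both networks
only_a = net_a - net_b    # difference: reactions specific to network A
combined = net_a | net_b  # union: the merged network
```

Because hyperedges are hashable tuples of frozensets, two reactions compare equal exactly when their substrate and product sets match, which is what makes the plain set operators usable here.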

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Reconstruction of phyletic trees by global alignment of multiple metabolic networks

Author: Berger Bonnie
Lee Chi-Ching
Liao Chung-Shou
Lin Shu-Hsi
Ma Cheng-Yu
Tang Chuan Yi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Background: In the last decade, a considerable amount of research has been devoted to investigating the phylogenetic properties of organisms from a systems-level perspective. Most studies have focused on the classification of organisms based on structural comparison and local alignment of metabolic pathways. In contrast, global alignment of multiple metabolic networks complements sequence-based phylogenetic analyses and provides more comprehensive information. Results: We explored the phylogenetic relationships between microorganisms through global alignment of multiple metabolic networks. The proposed approach integrates sequence homology data with topological information of metabolic networks. In general, compared to recent studies, the resulting trees reflect the living style of organisms as well as classical taxa. Moreover, for phylogenetically closely related organisms, the classification results are consistent with specific metabolic characteristics, such as the light-harvesting systems, fermentation types, and sources of electrons in photosynthesis. Conclusions: We demonstrate the usefulness of global alignment of multiple metabolic networks to infer phylogenetic relationships between species. In addition, our exhaustive analysis of microbial metabolic pathways reveals differences in metabolic features between phylogenetically closely related organisms. With the ongoing increase in the number of genomic sequences and metabolic annotations, the proposed approach will help identify phenotypic variations that may not be apparent based solely on sequence-based classification. National Institutes of Health (U.S.) (Grant GM081871

DSpace@MIT

Crossref

Springer - Publisher Connector

PubMed Central

Software Verification and Graph Similarity for Automated Evaluation of Students' Assignments

Author: Kuncak Viktor
Nikolic Mladen
Tosic Dusan
Vujosevic-Janicic Milena
Publication date: 29/06/2012

In this paper we promote introducing software verification and control flow graph similarity measurement in automated evaluation of students' programs. We present a new grading framework that merges results obtained by combining these two approaches with results obtained by automated testing, leading to improved quality and precision of automated grading. These two approaches are also useful in providing comprehensible feedback that can help students to improve the quality of their programs. We also present our corresponding tools that are publicly available and open source. The tools are based on LLVM low-level intermediate code representation, so they could be applied to a number of programming languages. Experimental evaluation of the proposed grading framework is performed on a corpus of university students' programs written in programming language C. Results of the experiments show that automatically generated grades are highly correlated with manually determined grades, suggesting that the presented tools can find real-world applications in studying and grading.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Optimized ancestral state reconstruction using Sankoff parsimony

Author: A Mazurie
AWF Edwards
B Kolaczkowski
BA Malcolm
BS Gaut
CV Forst
D Sankoff
DA Shagin
DE Knuth
DL Swofford
DS Gladstein
F Ronquist
Gabriel Valiente
H Akashi
HW Ma
J Felsenstein
J Ma
J Wang
J Zhang
JC Clemente
José C Clemente
JP Huelsenbeck
JT Bridgham
JW Thornton
K Fan
Kazuho Ikeo
LR Murphy
M Cieplak
M Heymans
M Kanehisa
M Kimura
MK Kuhner
MS Waterman
N Saitou
NB Adey
NM Krishnan
PA Goloboff
PHA Sneath
RF Smith
T Tanaka
Takashi Gojobori
TH Jukes
WC Liu
WC Wheeler
WM Fitch
Y Inagaki
Y Tohsato
Z Jiang
ZS Yang
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Parsimony methods are widely used in molecular evolution to estimate the most plausible phylogeny for a set of characters. Sankoff parsimony determines the minimum number of changes required in a given phylogeny when a cost is associated with transitions between character states. Although optimizations exist to reduce the computations in the number of taxa, the original algorithm takes time O(n^2) in the number of states, making it impractical for large values of n. Results: In this study we introduce an optimization of Sankoff parsimony for the reconstruction of ancestral states when ultrametric or additive cost matrices are used. We analyzed its performance for randomly generated matrices, Jukes-Cantor and Kimura's two-parameter models of DNA evolution, and in the reconstruction of elongation factor-1α and ancestral metabolic states of a group of eukaryotes, showing that in all cases the execution time is significantly less than with the original implementation. Conclusion: The algorithms presented here provide a fast computation of Sankoff parsimony for a given phylogeny. Problems where the number of states is large, such as reconstruction of ancestral metabolism, are particularly well suited to this optimization. Since we are reducing the computations required to calculate the parsimony cost of a single tree, our method can be combined with optimizations in the number of taxa that aim at finding the most parsimonious tree.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Comparative classification of species and the study of pathway evolution based on the alignment of metabolic pathways

Author: Béjà Oded
Mano Adi
Pinter Ron Y
Tuller Tamir
Publication venue: BioMed Central
Publication date: 01/01/2010

Springer - Publisher Connector

PubMed Central