
FigShare

IgTM: An algorithm to predict transmembrane domains and topology in proteins

Author: B Mathews
C Pasquier
D Angluin
D Angluin
D Lopez
D Lopez
Damián López
DB Searls
DT Jones
E Wallin
EE Pashou
ELL Sonnhammer
EM Gold
GE Tusnády
H Viklund
J Berstel
JE Hopcroft
JM Sempere
L Käll
LR Murphy
M Burset
M Ikeda
M Punta
Marcelino Campos
MM Gromiha
NS Sadovskaya
P Fariselli
P García
P Peris
PG Bagos
Piedachu Peris
R B
S Jayasinghe
S Mitaku
S Möller
T Knuutila
T Li
T Yokomori
T Yokomori
Publication venue: BioMed Central
Publication date: 01/09/2008

Abstract. Background: Because they act as receptors or transporters, membrane proteins play a key role in many important biological functions. In our work we used Grammatical Inference (GI) to localize transmembrane segments. Our GI process is based specifically on the inference of Even Linear Languages. Results: We obtained values close to 80% in both specificity and sensitivity. Six datasets were used for the experiments, considering different encodings for the input sequences. An encoding that includes the topology changes in the sequence (from inside or outside the membrane into it, and vice versa) gave the best results. This software is publicly available at: http://www.dsic.upv.es/users/tlcc/bio/bio.html Conclusion: We compared our results with other well-known methods, which obtain slightly better precision. Nevertheless, this work shows that Grammatical Inference techniques can be applied effectively to bioinformatics problems.

Springer - Publisher Connector

Modeling Structure-Function Relationships in Synthetic DNA Sequences using Attribute Grammars

Recognizing that certain biological functions can be associated with specific DNA sequences has led various fields of biology to adopt the notion of the genetic part. This concept provides a finer level of granularity than the traditional notion of the gene. However, a formal method for relating a set of parts to a function has not yet emerged. Synthetic biology both demands such a formalism and provides an ideal setting for testing hypotheses about relationships between DNA sequences and phenotypes beyond the gene-centric methods used in genetics. Attribute grammars are used in computer science to translate the text of a program's source code into the computational operations it represents. By associating attributes with parts, modifying the values of these attributes using rules that describe the structure of DNA sequences, and using a multi-pass compilation process, it is possible to translate DNA sequences into molecular interaction network models. These capabilities are illustrated by simple example grammars expressing how gene expression rates depend on single or multiple parts. The translation process is validated by systematically generating, translating, and simulating the phenotype of all the sequences in the design space generated by a small library of genetic parts. Attribute grammars represent a flexible framework connecting parts with models of biological function. They will be instrumental for building mathematical models of libraries of genetic constructs synthesized to characterize the function of genetic parts. This formalism is also expected to provide a solid foundation for the development of computer-assisted design applications for synthetic biology.
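The idea of attaching attributes to parts and computing model parameters from sequence structure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the grammar used in the paper: the part names, the `strength` values, and the single rule "cassette -> promoter rbs gene terminator" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    name: str
    kind: str               # "promoter", "rbs", "gene", or "terminator"
    strength: float = 0.0   # hypothetical attribute value, arbitrary units


# Hypothetical part library; names and numbers are illustrative only.
LIBRARY = {
    "pStrong": Part("pStrong", "promoter", 10.0),
    "pWeak": Part("pWeak", "promoter", 1.0),
    "rbsA": Part("rbsA", "rbs", 0.5),
    "geneX": Part("geneX", "gene"),
    "term1": Part("term1", "terminator"),
}

# Assumed structural rule: cassette -> promoter rbs gene terminator
CASSETTE = ("promoter", "rbs", "gene", "terminator")


def parse_cassettes(names):
    """Syntactic pass: check the part sequence against the cassette rule."""
    parts = [LIBRARY[n] for n in names]
    if len(parts) % len(CASSETTE) != 0:
        raise ValueError("sequence is not a whole number of cassettes")
    cassettes = []
    for i in range(0, len(parts), len(CASSETTE)):
        group = parts[i:i + len(CASSETTE)]
        if tuple(p.kind for p in group) != CASSETTE:
            raise ValueError(f"invalid cassette starting at part {i}")
        cassettes.append(group)
    return cassettes


def translate(names):
    """Semantic pass: derive per-gene rate parameters from part attributes."""
    model = {}
    for promoter, rbs, gene, _terminator in parse_cassettes(names):
        model[gene.name] = {
            "transcription_rate": promoter.strength,
            "translation_rate": rbs.strength,
        }
    return model
```

Calling `translate(["pStrong", "rbsA", "geneX", "term1"])` yields a small parameter dictionary for `geneX`, while a sequence that violates the rule raises `ValueError`. Separating the syntactic and semantic passes loosely mirrors the multi-pass compilation the abstract describes.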

Edinburgh Research Explorer

The University of Manchester - Institutional Repository

Directed acyclic graph kernels for structural RNA analysis

Author: B Knudsen
B Schölkopf
CB Do
D Haussler
D Sankoff
DB Searls
DM Tax
E Rivas
EK Freyhult
H Kiryu
H Saigo
I Holmes
IL Hofacker
IL Hofacker
J Hertel
J Hertel
JD Thompson
JS McCaskill
JS Pedersen
JW Brown
K Sato
Kengo Sato
Kiyoshi Asai
MA Rosenblad
P Pacheco
RD Dowell
RE Fan
RJ Klein
S Washietl
S Washietl
S Will
SR Eddy
SR Eddy
SR Eddy
T Babak
T Kin
Toutai Mituyama
W Deng
Y Sakakibara
Y Sakakibara
Y Sakakibara
Yasubumi Sakakibara
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract. Background: Recent discoveries of a large variety of important roles for non-coding RNAs (ncRNAs) have been reported by numerous researchers. In order to analyze ncRNAs by kernel methods including support vector machines, we propose stem kernels as an extension of string kernels for measuring the similarities between two RNA sequences from the viewpoint of secondary structures. However, applying stem kernels directly to large data sets of ncRNAs is impractical due to their computational complexity. Results: We have developed a new technique based on directed acyclic graphs (DAGs) derived from base-pairing probability matrices of RNA sequences that significantly increases the computation speed of stem kernels. Furthermore, we propose profile-profile stem kernels for multiple alignments of RNA sequences which utilize base-pairing probability matrices for multiple alignments instead of those for individual sequences. Our kernels outperformed the existing methods with respect to the detection of known ncRNAs and kernel hierarchical clustering. Conclusion: Stem kernels can be utilized as a reliable similarity measure of structural RNAs, and can be used in various kernel-based applications.

Springer - Publisher Connector

Large-Scale Phylogenetic Analysis of Emerging Infectious Diseases

Author: A Moilanen
A Phillips
A Tehler
AR Lemmon
B Budowle
B Chang
B Grenfell
B Rannala
B Rannala
BD Redelings
BE Martina
C Ceron
C Scholtissek
D Earn
D Franz
D Janies
D Janies
D Morrison
D Pol
D Sankoff
D Searls
DJ Zwickl
DL Swofford
DL Swofford
DL Swofford
DM Hillis
DM Hillis
E Ghedin
E Holmes
E Ukkonen
EM Rubin
G Laver
H Song
J Antonovics
J Felsenstein
J Felsenstein
J Felsenstein
J Huelsenbeck
J Plotkin
J Silvertown
J Thornton
JD Thompson
JK Taubenberger
JK Taubenberger
JK Taubenberger
JK Taubenberger
JL Thorne
JP Carulli
JS Farris
JS Farris
JS Farris
K Li
K Li
K Ungchusak
KC Nixon
KC Nixon
KP White
L Wang
L Watrous
LA Salter
LH Taylor
LR Foulds
M Gammelin
M Gibbs
M Koopmans
M Metzker
MA Charleston
MA Marra
MD Hendy
MJ Brauer
N Saitou
NM Ferguson
NM Ferguson
P Palese
PA Goloboff
PA Goloboff
PA Rota
PO Lewis
Q Wang
R Fleissner
RG Webster
RM Bush
RM Bush
RM Bush
RS Ross
S Lau
S Li
S Morse
S Poe
T Fanning
T Grant
T Ksiazek
The Chinese SARS Molecular Epidemiology Consortium
W Hennig
W Li
W Wheeler
W Wheeler
WC Wheeler
WC Wheeler
WM Fitch
WM Fitch
WM Fitch
Y Guan
Y Guan
Y Lin
Y Suzuki
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2008

Microorganisms that cause infectious diseases present critical issues of national security, public health, and economic welfare. For example, in recent years, highly pathogenic strains of avian influenza have emerged in Asia, spread through Eastern Europe, and threaten to become pandemic. As demonstrated by the coordinated response to Severe Acute Respiratory Syndrome (SARS) and influenza, agents of infectious disease are being addressed via large-scale genomic sequencing. The goal of genomic sequencing projects is to rapidly put large amounts of data in the public domain to accelerate research on disease surveillance, treatment, and prevention. However, our ability to derive information from large comparative genomic datasets lags far behind acquisition. Here we review the computational challenges of comparative genomic analyses, specifically sequence alignment and reconstruction of phylogenetic trees. We present novel analytical results from two important infectious diseases, Severe Acute Respiratory Syndrome (SARS) and influenza.

SARS and influenza have similarities and important differences both as biological and comparative genomic analysis problems. Influenza viruses (Orthomyxoviridae) are RNA based. Current evidence indicates that influenza viruses originate in aquatic birds from wild populations. Influenza has been studied for decades via well-coordinated international efforts. These efforts center on surveillance via antibody characterization of the hemagglutinin (HA) and neuraminidase (N) proteins of the circulating strains to inform vaccine design. However, we still do not have a clear understanding of: 1) various transmission pathways, such as the role of intermediate hosts such as swine and domestic birds, and 2) the key mutation and genomic recombination events that underlie periodic pandemics of influenza. In the past 30 years, sequence data from HA and N loci have become an important data type. In the past year, full genomic data have become prominent. These data present exciting opportunities to address unanswered questions in influenza pandemics.

SARS is caused by a previously unrecognized lineage of coronavirus, SARS-CoV, which like influenza has an RNA-based genome. Although SARS-CoV is widely believed to have originated in animals, there remains disagreement over the candidate animal source that led to the original outbreak of SARS. In contrast to the long history of the study of influenza, SARS was only recognized in late 2002, and the virus that causes SARS has been documented primarily by genomic sequencing.

In the past, most studies of influenza were performed on a limited number of isolates and genes suited to a particular problem. Major goals in science today are to understand emerging diseases in broad geographic, environmental, societal, biological, and genomic contexts. Synthesizing diverse information brought together by various researchers is important to find out what can be done to prevent future outbreaks {JON03}. Thus comprehensive means to organize and analyze large amounts of diverse information are critical. For example, the relationships of isolates and patterns of genomic change observed in large datasets might not be consistent with hypotheses formed on partial data. Moreover, when researchers rely on partial datasets, they restrict the range of possible discoveries.

Phylogenetics is well suited to the complex task of understanding emerging infectious disease. Phylogenetic analyses can test many hypotheses by comparing diverse isolates collected from various hosts, environments, and points in time and organizing these data into various evolutionary scenarios. The products of a phylogenetic analysis are a graphical tree of ancestor-descendant relationships and an inferred summary of mutations, recombination events, host shifts, and the geographic and temporal spread of the viruses. However, this synthesis comes at a price. The cost of computation of phylogenetic analysis expands combinatorially as the number of isolates considered increases. Thus, large datasets like those currently produced are commonly considered intractable. We address this problem with synergistic development of heuristic tree search strategies and parallel computing.

Affiliation: Janies, D., Ohio State University; United States
Affiliation: Pol, Diego, Ohio State University; United States. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentin
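The combinatorial growth in tree search mentioned in the abstract above can be made concrete with a standard counting result that the abstract itself does not state: for n labeled taxa (n ≥ 3) there are (2n − 5)!! = 3 × 5 × … × (2n − 5) distinct unrooted binary trees. The function name below is hypothetical; the sketch only evaluates that product.

```python
def num_unrooted_binary_trees(n_taxa: int) -> int:
    """Count distinct unrooted binary trees on n labeled taxa: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (empty product for n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, num_unrooted_binary_trees(n))
```

Ten isolates already admit 2,027,025 candidate topologies, which is why exhaustive evaluation is impractical for large datasets and heuristic search is used instead.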

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

Category Theoretic Analysis of Hierarchical Protein Materials and Social Networks

Author: A Fritsch
AL Barabasi
AL Barabasi
B Alberts
BC Pierce
CM Schneider
D Eisenberg
D Taylor
DA Fletcher
David I. Spivak
DB Searls
DI Spivak
E Moggi
E Rodriguez
Elizabeth Wood
EM Marcotte
EM Marcotte
FW Lawvere
GB Olson
H Jeong
H Jeong
H Peterlik
I Lee
J Aizenberg
J Verdasca
JD Currey
K Hofstetter
Laurent Kreplak
M Barr
M Moortgat
Markus J. Buehler
MD Hauser
MJ Buehler
MJ Buehler
MS Szalay
N Chomsky
N Huebsch
NM Pugno
O Mason
P Csermely
P Fratzl
P Nurse
P Wadler
R Brown
R Lakes
R Milo
R Paparcone
R Pastor-Satorras
RC Strohman
RT Oehrle
S Awodey
S Eilenberg
S Keten
SM Lane
SW Cranford
T Ackbarow
Tristan Giesa
WW Powell
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2011

Materials in biology span all the scales from Angstroms to meters and typically consist of complex hierarchical assemblies of simple building blocks. Here we describe an application of category theory to describe structural and resulting functional properties of biological protein materials by developing so-called ologs. An olog is like a “concept web” or “semantic network” except that it follows a rigorous mathematical formulation based on category theory. This key difference ensures that an olog is unambiguous, highly adaptable to evolution and change, and suitable for sharing concepts with other ologs. We consider simple cases of beta-helical and amyloid-like protein filaments subjected to axial extension and develop an olog representation of their structural and resulting mechanical properties. We also construct a representation of a social network in which people send text messages to their nearest neighbors and act as a team to perform a task. We show that the olog for the protein and the olog for the social network feature identical category-theoretic representations, and we proceed to precisely explicate the analogy or isomorphism between them. The examples presented here demonstrate that the intrinsic nature of a complex system, which in particular includes a precise relationship between structure and function at different hierarchical levels, can be effectively represented by an olog. This, in turn, allows for comparative studies between disparate materials or fields of application, and results in novel approaches to derive functionality in the design of de novo hierarchical systems. We discuss opportunities and challenges associated with the description of complex biological materials by using ologs as a powerful tool for analysis and design in the context of materiomics, and we present the potential impact of this approach for engineering, life sciences, and medicine.

Presidential Early Career Award for Scientists and Engineers (N000141010562)
United States. Army Research Office. Multidisciplinary University Research Initiative (W911NF0910541)
United States. Office of Naval Research (grant N000141010841)
Massachusetts Institute of Technology. Dept. of Mathematics
Studienstiftung des deutschen Volkes
Clark Barwick
Jacob Luri

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Publikationsserver der RWTH Aachen University

The Roots of Bioinformatics in Theoretical Biology

Author: A Anderson
A Boiteux
A Crombach
A Crombach
A Lindenmayer
A Lindenmayer
A Marée
A Marée
A Neyfakh
A Turing
A Varma
A Wagner
A Wagner
B Goodwin
B Hesper
B Turner
C Hewitt
C Honk
C Pál
CH Waddington
D Gillespie
D Konings
D Konings
David B. Searls
E Koonin
E van Nimwegen
EP Odum
F Crick
F Graner
F Rosenblatt
FK de Boer
G Lance
G Odell
H Abelson
H Kacser
J Draghi
J Draghi
J Griffith
J Hagen
J Holland
L Hurst
L Segel
L Von Bertalanffy
L Von Bertalanffy
L Wolpert
M Boerlijst
M Covert
M Dayhoff
M Dayhoff
M Huynen
M Huynen
M Huynen
M Kertesz
M Kozak
M Minsky
M Szekely
M Thomson
M Van Hoek
N Batada
N Savill
N Stoletzki
O Mastenbroek
O Soyer
P Hogeweg
P Hogeweg
P Hogeweg
P Hogeweg
P Hogeweg
P Hogeweg
P Hogeweg
P Hogeweg
P Hogeweg
P Hogeweg
P Hogeweg
P Macnaughton-Smith
P Nurse
P Schuster
P Sneath
Paulien Hogeweg
R Goldstein
R May
R Rosen
S Freilich
S Huang
S Huang
S Kauffman
S Papert
S Rafelski
W Gu
Publication venue: Public Library of Science
Publication date: 01/03/2011

From the late 1980s onward, the term “bioinformatics” has mostly been used to refer to computational methods for comparative analysis of genome data. However, the term was originally defined more widely as the study of informatic processes in biotic systems. In this essay, I will trace this early history (from a personal point of view) and I will argue that the original meaning of the term is re-emerging.

Diabetes Alters Intracellular Calcium Transients in Cardiac Endothelial Cells

Diabetic cardiomyopathy (DCM) is a diabetic complication that results in myocardial dysfunction independent of other etiological factors. Abnormal intracellular calcium ([Ca2+]i) homeostasis has been implicated in DCM and may precede clinical manifestation. Studies in cardiomyocytes have shown that diabetes results in impaired [Ca2+]i homeostasis due to altered sarcoplasmic reticulum Ca2+ ATPase (SERCA) and sodium-calcium exchanger (NCX) activity. Importantly, altered calcium homeostasis may also be involved in diabetes-associated endothelial dysfunction, including impaired endothelium-dependent relaxation and a diminished capacity to generate nitric oxide (NO), elevated cell adhesion molecules, and decreased angiogenic growth factors. However, the effect of diabetes on Ca2+ regulatory mechanisms in cardiac endothelial cells (CECs) remains unknown. The objective of this study was to determine the effect of diabetes on [Ca2+]i homeostasis in CECs in a streptozotocin-induced rat model of DCM. DCM-associated cardiac fibrosis was confirmed using picrosirius red staining of the myocardium. CECs isolated from the myocardium of diabetic and wild-type rats were loaded with Fura-2, and UTP-evoked [Ca2+]i transients were compared under various combinations of SERCA, plasma membrane Ca2+ ATPase (PMCA), and NCX inhibitors. Diabetes resulted in significant alterations in SERCA and NCX activities in CECs during [Ca2+]i sequestration and efflux, respectively, while no difference in PMCA activity between diabetic and wild-type cells was observed. These results improve our understanding of how diabetes affects calcium regulation in CECs, and may contribute to the development of new therapies for DCM treatment.

Graph Transformation in Molecular Biology

Author: A. Cayley
A. Ehrenfeucht
A. Kister
A. Regev
A. Regev
A.M. Lesk
B. Mayoh
C. Flamm
C. Reidys
D. Gernert
D. Searls
D. Searls
D. Weininger
D. Weininger
D. Weininger
D. Westhead
E. Rivas
F. Fringuelli
F. Rosselló
G. Benkö
G. Benkö
G. Benkö
H. Ehrig
H. Kitano
H. McAdams
H. Seo
H.S. Chan
I. Hofacker
I. Zevedei-Oancea
J. McCaskill
J. Richardson
J. Schultz
J.M. Bower
K. Culik II
K. Tomita
L.T. Fan
M. Beck
M. Curti
M. Kanehisa
M. Zuker
M.K. Yadav
M.S. Waterman
N. Abe
O. Polanski
P. Dittrich
P. Goss
P. Speroni
R. Durbin
R. Heckel
S. Eker
S. Fujita
S. Fujita
T. Przytycka
V. Brendel
V. Danos
V. Danos
V. Danos
W. Fontana
Y. Deville
Y. Sakakibara
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2005

In the beginning, one of the main fields of application of graph transformation was biology, and more specifically morphology. Later, however, it was as if the biological applications had been left aside by the graph transformation community, only to move back into the mainstream in the last few years with a new interest in molecular biology. In this paper, we review several fields of application of graph grammars in molecular biology, including the modeling of higher-dimensional structures of biomolecules, the description of biochemical reactions, the analysis of metabolic pathways, and their potential use in computational systems biology.

CiteSeerX

Disease-Aging Network Reveals Significant Roles of Aging Genes in Connecting Genetic Diseases

Author: A Budovsky
A Budovsky
A Friedman
A Kowald
A Kriete
A Ozgur
AL Barabasi
C Soti
D Harman
David B. Searls
DJ Watts
E Ravasz
G Jin
GRG Lanckriet
H Kitano
H Xue
HD Osiewacz
HJ Kiss
I Feldman
J Hasty
JDJ Han
Jiguang Wang
JP de Magalhaes
JP de Magalhaes
JR Managbanag
KI Goh
L Hayflick
Luonan Chen
M Wolfson
MEJ Newman
P Shannon
P Zuppan
PF Jonsson
Q Cui
R Albert
R Bell
RI Kondor
S Karni
S Maere
S Maslov
S Peri
S Vasto
Shihua Zhang
T Ideker
T Ishunina
TBL Kirkwood
U Brandes
U Stelzl
X Jiang
X Wu
Xiang-Sun Zhang
Y Li
Yong Wang
Z Spiro
Z Tu
Publication venue: Public Library of Science
Publication date: 01/09/2009

One of the challenging problems in biology and medicine is exploring the underlying mechanisms of genetic diseases. Recent studies suggest that the relationship between genetic diseases and the aging process is important in understanding the molecular mechanisms of complex diseases. Although some intricate associations have been investigated for a long time, the studies are still in their early stages. In this paper, we construct a human disease-aging network to study the relationship between aging genes and genetic disease genes. Specifically, we integrate human protein-protein interactions (PPIs), disease-gene associations, aging-gene associations, and physiological system-based genetic disease classification information in a single graph-theoretic framework and find that (1) human disease genes are much closer to aging genes than expected by chance and (2) diseases can be categorized into two types according to their relationships with aging. Type I diseases have their genes significantly close to aging genes, while type II diseases do not. Furthermore, we examine the topological characteristics of the disease-aging network from a systems perspective. Theoretical results reveal that the genes of type I diseases occupy a central position in the PPI network, while those of type II diseases do not. (3) More importantly, we define an asymmetric closeness based on the PPI network to describe relationships between diseases, and find that aging genes make a significant contribution to associations among diseases, especially among type I diseases. In conclusion, the network-based study provides not only evidence for the intricate relationship between the aging process and genetic diseases, but also biological implications for probing the nature of human diseases.
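The notion of disease genes being "close" to aging genes in a PPI network can be illustrated with shortest-path distances. The abstract does not give the paper's exact measure, so the following is only a sketch under assumptions: distance is taken as the number of interaction hops to the nearest aging gene, and the network `TOY_PPI` with its gene labels is entirely hypothetical.

```python
from collections import deque

# Hypothetical toy PPI network as a symmetric adjacency map: A - B - C - D - E
TOY_PPI = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}


def distance_to_nearest(graph, sources):
    """Multi-source BFS: hop count from every reachable node to its nearest source."""
    dist = {s: 0 for s in sources if s in graph}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def mean_distance(graph, genes, aging_genes):
    """Average hop distance from the given genes to their nearest aging gene."""
    dist = distance_to_nearest(graph, aging_genes)
    reachable = [dist[g] for g in genes if g in dist]
    if not reachable:
        raise ValueError("no gene in the set can reach an aging gene")
    return sum(reachable) / len(reachable)
```

With aging gene `A`, the set `{"B", "C"}` has mean distance 1.5 and `{"D", "E"}` has mean distance 3.5. A test of "closer than expected by chance" would additionally compare such a value against randomly sampled gene sets of the same size, which this sketch leaves out.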