
    IgTM: An algorithm to predict transmembrane domains and topology in proteins

    Background: Due to their role as receptors or transporters, membrane proteins play a key role in many important biological functions. In our work we used Grammatical Inference (GI) to localize transmembrane segments. Our GI process is based specifically on the inference of Even Linear Languages. Results: We obtained values close to 80% in both specificity and sensitivity. Six datasets were used for the experiments, considering different encodings for the input sequences. An encoding that includes the topology changes in the sequence (from inside and outside the membrane to it and vice versa) allowed us to obtain the best results. This software is publicly available at: http://www.dsic.upv.es/users/tlcc/bio/bio.html. Conclusion: We compared our results with other well-known methods, which obtain slightly better precision. However, this work shows that it is possible to apply Grammatical Inference techniques effectively to bioinformatics problems.

    Modeling Structure-Function Relationships in Synthetic DNA Sequences using Attribute Grammars

    Recognizing that certain biological functions can be associated with specific DNA sequences has led various fields of biology to adopt the notion of the genetic part. This concept provides a finer level of granularity than the traditional notion of the gene. However, a formal method for relating a set of parts to a function has not yet emerged. Synthetic biology both demands such a formalism and provides an ideal setting for testing hypotheses about relationships between DNA sequences and phenotypes beyond the gene-centric methods used in genetics. Attribute grammars are used in computer science to translate the text of a program source code into the computational operations it represents. By associating attributes with parts, modifying the value of these attributes using rules that describe the structure of DNA sequences, and using a multi-pass compilation process, it is possible to translate DNA sequences into molecular interaction network models. These capabilities are illustrated by simple example grammars expressing how gene expression rates depend on single or multiple parts. The translation process is validated by systematically generating, translating, and simulating the phenotype of all the sequences in the design space generated by a small library of genetic parts. Attribute grammars represent a flexible framework connecting parts with models of biological function. They will be instrumental for building mathematical models of libraries of genetic constructs synthesized to characterize the function of genetic parts. This formalism is also expected to provide a solid foundation for the development of computer-assisted design applications for synthetic biology.

    Directed acyclic graph kernels for structural RNA analysis

    Background: Recent discoveries of a large variety of important roles for non-coding RNAs (ncRNAs) have been reported by numerous researchers. In order to analyze ncRNAs by kernel methods including support vector machines, we propose stem kernels as an extension of string kernels for measuring the similarities between two RNA sequences from the viewpoint of secondary structures. However, applying stem kernels directly to large data sets of ncRNAs is impractical due to their computational complexity. Results: We have developed a new technique based on directed acyclic graphs (DAGs) derived from base-pairing probability matrices of RNA sequences that significantly increases the computation speed of stem kernels. Furthermore, we propose profile-profile stem kernels for multiple alignments of RNA sequences which utilize base-pairing probability matrices for multiple alignments instead of those for individual sequences. Our kernels outperformed the existing methods with respect to the detection of known ncRNAs and kernel hierarchical clustering. Conclusion: Stem kernels can be utilized as a reliable similarity measure of structural RNAs, and can be used in various kernel-based applications.

    Category Theoretic Analysis of Hierarchical Protein Materials and Social Networks

    Materials in biology span all the scales from Angstroms to meters and typically consist of complex hierarchical assemblies of simple building blocks. Here we describe an application of category theory to describe structural and resulting functional properties of biological protein materials by developing so-called ologs. An olog is like a “concept web” or “semantic network” except that it follows a rigorous mathematical formulation based on category theory. This key difference ensures that an olog is unambiguous, highly adaptable to evolution and change, and suitable for sharing concepts with other ologs. We consider simple cases of beta-helical and amyloid-like protein filaments subjected to axial extension and develop an olog representation of their structural and resulting mechanical properties. We also construct a representation of a social network in which people send text messages to their nearest neighbors and act as a team to perform a task. We show that the olog for the protein and the olog for the social network feature identical category-theoretic representations, and we proceed to precisely explicate the analogy or isomorphism between them. The examples presented here demonstrate that the intrinsic nature of a complex system, which in particular includes a precise relationship between structure and function at different hierarchical levels, can be effectively represented by an olog. This, in turn, allows for comparative studies between disparate materials or fields of application, and results in novel approaches to derive functionality in the design of de novo hierarchical systems. 
We discuss opportunities and challenges associated with the description of complex biological materials by using ologs as a powerful tool for analysis and design in the context of materiomics, and we present the potential impact of this approach for engineering, life sciences, and medicine. Presidential Early Career Award for Scientists and Engineers (N000141010562); United States. Army Research Office. Multidisciplinary University Research Initiative (W911NF0910541); United States. Office of Naval Research (grant N000141010841); Massachusetts Institute of Technology. Dept. of Mathematics; Studienstiftung des deutschen Volkes; Clark Barwick; Jacob Luri

    Oxidant-NO dependent gene regulation in dogs with type I diabetes: impact on cardiac function and metabolism

    Background: The mechanisms responsible for the cardiovascular mortality in type I diabetes (DM) have not been defined completely. We have shown in conscious dogs with DM that: 1) baseline coronary blood flow (CBF) was significantly decreased, 2) endothelium-dependent (ACh) coronary vasodilation was impaired, and 3) reflex cholinergic NO-dependent coronary vasodilation was selectively depressed. The most likely mechanism responsible for the depressed reflex cholinergic NO-dependent coronary vasodilation was the decreased bioactivity of NO from the vascular endothelium. The goal of this study was to investigate changes in cardiac gene expression in a canine model of alloxan-induced type I diabetes. Methods: Mongrel dogs were chronically instrumented and divided into two groups: one normal and the other diabetic. In the diabetic group, the dogs were injected with alloxan monohydrate (40-60 mg/kg iv) over 1 min. The global changes in cardiac gene expression in dogs with alloxan-induced diabetes were studied using the Affymetrix Canine Array. Cardiac RNA was extracted from the control and DM (n = 4). Results: The array data revealed that 797 genes were differentially expressed (P < 0.01; fold change of at least ±2). 150 genes were expressed at significantly greater levels in diabetic dogs and 647 were significantly reduced. There was no change in eNOS mRNA. There was upregulation of some NADPH oxidase subunits (gp91 by 2.2-fold, P < 0.03), downregulation of SOD1 (3-fold, P < 0.001), and a decrease (4- to 40-fold) in a large number of genes encoding mitochondrial enzymes. In addition, there was downregulation of Ca2+ cycling genes (ryanodine receptor; SERCA2 calcium ATPase) and structural proteins (actin alpha). 
Of particular interest are genes involved in glutathione metabolism (glutathione peroxidase 1, glutathione reductase and glutathione S-transferase), which were markedly downregulated. Conclusion: Our findings suggest that type I diabetes might have a direct effect on the heart by impairing NO bioavailability through oxidative stress and perhaps lipid peroxidases.
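
The selection rule quoted in the abstract above (P < 0.01 and a fold change of at least ±2) can be written as a small filter. The following is only a minimal sketch: the record type, the function names, and the toy values are hypothetical and are not taken from the study, and fold changes are assumed to be signed (positive for higher expression in diabetic animals, negative for lower).

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    """Hypothetical record for one probe set; not the study's data format."""
    name: str
    fold_change: float  # signed: > 0 higher in diabetic group, < 0 lower
    p_value: float


def differentially_expressed(results, p_cutoff=0.01, min_abs_fold=2.0):
    """Keep genes with P below the cutoff and |fold change| at or above the minimum."""
    return [
        r for r in results
        if r.p_value < p_cutoff and abs(r.fold_change) >= min_abs_fold
    ]


def split_by_direction(hits):
    """Separate the selected genes into up- and down-regulated lists."""
    up = [r for r in hits if r.fold_change > 0]
    down = [r for r in hits if r.fold_change < 0]
    return up, down


# Invented toy values, for illustration only.
toy = [
    GeneResult("gene_a", 2.5, 0.004),    # passes, up
    GeneResult("gene_b", -3.0, 0.0005),  # passes, down
    GeneResult("gene_c", 2.2, 0.03),     # fails the P cutoff
    GeneResult("gene_d", -1.4, 0.001),   # fails the fold-change cutoff
]
up, down = split_by_direction(differentially_expressed(toy))
```

With these toy records, `up` holds only `gene_a` and `down` holds only `gene_b`. A real array analysis would also need normalization and multiple-testing considerations, which this sketch leaves out.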

    Evolutionary Patterning: A Novel Approach to the Identification of Potential Drug Target Sites in Plasmodium falciparum

    Malaria continues to be the most lethal protozoan disease of humans. Drug development programs exhibit a high attrition rate and parasite resistance to chemotherapeutic drugs exacerbates the problem. Strategies that limit the development of resistance and minimize host side-effects are therefore of major importance. In this study, a novel approach, termed evolutionary patterning (EP), was used to identify suitable drug target sites that would minimize the emergence of parasite resistance. EP uses the ratio of non-synonymous to synonymous substitutions (ω) to assess the patterns of evolutionary change at individual codons in a gene and to identify codons under the most intense purifying selection (ω≤0.1). The extreme evolutionary pressure to maintain these residues implies that resistance mutations are highly unlikely to develop, which makes them attractive chemotherapeutic targets. Method validation included a demonstration that none of the residues providing pyrimethamine resistance in the Plasmodium falciparum dihydrofolate reductase enzyme were under extreme purifying selection. To illustrate the EP approach, the putative P. falciparum glycerol kinase (PfGK) was used as an example. The gene was cloned and the recombinant protein was active in vitro, verifying the database annotation. Parasite and human GK gene sequences were analyzed separately as part of protozoan and metazoan clades, respectively, and key differences in the evolutionary patterns of the two molecules were identified. Potential drug target sites containing residues under extreme evolutionary constraints were selected. Structural modeling was used to evaluate the functional importance and drug accessibility of these sites, which narrowed down the number of candidates. The strategy of evolutionary patterning and refinement with structural modeling addresses the problem of targeting sites to minimize the development of drug resistance. 
This represents a significant advance for drug discovery programs in malaria and other infectious diseases.
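
The codon-selection criterion described in the abstract above (ω ≤ 0.1 marking the most intense purifying selection) amounts to a threshold filter over per-codon estimates. The sketch below assumes the ω values have already been estimated by some codon-substitution analysis, which is outside its scope. The function name, the dictionary layout, and the toy numbers are hypothetical.

```python
def constrained_codons(omega_by_codon, threshold=0.1):
    """Return codon positions whose omega (dN/dS) is at or below the threshold.

    omega_by_codon maps a 1-based codon position to its estimated omega.
    The layout is an assumption made for this illustration.
    """
    return sorted(
        position
        for position, omega in omega_by_codon.items()
        if omega <= threshold
    )


# Invented per-codon estimates, for illustration only.
toy_omega = {1: 0.05, 2: 0.42, 3: 0.10, 4: 1.30, 5: 0.01}
candidates = constrained_codons(toy_omega)
```

Here `candidates` contains positions 1, 3, and 5. In the approach the abstract describes, such positions would then be screened further by structural modeling for functional importance and drug accessibility.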

    Large-Scale Phylogenetic Analysis of Emerging Infectious Diseases

    Microorganisms that cause infectious diseases present critical issues of national security, public health, and economic welfare. For example, in recent years, highly pathogenic strains of avian influenza have emerged in Asia, spread through Eastern Europe, and threaten to become pandemic. As demonstrated by the coordinated response to Severe Acute Respiratory Syndrome (SARS) and influenza, agents of infectious disease are being addressed via large-scale genomic sequencing. The goal of genomic sequencing projects is to rapidly put large amounts of data in the public domain to accelerate research on disease surveillance, treatment, and prevention. However, our ability to derive information from large comparative genomic datasets lags far behind acquisition. Here we review the computational challenges of comparative genomic analyses, specifically sequence alignment and reconstruction of phylogenetic trees. We present novel analytical results from two important infectious diseases, SARS and influenza. SARS and influenza have similarities and important differences both as biological and comparative genomic analysis problems. Influenza viruses (Orthomyxoviridae) are RNA based. Current evidence indicates that influenza viruses originate in wild populations of aquatic birds. Influenza has been studied for decades via well-coordinated international efforts. These efforts center on surveillance via antibody characterization of the hemagglutinin (HA) and neuraminidase (N) proteins of the circulating strains to inform vaccine design. However, we still do not have a clear understanding of: 1) various transmission pathways, such as the role of intermediate hosts such as swine and domestic birds, and 2) the key mutation and genomic recombination events that underlie periodic pandemics of influenza. In the past 30 years, sequence data from HA and N loci have become an important data type. 
In the past year, full genomic data have become prominent. These data present exciting opportunities to address unanswered questions in influenza pandemics. SARS is caused by a previously unrecognized lineage of coronavirus, SARS-CoV, which like influenza has an RNA-based genome. Although SARS-CoV is widely believed to have originated in animals, there remains disagreement over the candidate animal source that led to the original outbreak of SARS. In contrast to the long history of the study of influenza, SARS was only recognized in late 2002 and the virus that causes SARS has been documented primarily by genomic sequencing. In the past, most studies of influenza were performed on a limited number of isolates and genes suited to a particular problem. Major goals in science today are to understand emerging diseases in broad geographic, environmental, societal, biological, and genomic contexts. Synthesizing diverse information brought together by various researchers is important to find out what can be done to prevent future outbreaks {JON03}. Thus comprehensive means to organize and analyze large amounts of diverse information are critical. For example, the relationships of isolates and patterns of genomic change observed in large datasets might not be consistent with hypotheses formed on partial data. Moreover, when researchers rely on partial datasets, they restrict the range of possible discoveries. Phylogenetics is well suited to the complex task of understanding emerging infectious disease. Phylogenetic analyses can test many hypotheses by comparing diverse isolates collected from various hosts, environments, and points in time and organizing these data into various evolutionary scenarios. The products of a phylogenetic analysis are a graphical tree of ancestor-descendant relationships and an inferred summary of mutations, recombination events, host shifts, and geographic and temporal spread of the viruses. However, this synthesis comes at a price. 
The cost of computation of phylogenetic analysis expands combinatorially as the number of isolates considered increases. Thus, large datasets like those currently produced are commonly considered intractable. We address this problem with synergistic development of heuristic tree search strategies and parallel computing. Affiliation: Janies, D., Ohio State University, United States. Affiliation: Pol, Diego, Ohio State University, United States; Consejo Nacional de Investigaciones Científicas y Técnicas, Argentin
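
The combinatorial growth mentioned in the abstract above can be made concrete with a standard result: the number of distinct unrooted, fully bifurcating tree topologies for n labeled taxa is (2n − 5)!!, the product of the odd numbers from 3 to 2n − 5. The short sketch below illustrates that general formula. It does not reproduce the authors' heuristics, and the function name is an assumption.

```python
def unrooted_binary_trees(n_taxa):
    """Count unrooted, fully bifurcating topologies for n labeled taxa.

    Uses the double factorial (2n - 5)!! = 3 * 5 * ... * (2n - 5),
    which is defined here for n >= 3 (a single topology when n == 3).
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted binary tree")
    count = 1
    for odd in range(3, 2 * n_taxa - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= odd
    return count


# The search space grows far faster than the number of isolates.
for n in (4, 5, 10, 20):
    print(n, unrooted_binary_trees(n))
```

Ten taxa already admit 2,027,025 topologies, which is why exhaustive search is impractical for large isolate sets and heuristic search becomes necessary.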

    Disease-Aging Network Reveals Significant Roles of Aging Genes in Connecting Genetic Diseases

    One of the challenging problems in biology and medicine is exploring the underlying mechanisms of genetic diseases. Recent studies suggest that the relationship between genetic diseases and the aging process is important in understanding the molecular mechanisms of complex diseases. Although some intricate associations have been investigated for a long time, the studies are still in their early stages. In this paper, we construct a human disease-aging network to study the relationship between aging genes and genetic disease genes. Specifically, we integrate human protein-protein interactions (PPIs), disease-gene associations, aging-gene associations, and physiological system–based genetic disease classification information in a single graph-theoretic framework and find that (1) human disease genes are much closer to aging genes than expected by chance; and (2) diseases can be categorized into two types according to their relationships with aging. Type I diseases have their genes significantly close to aging genes, while type II diseases do not. Furthermore, we examine the topological characteristics of the disease-aging network from a systems perspective. Theoretical results reveal that the genes of type I diseases occupy a central position in the PPI network while those of type II diseases do not. (3) More importantly, we define an asymmetric closeness based on the PPI network to describe relationships between diseases, and find that aging genes make a significant contribution to associations among diseases, especially among type I diseases. In conclusion, the network-based study provides not only evidence for the intricate relationship between the aging process and genetic diseases, but also biological implications for probing the nature of human diseases.

    Diabetes Alters Intracellular Calcium Transients in Cardiac Endothelial Cells

    Diabetic cardiomyopathy (DCM) is a diabetic complication that results in myocardial dysfunction independent of other etiological factors. Abnormal intracellular calcium ([Ca2+]i) homeostasis has been implicated in DCM and may precede clinical manifestation. Studies in cardiomyocytes have shown that diabetes results in impaired [Ca2+]i homeostasis due to altered sarcoplasmic reticulum Ca2+ ATPase (SERCA) and sodium-calcium exchanger (NCX) activity. Importantly, altered calcium homeostasis may also be involved in diabetes-associated endothelial dysfunction, including impaired endothelium-dependent relaxation and a diminished capacity to generate nitric oxide (NO), elevated cell adhesion molecules, and decreased angiogenic growth factors. However, the effect of diabetes on Ca2+ regulatory mechanisms in cardiac endothelial cells (CECs) remains unknown. The objective of this study was to determine the effect of diabetes on [Ca2+]i homeostasis in CECs in the rat model (streptozotocin-induced) of DCM. DCM-associated cardiac fibrosis was confirmed using picrosirius red staining of the myocardium. CECs isolated from the myocardium of diabetic and wild-type rats were loaded with Fura-2, and UTP-evoked [Ca2+]i transients were compared under various combinations of SERCA, plasma membrane Ca2+ ATPase (PMCA) and NCX inhibitors. Diabetes resulted in significant alterations in SERCA and NCX activities in CECs during [Ca2+]i sequestration and efflux, respectively, while no difference in PMCA activity between diabetic and wild-type cells was observed. These results improve our understanding of how diabetes affects calcium regulation in CECs, and may contribute to the development of new therapies for DCM treatment.