Developing improved MD codes for understanding processive cellulases
"The mechanism of action of cellulose-degrading enzymes is illuminated through a multidisciplinary collaboration that uses molecular dynamics (MD) simulations and expands the capabilities of MD codes to allow simulations of enzymes and substrates on petascale computational facilities. There is a class of glycoside hydrolase enzymes called cellulases that are thought to decrystallize and processively depolymerize cellulose using biochemical processes that are largely not understood. Understanding the mechanisms involved and improving the efficiency of this hydrolysis process through computational models and protein engineering presents a compelling grand challenge. A detailed understanding of cellulose structure, dynamics and enzyme function at the molecular level is required to direct protein engineers to the right modifications or to understand if natural thermodynamic or kinetic limits are in play. Much can be learned about processivity by conducting carefully designed molecular dynamics (MD) simulations of the binding and catalytic domains of cellulases with various substrate configurations, solvation models and thermodynamic protocols. Most of these numerical experiments, however, will require significant modification of existing code and algorithms in order to efficiently use current (terascale) and future (petascale) hardware to the degree of parallelism necessary to simulate a system of the size proposed here. This work will develop MD codes that can efficiently use terascale and petascale systems, not just for simple classical MD simulations, but also for more advanced methods, including umbrella sampling with complex restraints and reaction coordinates, transition path sampling, steered molecular dynamics, and quantum mechanical/molecular mechanical simulations of systems the size of cellulose degrading enzymes acting on cellulose." http://deepblue.lib.umich.edu/bitstream/2027.42/64203/1/jpconf8_125_012049.pd
BESC knowledgebase public portal
The BioEnergy Science Center (BESC) is undertaking large experimental campaigns to understand the biosynthesis and biodegradation of biomass and to develop biofuel solutions. BESC is generating large volumes of diverse data, including genome sequences, omics data and assay results. The purpose of the BESC Knowledgebase is to serve as a centralized repository for experimentally generated data and to provide an integrated, interactive and user-friendly analysis framework. The Portal makes available tools for visualization, integration and analysis of data either produced by BESC or obtained from external resources.
Despite WT1 binding sites in the promoter region of human and mouse nucleoporin glycoprotein 210, WT1 does not influence expression of GP210
BACKGROUND: Glycoprotein 210 (GP210) is a transmembrane component of the nuclear pore complex of metazoans, with a short carboxyterminus protruding towards the cytoplasm. Its function is unknown, but it is considered to be a major structural component of metazoan nuclear pores. Yet, our previous findings showed pronounced differences in expression levels in embryonic mouse tissues and cell lines. In order to identify factors regulating GP210, the genomic organization of human GP210 was analyzed in silico. RESULTS: The human gene was mapped to chromosome 3 and consists of 40 exons spread over 102 kb. The deduced 1887 amino acid sequence showed a high degree of alignment homology to previously reported orthologues. Experimentally we defined two transcription initiation sites, 18 and 29 bp upstream of the ATG start codon. The promoter region is characterized by a CpG island and several consensus binding motifs for gene regulatory transcription factors, including clustered sites associated with Sp1 and the Wilms' tumor suppressor gene zinc finger protein (WT1). In addition, distal to the translation start we found a (GT)n repetitive sequence, an element known for its ability to bind WT1. Homologies for these motifs could be identified in the corresponding mouse genomic region. However, experimental tetracycline-dependent induction of WT1 in SAOS osteosarcoma cells did not influence GP210 transcription. CONCLUSION: Although mouse GP210 was identified as an early response gene during induced metanephric kidney development, and WT1 binding sites were identified in the promoter region of the human GP210 gene, experimental modulation of WT1 expression did not influence expression of GP210. Therefore, WT1 is probably not regulating GP210 expression. Instead, we suggest that the identified Sp binding sites are involved.
Genome Structure of the Legume, Lotus japonicus
The legume Lotus japonicus has been widely used as a model system to investigate the genetic background of legume-specific phenomena such as symbiotic nitrogen fixation. Here, we report structural features of the L. japonicus genome. The 315.1-Mb sequences determined in this and previous studies correspond to 67% of the genome (472 Mb), and are likely to cover 91.3% of the gene space. Linkage mapping anchored 130-Mb sequences onto the six linkage groups. A total of 10 951 complete and 19 848 partial structures of protein-encoding genes were assigned to the genome. Comparative analysis of these genes revealed the expansion of several functional domains and gene families that are characteristic of L. japonicus. Synteny analysis detected traces of whole-genome duplication and the presence of synteny blocks with other plant genomes to various degrees. This study provides the first opportunity to look into the complex and unique genetic system of legumes.
Conserved synteny at the protein family level reveals genes underlying Shewanella species’ cold tolerance and predicts their novel phenotypes
© The Authors 2009. This article is distributed under the terms of the Creative Commons Attribution Noncommercial License. The definitive version was published in Functional & Integrative Genomics 10 (2010): 97-110, doi:10.1007/s10142-009-0142-y. Bacteria of the genus Shewanella can thrive in different environments and demonstrate significant variability in their metabolic and ecophysiological capabilities including cold and salt tolerance. Genomic characteristics underlying this variability across species are largely unknown. In this study, we address the problem by a comparison of the physiological, metabolic, and genomic characteristics of 19 sequenced Shewanella species. We have employed two novel approaches based on association of a phenotypic trait with the number of the trait-specific protein families (Pfam domains) and on the conservation of synteny (order in the genome) of the trait-related genes. Our first approach is top-down and involves experimental evaluation and quantification of the species’ cold tolerance followed by identification of the correlated Pfam domains and genes with a conserved synteny. The second, a bottom-up approach, predicts novel phenotypes of the species by calculating profiles of each Pfam domain among their genomes and following pair-wise correlation of the profiles and their network clustering. Using the first approach, we find a link between cold and salt tolerance of the species and the presence in the genome of a Na+/H+ antiporter gene cluster. Other cold-tolerance-related genes include peptidases, chemotaxis sensory transducer proteins, a cysteine exporter, and helicases. Using the bottom-up approach, we found several novel phenotypes in the newly sequenced Shewanella species, including degradation of aromatic compounds by an aerobic hybrid pathway in Shewanella woodyi, degradation of ethanolamine by Shewanella benthica, and propanediol degradation by Shewanella putrefaciens CN32 and Shewanella sp. W3-18-1. This research was supported by the U.S. Department of Energy (DOE) Office of Biological and Environmental Research under the Genomics: GTL Program via the Shewanella Federation consortium.
Initial sequencing and analysis of the human genome
The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/62798/1/409860a0.pd
A Fast Look-Up Algorithm for Detecting Repetitive DNA Sequences
In this paper, we focus on repetitive sequences in which short words are repeated many times, referred to as tandem repeats or microsatellites in the literature. We describe a linear time algorithm that can scan DNA sequences rapidly for tandem repeats. Recently, a new look-up technique, which uses indices calculated from non-contiguous, overlapping tuples, was employed in the FLASH program (Califano and Rigoutsos) for sequence homology search and is used here to recognize tandem repeats.
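As a rough illustration of what a linear-time scan for tandem repeats can look like, the sketch below compares each base with the base one repeat unit earlier and reports runs of matches. It is not the look-up technique from the abstract (no non-contiguous tuple indices are used); the function name `find_tandem_repeats` and the parameters `unit_len` and `min_copies` are hypothetical and chosen only for this example.

```python
def find_tandem_repeats(seq, unit_len, min_copies=3):
    """Report back-to-back repeats of a unit_len-long word in seq.

    Assumption: this is a simple period scan for one fixed unit length,
    not the tuple-index look-up method described in the abstract.
    Returns a list of (start, unit, copies) tuples.
    """
    hits = []
    run = 0  # consecutive positions j with seq[j] == seq[j - unit_len]
    n = len(seq)
    for j in range(unit_len, n + 1):
        if j < n and seq[j] == seq[j - unit_len]:
            run += 1
            continue
        # The run of matches (if any) ended just before position j.
        if run:
            start = j - run - unit_len            # first base of the periodic region
            copies = (run + unit_len) // unit_len  # whole units in that region
            if copies >= min_copies:
                hits.append((start, seq[start:start + unit_len], copies))
        run = 0
    return hits


if __name__ == "__main__":
    print(find_tandem_repeats("GGACACACACTT", unit_len=2))
```

Each position is visited once, so the scan is linear in the sequence length for a fixed `unit_len`. Note that a homopolymer such as `AAAA` is also reported as a repeat of `AA` when `unit_len=2`; a real tool would likely filter such cases.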
2-D image segmentation using minimum spanning trees
This paper presents a new algorithm for partitioning a gray-level image into connected homogeneous regions. The novelty of this algorithm lies in the fact that by constructing a minimum spanning tree representation of a gray-level image, it reduces a region partitioning problem to a minimum spanning tree partitioning problem, and hence reduces the computational complexity of the region partitioning problem. The tree-partitioning algorithm, in essence, partitions a minimum spanning tree into subtrees, representing different homogeneous regions, by minimizing the sum of variations of gray levels over all subtrees under the constraints that each subtree should have at least a specified number of nodes, and two adjacent subtrees should have significantly different average gray-levels. Two (faster) heuristic implementations are also given for large-scale region partitioning problems. Test results have shown that the segmentation results are satisfactory and insensitive to noise.
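The general idea of turning an image into a minimum spanning tree and then splitting that tree into regions can be sketched as follows. This is a deliberately simplified stand-in: it cuts every tree edge whose gray-level difference exceeds a threshold, whereas the abstract describes an optimization with minimum-size and contrast constraints. The 4-connected pixel grid, the function name `mst_segments`, and the parameter `cut_threshold` are all assumptions made for illustration.

```python
def mst_segments(image, cut_threshold):
    """Label regions of a gray-level image by cutting heavy MST edges.

    Assumptions: pixels are 4-connected, the edge weight is the absolute
    gray-level difference, and the tree is split by a plain threshold
    (a simplification of the constrained partitioning in the abstract).
    """
    rows, cols = len(image), len(image[0])

    def idx(r, c):
        return r * cols + c

    # Build the grid graph: one edge to the right and one downward per pixel.
    edges = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                edges.append((abs(image[r][c] - image[r][c + 1]), idx(r, c), idx(r, c + 1)))
            if r + 1 < rows:
                edges.append((abs(image[r][c] - image[r + 1][c]), idx(r, c), idx(r + 1, c)))
    edges.sort()

    def make_find(parent):
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x
        return find

    # Kruskal's algorithm: keep an edge only if it joins two components.
    parent = list(range(rows * cols))
    find = make_find(parent)
    mst = []
    for w, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            mst.append((w, a, b))

    # Partition the tree: drop edges above the threshold, keep the rest.
    region = list(range(rows * cols))
    find_region = make_find(region)
    for w, a, b in mst:
        if w <= cut_threshold:
            region[find_region(a)] = find_region(b)

    # Relabel subtree roots as 0, 1, 2, ... in scan order.
    labels = {}
    return [[labels.setdefault(find_region(idx(r, c)), len(labels))
             for c in range(cols)] for r in range(rows)]


if __name__ == "__main__":
    for row in mst_segments([[10, 11, 50], [10, 12, 52]], cut_threshold=5):
        print(row)
```

On the small example, the two dark left columns and the bright right column end up in separate regions because the only tree edge joining them has a gray-level difference well above the threshold.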
Overview of PSB track on gene structure identification in large-scale genomic sequence
The recent funding of more than a dozen major genome centers to begin community-wide high-throughput sequencing of the human genome has created a significant new challenge for the computational analysis of DNA sequence and the prediction of gene structure and function. It has been estimated that on average from 1996 to 2003, approximately 2 million bases of newly finished DNA sequence will be produced every day and be made available on the Internet and in central databases. The finished (fully assembled) sequence generated each day will represent approximately 75 new genes (and their respective proteins), and many times this number will be represented in partially completed sequences. The information contained in these is of immeasurable value to medical research, biotechnology, the pharmaceutical industry and researchers in a host of fields ranging from microorganism metabolism, to structural biology, to bioremediation. Sequencing of microorganisms and other model organisms is also ramping up at a very rapid rate. The genomes for yeast and several microorganisms such as H. influenzae have recently been fully sequenced, although the significance of many genes remains to be determined.