2,166 research outputs found

    No wisdom in the crowd: genome annotation at the time of big data - current status and future prospects

    Science and engineering rely on the accumulation and dissemination of knowledge to make discoveries and create new designs. Discovery-driven genome research rests on knowledge passed on via gene annotations. In response to the deluge of sequencing big data, standard annotation practice employs automated procedures that rely on majority rules. We argue this hinders progress through the generation and propagation of errors, leading investigators into blind alleys. More subtly, this inductive process discourages the discovery of novelty, which remains essential in biological research and reflects the nature of biology itself. Annotation systems, rather than being repositories of facts, should be tools that support multiple modes of inference. By combining deduction, induction and abduction, investigators can generate hypotheses when accurate knowledge is extracted from model databases. A key stance is to depart from ‘the sequence tells the structure tells the function’ fallacy, placing function first. We illustrate our approach with examples of critical or unexpected pathways, using MicroScope to demonstrate how tools can be implemented following the principles we advocate. We end with a challenge to the reader.

    Single amino acid substitutions in either YhjD or MsbA confer viability to 3-deoxy-d-manno-oct-2-ulosonic acid-depleted Escherichia coli

    The Escherichia coli K-12 strain KPM22, defective in synthesis of 3-deoxy-d-manno-oct-2-ulosonic acid (Kdo), is viable with an outer membrane (OM) composed predominantly of lipid IVA, a precursor of lipopolysaccharide (LPS) biosynthesis that lacks any glycosylation. To sustain viability, the presence of a second-site suppressor was proposed for transport of lipid IVA from the inner membrane (IM), thus relieving toxic side-effects of lipid IVA accumulation and providing sufficient amounts of LPS precursors to support OM biogenesis. We now report the identification of an arginine to cysteine substitution at position 134 of the conserved IM protein YhjD in KPM22 that acts as a compensatory suppressor mutation of the lethal ΔKdo phenotype. Further, the yhjD400 suppressor allele renders the LPS transporter MsbA dispensable for lipid IVA transmembrane trafficking. The independent derivation of a series of non-conditional KPM22-like mutants from the Kdo-dependent parent strain TCM15 revealed a second class of suppressor mutations localized to MsbA. Proline to serine substitutions at either residue 18 or 50 of MsbA relieved the Kdo growth dependence observed in the isogenic wild-type strain. The possible impact of these suppressor mutations on structure and function is discussed by means of a computationally derived threading model of MsbA.

    A Gene Encoding Arginyl-tRNA Synthetase Is Located in the Upstream Region of the lysA Gene in Brevibacterium lactofermentum: Regulation of argS-lysA Cluster Expression by Arginine

    The Brevibacterium lactofermentum argS gene, which encodes an arginyl-tRNA synthetase, was identified in the upstream region of the lysA gene. The cloned gene was sequenced; it encodes a 550-amino-acid protein with an Mr of 59,797. The deduced amino acid sequence showed 28% identical and 49% similar residues when compared with the sequence of the Escherichia coli arginyl-tRNA synthetase. The B. lactofermentum enzyme showed the highly conserved motifs of class I aminoacyl-tRNA synthetases. Expression of the argS gene in B. lactofermentum and E. coli resulted in an increase in aminoacyl-tRNA synthetase activity, correlated with the presence in sodium dodecyl sulfate-polyacrylamide gels of a clear protein band that corresponds to this enzyme. One single transcript of about 3,000 nucleotides and corresponding to the B. lactofermentum argS-lysA operon was identified. The transcription of these genes is repressed by lysine and induced by arginine, showing an interesting pattern of biosynthetic interlock between the pathways of both amino acids in corynebacteria.

    Subdivision of the helix-turn-helix GntR family of bacterial regulators in the FadR, HutC, MocR, and YtrA subfamilies

    Haydon and Guest (Haydon, D. J., and Guest, J. R. (1991) FEMS Microbiol. Lett. 63, 291-295) first described the helix-turn-helix GntR family of bacterial regulators. They presented them as transcription factors sharing a similar N-terminal DNA-binding (D-b) domain, but they observed near-maximal divergence in the C-terminal effector-binding and oligomerization (E-b/O) domain. To elucidate this C-terminal heterogeneity, structural, phylogenetic, and functional analyses were performed on a family that now comprises about 270 members. Our comparative study, which focused first on the C-terminal E-b/O domains and then on the DNA-binding domains and palindromic operator sequences, classified the GntR members into four subfamilies that we called FadR, HutC, MocR, and YtrA. Within each of these subfamilies, a degree of similarity of about 55% was observed throughout the entire sequence. Structure/function associations were highlighted, although they were not absolutely stringent. The consensus sequences deduced for the DNA-binding domain were slightly different for each subfamily, suggesting that fusions between the D-b and E-b/O domains have occurred separately, with each subfamily having its own D-b domain ancestor. Moreover, the compilation of the known or predicted palindromic cis-acting elements has highlighted different operator sequences according to our subfamily subdivision. The observed C-terminal E-b/O domain heterogeneity was therefore reflected in the DNA-binding domain and in the cis-acting elements, suggesting the existence of a tight link between the three regions involved in the regulating process.

    Tissue-Specific Differences in Human Transfer RNA Expression

    Over 450 transfer RNA (tRNA) genes have been annotated in the human genome. Reliable quantitation of tRNA levels in human samples using microarray methods presents a technical challenge. We have developed a microarray method to quantify tRNAs based on a fluorescent dye-labeling technique. The first-generation tRNA microarray consists of 42 probes for nuclear encoded tRNAs and 21 probes for mitochondrial encoded tRNAs. These probes cover tRNAs for all 20 amino acids and 11 isoacceptor families. Using this array, we report that the amounts of tRNA within the total cellular RNA vary widely among eight different human tissues. The brain expresses higher overall levels of nuclear encoded tRNAs than every tissue examined but one and higher levels of mitochondrial encoded tRNAs than every tissue examined. We found tissue-specific differences in the expression of individual tRNA species, and tRNAs decoding amino acids with similar chemical properties exhibited coordinated expression in distinct tissue types. Relative tRNA abundance exhibits a statistically significant correlation to the codon usage of a collection of highly expressed, tissue-specific genes in a subset of tissues or tRNA isoacceptors. Our findings demonstrate the existence of tissue-specific expression of tRNA species that strongly implicates a role for tRNA heterogeneity in regulating translation and possibly additional processes in vertebrate organisms.

    The use of information theory in evolutionary biology

    Information is a key concept in evolutionary biology. Information is stored in biological organisms' genomes, and used to generate the organism as well as to maintain and control it. Information is also "that which evolves". When a population adapts to a local environment, information about this environment is fixed in a representative genome. However, when an environment changes, information can be lost. At the same time, information is processed by animal brains to survive in complex environments, and the capacity for information processing also evolves. Here I review applications of information theory to the evolution of proteins as well as to the evolution of information processing in simulated agents that adapt to perform a complex task. Comment: 25 pages, 7 figures. To appear in "The Year in Evolutionary Biology", of the Annals of the NY Academy of Sciences.

    Computational and experimental approaches to chart the Escherichia coli cell-envelope-associated proteome and interactome

    The bacterial cell-envelope consists of a complex arrangement of lipids, proteins and carbohydrates that serves as the interface between a microorganism and its environment or, with pathogens, a human host. Escherichia coli has long been investigated as a leading model system to elucidate the fundamental mechanisms underlying microbial cell-envelope biology. This includes extensive descriptions of the molecular identities, biochemical activities and evolutionary trajectories of integral transmembrane proteins, many of which play critical roles in infectious disease and antibiotic resistance. Strikingly, however, only half of the c. 1200 putative cell-envelope-related proteins of E. coli currently have experimentally attributed functions, indicating an opportunity for discovery. In this review, we summarize the state of the art of computational and proteomic approaches for determining the components of the E. coli cell-envelope proteome, as well as exploring the physical and functional interactions that underlie its biogenesis and functionality. We also provide a comprehensive comparative benchmarking analysis on the performance of different bioinformatic and proteomic methods commonly used to determine the subcellular localization of bacterial proteins.

    The Escherichia coli transcriptome mostly consists of independently regulated modules

    Underlying cellular responses is a transcriptional regulatory network (TRN) that modulates gene expression. A useful description of the TRN would decompose the transcriptome into targeted effects of individual transcriptional regulators. Here, we apply unsupervised machine learning to a diverse compendium of over 250 high-quality Escherichia coli RNA-seq datasets to identify 92 statistically independent signals that modulate the expression of specific gene sets. We show that 61 of these transcriptomic signals represent the effects of currently characterized transcriptional regulators. Condition-specific activation of signals is validated by exposure of E. coli to new environmental conditions. The resulting decomposition of the transcriptome provides: a mechanistic, systems-level, network-based explanation of responses to environmental and genetic perturbations; a guide to gene and regulator function discovery; and a basis for characterizing transcriptomic differences in multiple strains. Taken together, our results show that signal summation describes the composition of a model prokaryotic transcriptome.