222 research outputs found

    The origin of genetic coding and translation in protocells

    The origin of the genetic code is a longstanding problem at the centre of biology. The code is the essential informational framework through which cells are able to self-replicate and evolve, but the process by which it emerged is unknown. One through-line in the literature is that the code may be an emergent property of the biochemistry of amino acid polymerization, but this is rarely stated explicitly. This thesis explores the chemistry of translation and investigates the idea that the polymerization mechanisms used by biology could occur spontaneously in mixtures of amino acids and nucleotides. The focus is the centrality of adenine nucleotides in translation. Initially, I utilise molecular dynamics simulations to explore the formation of ATP, a key reactant in translation, and link these results to prior laboratory studies. Using the same pipeline developed for these simulations, I find that adenine nucleotides readily self-organise in ways that should predispose the key chemical steps preceding amino acid polymerization to occur. However, the reason for adenine nucleotides’ universality in this process remains unclear and may be related either to their proclivity to form triphosphates or to other chemical properties related to reactivity. Similar investigative approaches are then turned towards biological information. Through literature review, extensive molecular dynamics simulations and NMR, I present evidence that fundamental biophysical properties of amino acids and nucleotides can bias their interactions in ways which reproduce the patterns and structure of the modern genetic code. These patterns are principally related to hydrophobicity, which is identified as a significant factor for interactions, and elevated affinities for cognate anticodonic nucleotides. Overall, this thesis describes how spontaneous reactions and interactions may have provided sufficient foundations for the genetic code, foundations which were then augmented by the evolution of the biological translational machinery but never fully replaced.

    The Balkan Definite Article and Pseudo-Second Position

    Proceedings of the Eighteenth Annual Meeting of the Berkeley Linguistics Society: General Session and Parasession on The Place of Morphology in a Grammar (1992), pp. 338-34

    A biophysical basis for the emergence of the genetic code in protocells

    The origin of the genetic code is an abiding mystery in biology. Hints of a 'code within the codons' suggest biophysical interactions, but these patterns have resisted interpretation. Here, we present a new framework, grounded in the autotrophic growth of protocells from CO2 and H2. Recent work suggests that the universal core of metabolism recapitulates a thermodynamically favoured protometabolism right up to nucleotide synthesis. Considering the genetic code in relation to an extended protometabolism allows us to predict most codon assignments. We show that the first letter of the codon corresponds to the distance from CO2 fixation, with amino acids encoded by the purines (G followed by A) being closest to CO2 fixation. These associations suggest a purine-rich early metabolism with a restricted pool of amino acids. The second position of the anticodon corresponds to the hydrophobicity of the amino acid encoded. We combine multiple measures of hydrophobicity to show that this correlation holds strongly for early amino acids but is weaker for later species. Finally, we demonstrate that redundancy at the third position is not randomly distributed around the code: non-redundant amino acids can be assigned based on size, specifically length. We attribute this to additional stereochemical interactions at the anticodon. These rules imply an iterative expansion of the genetic code over time with codon assignments depending on both distance from CO2 and biophysical interactions between nucleotide sequences and amino acids. In this way the earliest RNA polymers could produce non-random peptide sequences with selectable functions in autotrophic protocells.

    Biophysical Interactions Underpin the Emergence of Information in the Genetic Code

    The genetic code conceals a ‘code within the codons’, which hints at biophysical interactions between amino acids and their cognate nucleotides. Yet, research over decades has failed to corroborate systematic biophysical interactions across the code. Using molecular dynamics simulations and NMR, we have analysed interactions between the 20 standard proteinogenic amino acids and 4 RNA mononucleotides in 3 charge states. Our simulations show that 50% of amino acids bind best with their anticodonic middle base in the −1 charge state common to the backbone of RNA, while 95% of amino acids interact most strongly with at least 1 of their codonic or anticodonic bases. Preference for the cognate anticodonic middle base was greater than 99% of randomised assignments. We verify a selection of our results using NMR, and highlight challenges with both techniques for interrogating large numbers of weak interactions. Finally, we extend our simulations to a range of amino acids and dinucleotides, and corroborate similar preferences for cognate nucleotides. Despite some discrepancies between the predicted patterns and those observed in biology, the existence of weak stereochemical interactions means that random RNA sequences could template non-random peptides. This offers a compelling explanation for the emergence of genetic information in biology.

    A prebiotic basis for ATP as the universal energy currency

    ATP is universally conserved as the principal energy currency in cells, driving metabolism through phosphorylation and condensation reactions. Such deep conservation suggests that ATP arose at an early stage of biochemical evolution. Yet purine synthesis requires 6 phosphorylation steps linked to ATP hydrolysis. This autocatalytic requirement for ATP to synthesize ATP implies the need for an earlier prebiotic ATP equivalent, which could drive protometabolism before purine synthesis. Why this early phosphorylating agent was replaced, and specifically with ATP rather than other nucleoside triphosphates, remains a mystery. Here, we show that the deep conservation of ATP might reflect its prebiotic chemistry in relation to another universally conserved intermediate, acetyl phosphate (AcP), which bridges between thioester and phosphate metabolism by linking acetyl CoA to the substrate-level phosphorylation of ADP. We confirm earlier results showing that AcP can phosphorylate ADP to ATP at nearly 20% yield in water in the presence of Fe3+ ions. We then show that Fe3+ and AcP are surprisingly favoured. A wide range of prebiotically relevant ions and minerals failed to catalyse ADP phosphorylation. From a panel of prebiotic phosphorylating agents, only AcP, and to a lesser extent carbamoyl phosphate, showed any significant phosphorylating potential. Critically, AcP did not phosphorylate any other nucleoside diphosphate. We use these data, reaction kinetics, and molecular dynamics simulations to infer a possible mechanism. Our findings might suggest that the reason ATP is universally conserved across life is that its formation is chemically favoured in aqueous solution under mild prebiotic conditions.

    Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, and Interpreting Novel, Deep Branches in Marker Gene Phylogenetic Trees

    BACKGROUND: Most of our knowledge about the ancient evolutionary history of organisms has been derived from data associated with specific known organisms (i.e., organisms that we can study directly such as plants, metazoans, and culturable microbes). Recently, however, a new source of data for such studies has arrived: DNA sequence data generated directly from environmental samples. Such metagenomic data has enormous potential in a variety of areas including, as we argue here, in studies of very early events in the evolution of gene families and of species. METHODOLOGY/PRINCIPAL FINDINGS: We designed and implemented new methods for analyzing metagenomic data and used them to search the Global Ocean Sampling (GOS) expedition data set for novel lineages in three gene families commonly used in phylogenetic studies of known and unknown organisms: small subunit rRNA and the recA and rpoB superfamilies. Though the methods available could not accurately identify very deeply branched ss-rRNAs (largely due to difficulties in making robust sequence alignments for novel rRNA fragments), our analysis revealed the existence of multiple novel branches in the recA and rpoB gene families. Analysis of available sequence data likely from the same genomes as these novel recA and rpoB homologs was then used to further characterize the possible organismal source of the novel sequences. CONCLUSIONS/SIGNIFICANCE: Of the novel recA and rpoB homologs identified in the metagenomic data, some likely come from uncharacterized viruses while others may represent ancient paralogs not yet seen in any cultured organism. A third possibility is that some come from novel cellular lineages that are only distantly related to any organisms for which sequence data is currently available. If there exist any major, but so-far-undiscovered, deeply branching lineages in the tree of life, we suggest that methods such as those described herein currently offer the best way to search for them.

    Molecularly Defined and Spatially Resolved Cell Atlas of the Whole Mouse Brain

    In mammalian brains, millions to billions of cells form complex interaction networks to enable a wide range of functions. The enormous diversity and intricate organization of cells have impeded our understanding of the molecular and cellular basis of brain function. Recent advances in spatially resolved single-cell transcriptomics have enabled systematic mapping of the spatial organization of molecularly defined cell types in complex tissues (refs. 1-3), including several brain regions (for example, refs. 1-11). However, a comprehensive cell atlas of the whole brain is still missing. Here we imaged a panel of more than 1,100 genes in approximately 10 million cells across entire adult mouse brains using multiplexed error-robust fluorescence in situ hybridization (ref. 12) and performed spatially resolved, single-cell expression profiling at the whole-transcriptome scale by integrating multiplexed error-robust fluorescence in situ hybridization and single-cell RNA sequencing data. Using this approach, we generated a comprehensive cell atlas of more than 5,000 transcriptionally distinct cell clusters, belonging to more than 300 major cell types, in the whole mouse brain with high molecular and spatial resolution. Registration of this atlas to the mouse brain common coordinate framework allowed systematic quantifications of the cell-type composition and organization in individual brain regions. We further identified spatial modules characterized by distinct cell-type compositions and spatial gradients featuring gradual changes of cells. Finally, this high-resolution spatial map of cells, each with a transcriptome-wide expression profile, allowed us to infer cell-type-specific interactions between hundreds of cell-type pairs and predict the molecular (ligand-receptor) basis and functional implications of these cell-cell interactions. These results provide rich insights into the molecular and cellular architecture of the brain and a foundation for functional investigations of neural circuits and their dysfunction in health and disease.

    Nanoliter Reactors Improve Multiple Displacement Amplification of Genomes from Single Cells

    Since only a small fraction of environmental bacteria are amenable to laboratory culture, there is great interest in genomic sequencing directly from single cells. Sufficient DNA for sequencing can be obtained from one cell by the Multiple Displacement Amplification (MDA) method, thereby eliminating the need to develop culture methods. Here we used a microfluidic device to isolate individual Escherichia coli and amplify genomic DNA by MDA in 60-nl reactions. Our results confirm a report that reduced MDA reaction volume lowers nonspecific synthesis that can result from contaminant DNA templates and unfavourable interaction between primers. The quality of the genome amplification was assessed by qPCR and compared favourably to single-cell amplifications performed in standard 50-μl volumes. Amplification bias was greatly reduced in nanoliter volumes, thereby providing a more even representation of all sequences. Single-cell amplicons from both microliter and nanoliter volumes provided high-quality sequence data by high-throughput pyrosequencing, thereby demonstrating a straightforward route to sequencing genomes from single cells.