130 research outputs found

    The Triplet Genetic Code had a Doublet Predecessor

    Information theoretic analysis of genetic languages indicates that the naturally occurring 20 amino acids and the triplet genetic code arose by duplication of 10 amino acids of class II and a doublet genetic code having codons NNY and anticodons $\overleftarrow{\rm GNN}$. Evidence for this scenario is presented based on the properties of aminoacyl-tRNA synthetases, amino acids and nucleotide bases. Comment: 10 pages; (v2) expanded to include additional features, including likely relation to the operational code of the tRNA-acceptor stem. Version to be published in Journal of Theoretical Biology.

    Towards Understanding the Origin of Genetic Languages

    Molecular biology is a nanotechnology that works--it has worked for billions of years and in an amazing variety of circumstances. At its core is a system for acquiring, processing and communicating information that is universal, from viruses and bacteria to human beings. Advances in genetics and experience in designing computers have taken us to a stage where we can understand the optimisation principles at the root of this system, from the availability of basic building blocks to the execution of tasks. The languages of DNA and proteins are argued to be the optimal solutions to the information processing tasks they carry out. The analysis also suggests simpler predecessors to these languages, and provides fascinating clues about their origin. Obviously, a comprehensive unraveling of the puzzle of life would have a lot to say about what we may design or convert ourselves into. Comment: (v1) 33 pages, contributed chapter to "Quantum Aspects of Life", edited by D. Abbott, P. Davies and A. Pati; (v2) published version with some editing.

    Genetic Code: A New Understanding of Codon - Amino Acid Assignment

    This work shows that the 20 canonical amino acids (AAs) within the genetic code appear to form a complete system with strict AA positions; more precisely, with AA ordinal numbers in three variants: the first 00-19, the second 00-21 and the third 00-20. The ordinal number follows from the positions of the corresponding codons, i.e. their digrams (or doublets). The reading is a reading in the quaternary numbering system, with the four bases taking values within a specific logical square: A = 0, C = 1, G = 2, U = 3. On this basis, all splittings, distinctions and classifications of AAs appear to accord with atom and nucleon number balance as well as with other physico-chemical properties, such as hydrophobicity and polarity. Comment: 25 pages, 8 tables, 5 figures and 5 surveys. The paper is being submitted to GLASNIK of the Montenegrin Academy of Science and Arts as an extended version of a paper published in Ann. N.Y. Acad. Sci. 1048: 517-523 (2005).
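
The quaternary reading mentioned in this abstract can be illustrated with a minimal sketch. It assumes only the base values stated above (A = 0, C = 1, G = 2, U = 3) and reads a base string as a base-4 number, leftmost base most significant. The function name `quaternary_value` is hypothetical, and the sketch does not reproduce the paper's mapping from codon positions to amino acid ordinal numbers, which the abstract does not spell out.

```python
# Base values as given in the abstract: A = 0, C = 1, G = 2, U = 3.
BASE_VALUE = {"A": 0, "C": 1, "G": 2, "U": 3}


def quaternary_value(bases: str) -> int:
    """Read a string of bases as a base-4 number (hypothetical helper).

    The leftmost base is treated as the most significant digit; this digit
    order is an assumption made for illustration.
    """
    value = 0
    for base in bases.upper():
        value = value * 4 + BASE_VALUE[base]
    return value


if __name__ == "__main__":
    # The 16 possible doublets map onto the integers 0..15.
    for doublet in ("AA", "CG", "UU"):
        print(doublet, quaternary_value(doublet))
```

Under these assumptions the doublets AA, CG and UU read as 0, 6 and 15 respectively.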

    Codon Size Reduction as the Origin of the Triplet Genetic Code

    The genetic code appears to be optimized in its robustness to missense errors and frameshift errors. In addition, the genetic code is near-optimal in terms of its ability to carry information in addition to the sequences of encoded proteins. As evolution has no foresight, optimality of the modern genetic code suggests that it evolved from less optimal code variants. The length of codons in the genetic code is also optimal, as three is the minimal nucleotide combination that can encode the twenty standard amino acids. The apparent impossibility of transitions between codon sizes in a discontinuous manner during evolution has resulted in an unbending view that the genetic code was always triplet. Yet, recent experimental evidence on quadruplet decoding, as well as the discovery of organisms with ambiguous and dual decoding, suggest that the possibility of the evolution of triplet decoding from living systems with non-triplet decoding merits reconsideration and further exploration. To explore this possibility we designed a mathematical model of the evolution of primitive digital coding systems which can decode nucleotide sequences into protein sequences. These coding systems can evolve their nucleotide sequences via genetic events of Darwinian evolution, such as point-mutations. The replication rates of such coding systems depend on the accuracy of the generated protein sequences. Computer simulations based on our model show that decoding systems with codons of length greater than three spontaneously evolve into predominantly triplet decoding systems. Our findings suggest a plausible scenario for the evolution of the triplet genetic code in a continuous manner. 
    This scenario suggests an explanation of how protein synthesis could be accomplished by means of long RNA-RNA interactions prior to the emergence of the complex decoding machinery, such as the ribosome, that is required for stabilization and discrimination of otherwise weak triplet codon-anticodon interactions.
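
The abstract's statement that three is the minimal codon length able to encode twenty amino acids is a counting argument: with a four-letter alphabet, codons of length n give 4^n combinations, and 4^2 = 16 < 20 <= 64 = 4^3. The sketch below checks this arithmetic only; it is not the authors' evolutionary model, and the function name `min_codon_length` is hypothetical.

```python
def min_codon_length(n_symbols: int, alphabet_size: int = 4) -> int:
    """Smallest codon length whose combinations cover n_symbols.

    Pure counting: alphabet_size ** length must be at least n_symbols.
    """
    length = 1
    while alphabet_size ** length < n_symbols:
        length += 1
    return length


if __name__ == "__main__":
    # Twenty standard amino acids over the four nucleotides.
    print(min_codon_length(20))
```

Doublets suffice for at most 16 distinct meanings, which is why a doublet code could cover only a reduced amino acid set.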

    Identification and analysis of patterns in DNA sequences, the genetic code and transcriptional gene regulation

    The present cumulative work consists of six articles linked by the topic "Identification and Analysis of Patterns in DNA sequences, the Genetic Code and Transcriptional Gene Regulation". We have applied a binary coding to efficiently find patterns within nucleotide sequences. In the first and second parts of my work, a single bit is used to encode all four nucleotides. The three possibilities for a one-bit coding are: keto (G,U) vs. amino (A,C) bases, strong (G,C) vs. weak (A,U) bases, and purines (G,A) vs. pyrimidines (C,U). We found that the best patterns could be observed using the purine-pyrimidine coding. Applying this coding, we succeeded in finding a new representation of the genetic code, which has been published under the title "A New Classification Scheme of the Genetic Code" in "Journal of Molecular Biology" and "A Purine-Pyrimidine Classification Scheme of the Genetic Code" in "BIOForum Europe". This new representation makes it possible to reduce the common table of the genetic code from 64 to 32 fields while maintaining the same information content. It turned out that all known and even new patterns of the genetic code can easily be recognized in this new scheme. Furthermore, our new representation allows us to speculate about the origin and evolution of the translation machinery and the genetic code. Thus, we found a possible explanation for the contemporary codon-amino acid assignment and wide support for an early doublet code. Those explanations have been published in "Journal of Bioinformatics and Computational Biology" under the title "The New Classification Scheme of the Genetic Code, its Early Evolution, and tRNA Usage". Expecting to find these purine-pyrimidine patterns at the DNA level itself, we examined DNA binding sites for the occurrence of binary patterns. A comprehensive statistical survey of the largest class of restriction enzymes (type II) has shown a very distinctive purine-pyrimidine pattern.
    Moreover, we have observed a higher G+C content for the protein binding sequences. For both observations we have provided and discussed several explanations, published under the title "Common Patterns in Type II Restriction Enzyme Binding Sites" in "Nucleic Acids Research". The identified patterns may help to understand how a protein finds its binding site. In the last part of my work, two submitted articles about the analysis of Boolean functions are presented. Boolean functions are used for the description and analysis of complex dynamic processes and make it easier to find binary patterns within biochemical interaction networks. It is well known that not all functions are necessary to describe biologically relevant gene interaction networks. In the article entitled "Boolean Networks with Biologically Relevant Rules Show Ordered Behavior", submitted to "BioSystems", we have shown that the class of required Boolean functions can be strongly restricted. Furthermore, we calculated the exact number of hierarchically canalizing functions, which are known to be biologically relevant. In our work "The Decomposition Tree for Analysis of Boolean Functions", submitted to "Journal of Complexity", we introduced an efficient data structure for the classification and analysis of Boolean functions. This permits the recognition of biologically relevant Boolean functions in polynomial time.
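
The three one-bit codings named in this abstract (keto vs. amino, strong vs. weak, purine vs. pyrimidine) can be sketched as lookup tables. This is an illustration only, not the thesis's implementation: which class is written as 1 and which as 0 is an arbitrary assumption, the names `ONE_BIT_CODINGS` and `encode` are hypothetical, and T is treated as U so that DNA sequences can be passed in directly.

```python
# Base groupings as listed in the abstract; the choice of which class
# maps to 1 is an assumption made for illustration.
ONE_BIT_CODINGS = {
    "keto_amino": {"G": 1, "U": 1, "A": 0, "C": 0},         # keto = 1
    "strong_weak": {"G": 1, "C": 1, "A": 0, "U": 0},        # strong = 1
    "purine_pyrimidine": {"G": 1, "A": 1, "C": 0, "U": 0},  # purine = 1
}


def encode(sequence: str, coding: str = "purine_pyrimidine") -> str:
    """Return the sequence as a string of bits under one one-bit coding."""
    table = ONE_BIT_CODINGS[coding]
    # Treat DNA thymine as uracil so one table serves DNA and RNA input.
    rna = sequence.upper().replace("T", "U")
    return "".join(str(table[base]) for base in rna)


if __name__ == "__main__":
    site = "GAATTC"  # recognition sequence of the type II enzyme EcoRI
    for name in ONE_BIT_CODINGS:
        print(name, encode(site, name))
```

For the EcoRI site, the purine-pyrimidine coding gives `111000` under these assumptions, a run of purines followed by a run of pyrimidines.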