DNA sequences classification and computation scheme based on the symmetry principle
DNA sequences containing multifarious novel symmetrical structures frequently play a crucial role in how genomes work. Here we present a new scheme for understanding the structural features and potential mathematical rules of symmetrical DNA sequences, using a method that combines stepwise classification and recursive computation. By defining the symmetry of DNA sequences, we classify all sequences and derive a series of recursive equations for computing the number of theoretically possible sequences in each class; moreover, the symmetries of typical sequences at different levels are analyzed. The classification and the quantitative relations demonstrate that DNA sequences have recursive and nested properties. The scheme may help clarify the formation and growth mechanisms of DNA sequences, because it can deduce information about the structure and quantity of longer sequences from that of shorter sequences by recursive rules. Our scheme may provide a new stepping stone to the theoretical characterization, as well as structural analysis, of DNA sequences.
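The abstract does not state its symmetry definitions or its recursive equations, so the following is only an illustrative sketch under one assumed definition: a sequence is "symmetrical" if it equals its own reverse complement. Under that assumption, every symmetrical sequence of length n arises by wrapping a symmetrical sequence of length n - 2 in a complementary base pair, which gives a simple recursion that can be checked against brute-force enumeration. All function names here are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def is_symmetric(seq):
    """Assumed symmetry: the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)

def count_by_enumeration(n):
    """Count symmetric sequences of length n by checking all 4**n strings."""
    return sum(is_symmetric("".join(p)) for p in product("ACGT", repeat=n))

def count_by_recursion(n):
    """Count symmetric sequences of length n from the count for length n - 2.

    A symmetric sequence of length n is x + core + complement(x), where core
    is symmetric with length n - 2 and x is any of the four bases.
    """
    if n == 0:
        return 1  # the empty sequence
    if n == 1:
        return 0  # no base is its own complement
    return 4 * count_by_recursion(n - 2)

if __name__ == "__main__":
    for n in range(7):
        print(n, count_by_enumeration(n), count_by_recursion(n))
```

The recursion reproduces the enumerated counts while only looking at shorter lengths, which is the kind of "longer from shorter" relation the abstract describes; the paper's actual classes and equations may differ.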
Encoding folding paths of RNA switches
RNA co-transcriptional folding has long been suspected to play an active role
in helping proper native folding of ribozymes and structured regulatory motifs
in mRNA untranslated regions. Yet, the underlying mechanisms and coding
requirements for efficient co-transcriptional folding remain unclear.
Traditional approaches have intrinsic limitations to dissect RNA folding paths,
as they rely on sequence mutations or circular permutations that typically
perturb both RNA folding paths and equilibrium structures. Here, we show that
exploiting sequence symmetries instead of mutations can circumvent this problem
by essentially decoupling folding paths from equilibrium structures of designed
RNA sequences. Using bistable RNA switches with symmetrical helices conserved
under sequence reversal, we demonstrate experimentally that native and
transiently formed helices can guide efficient co-transcriptional folding into
either long-lived structure of these RNA switches. Their folding path is
controlled by the order of helix nucleations and subsequent exchanges during
transcription, and may also be redirected by transient antisense interactions.
Hence, transient intra- and intermolecular base pair interactions can
effectively regulate the folding of nascent RNA molecules into different native
structures, provided limited coding requirements, as discussed from an
information theory perspective. This constitutive coupling between RNA
synthesis and RNA folding regulation may have enabled the early emergence of
autonomous RNA-based regulation networks.
Comment: 9 pages, 6 figures
A new integrated symmetrical table for genetic codes
Degeneracy is a salient feature of genetic codes, because there are more
codons than amino acids. The conventional table for genetic codes is unable
to illustrate the symmetrical nature of the genetic base codes. In
fact, because the conventional wisdom avoids the question, there is little
agreement as to whether this symmetrical nature even exists. A better
understanding of symmetry and an appreciation of its essential role in the
formation of the genetic code can improve our understanding of nature's coding
processes. Thus, it is worth formulating a new integrated symmetrical table for
genetic codes, which is presented in this paper. It could be very useful for
understanding the wobble hypothesis of Nobel laureate Crick: how one transfer
ribonucleic acid can recognize two or more synonymous codons, which is an
unsolved fundamental question in biological science.
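The degeneracy mentioned above can be made concrete with a short sketch. It does not reproduce the paper's symmetrical table, which the abstract does not show; it assumes the standard genetic code (NCBI translation table 1), written as the usual 64-letter string with bases ordered T, C, A, G. It counts how many codons encode each amino acid and which two-base prefixes are fully degenerate in the third position, the position the wobble hypothesis concerns.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1); assumed here, not taken
# from the paper. Codons run TTT, TTC, TTA, TTG, TCT, ... with the third base
# varying fastest; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid (or stop signal).
DEGENERACY = Counter(CODON_TABLE.values())

def fourfold_prefixes():
    """Return two-base prefixes whose four codons all encode the same residue."""
    prefixes = []
    for first, second in product(BASES, repeat=2):
        residues = {CODON_TABLE[first + second + third] for third in BASES}
        if len(residues) == 1:
            prefixes.append(first + second)
    return prefixes

if __name__ == "__main__":
    for aa, count in sorted(DEGENERACY.items(), key=lambda item: -item[1]):
        print(aa, count)
    print("fully degenerate prefixes:", fourfold_prefixes())
```

For the fully degenerate prefixes, the third base carries no information about the encoded residue.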
Computational methods for the discovery and analysis of genes and other functional DNA sequences
The need to automate genome analysis is a result of the tremendous amount of genomic data now available. A high-throughput DNA sequencing machine can run millions of sequencing reactions in parallel, and it is becoming faster and cheaper to sequence the entire genome of an organism. Public databases containing genomic data are growing exponentially, which increases the demand for intuitive automated methods of DNA analysis and subsequent gene identification. However, the complexity of gene organization makes automation a challenging task, and smart algorithm design and parallelization are necessary to perform accurate analyses in reasonable amounts of time. This work describes two such automated methods for the identification of novel genes within given DNA sequences. The first method uses negative selection patterns as an evolutionary rationale for identifying additional members of a gene family; as input it requires a known protein-coding gene in that family. The second method is a massively parallel data mining algorithm that searches a whole genome for inverted repeats (palindromic sequences) and identifies potential precursors of non-coding RNA genes. Both methods were validated successfully on the fully sequenced and well-studied plant species Arabidopsis thaliana. --Abstract, page iv
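To illustrate what searching for inverted repeats involves, here is a minimal single-threaded sketch. It is not the massively parallel algorithm described in the thesis, whose details the abstract does not give; the function name, the parameters `min_arm` and `max_loop`, and the tuple layout of the results are all assumptions made for this example. It looks for a left arm followed, after a short loop, by the reverse complement of that arm, which is the pattern a hairpin-forming RNA precursor would leave in the DNA.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def find_inverted_repeats(seq, min_arm=4, max_loop=6):
    """Naively scan seq for inverted repeats (hairpin-like patterns).

    Returns a list of (left_arm_start, arm_length, loop_length) tuples.
    Nested or overlapping hits are reported as-is, not merged.
    """
    hits = []
    n = len(seq)
    for left_end in range(n):              # last base of a candidate left arm
        for loop in range(max_loop + 1):   # unpaired bases between the arms
            left, right = left_end, left_end + loop + 1
            arm = 0
            # Extend outward while the two bases are complementary.
            while left >= 0 and right < n and COMPLEMENT.get(seq[left]) == seq[right]:
                arm += 1
                left -= 1
                right += 1
            if arm >= min_arm:
                hits.append((left + 1, arm, loop))
    return hits

if __name__ == "__main__":
    # Left arm GGATC, loop TTT, right arm GATCC (reverse complement of GGATC).
    example = "AC" + "GGATC" + "TTT" + "GATCC" + "CA"
    for start, arm, loop in find_inverted_repeats(example):
        print(start, arm, loop, example[start:start + 2 * arm + loop])
```

Each starting position is examined independently of the others, so a scan of this shape could be split across genome chunks; a whole-genome search would also need scoring for mismatches and filtering of low-complexity hits, which this sketch omits.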
Cell Rep
Adeno-associated virus (AAV) vectors have emerged as a gene-delivery platform with demonstrated safety and efficacy in a handful of clinical trials for monogenic disorders. However, limitations of the current generation vectors often prevent broader application of AAV gene therapy. Efforts to engineer AAV vectors have been hampered by a limited understanding of the structure-function relationship of the complex multimeric icosahedral architecture of the particle. To develop additional reagents pertinent to further our insight into AAVs, we inferred evolutionary intermediates of the viral capsid using ancestral sequence reconstruction. In-silico-derived sequences were synthesized de novo and characterized for biological properties relevant to clinical applications. This effort led to the generation of nine functional putative ancestral AAVs and the identification of Anc80, the predicted ancestor of the widely studied AAV serotypes 1, 2, 8, and 9, as a highly potent in vivo gene therapy vector for targeting liver, muscle, and retina.
Structural study of CUG-repeating small RNAs complexed with silencing suppressor P19
The study focuses on structural aspects of the interaction between the silencing suppressor p19 and CUG-repeating small RNAs. The work involves crystal structure determination of a protein-unbound RNA form and of RNA fragments of various lengths (19, 20, and 21 nucleotides) complexed with the p19 suppressor. The results demonstrate the ability of silencing suppressor p19 to bind CUG-repeating small RNAs and reveal features of U•U mismatches flanked by Watson-Crick C•G base pairs in the p19-bound and p19-unbound states. In addition, the structural data reveal a p19-specific site for anchoring extra nucleotides in small RNAs. In general, the study extends our knowledge of the mechanism of small RNA recognition by silencing suppressor p19.