4,952 research outputs found

    RNA SEQUENCE DETERMINANTS OF A COUPLED TERMINATION-REINITIATION STRATEGY FOR TRANSLATION OF DOWNSTREAM ORF IN HELMINTHOSPORIUM VICTORIAE VIRUS 190S AND OTHER VICTORIVIRUSES (FAMILY TOTIVIRIDAE)

    The double-stranded RNA fungal virus Helminthosporium victoriae virus 190S (genus Victorivirus, family Totiviridae) contains two large open reading frames (ORFs) that overlap in the tetranucleotide AUGA. Translation of the downstream ORF, which encodes the RNA-dependent RNA polymerase (RdRp), was previously proposed to depend on ribosomal reinitiation following termination of the upstream ORF, which encodes the capsid protein. In this study, I provided evidence to confirm that coupled termination-reinitiation (stop-restart) is indeed used. A dual-fluorescence method was established to define the RNA sequence determinants for RdRp translation. Stop-restart depends on a 32-nt stretch of RNA sequence immediately upstream of the AUGA motif, including a predicted pseudoknot structure. The presence of similar sequence motifs and predicted RNA structures in other victoriviruses suggests that they all share a related stop-restart strategy for RdRp translation. The close proximity of the secondary structure to the AUGA motif appears to be especially important for promoting translation of the downstream ORF. The normally strong preferences for an AUG start codon and a canonical sequence context for translation initiation of the downstream ORF appear somewhat relaxed. With the dual-fluorescence system, the reinitiation efficiency of the downstream ORF was determined to be ~3.9%. Swapping the HvV190S pseudoknot with those predicted from other victoriviruses showed that reinitiation from the downstream ORF of HvV190S is quite tolerant of variation in the primary sequence of the pseudoknot. Mutational analysis, introducing different combinations of nucleotide mutations into the pseudoknot stems, reproducibly confirmed the determinant role of the pseudoknot in reinitiation using two different experimental systems. Together, these results provide the first example of coupled termination-reinitiation regulated by a simple pseudoknot structure. 
These data expanded the understanding of the coupled termination-reinitiation mechanism employed by RNA viruses and refined a new model for the genus Victorivirus, the largest genus in the family Totiviridae. The dual-fluorescence system used in this study represented the first application of an efficient in vivo assay for recording low-frequency events in filamentous fungi.

    Computational Methods for Comparative Non-coding RNA Analysis: from Secondary Structures to Tertiary Structures

    Unlike messenger RNAs (mRNAs), whose information is encoded in their primary sequences, the cellular roles of non-coding RNAs (ncRNAs) originate from their structures. Studying structural conservation in ncRNAs is therefore important for an in-depth understanding of their functions. In past years, many computational methods have been proposed to analyze the common structural patterns in ncRNAs using comparative methods. However, RNA structural comparison is not a trivial task, and the existing approaches still have numerous issues in efficiency and accuracy. In this dissertation, we introduce a suite of novel computational tools that extend the classic models for ncRNA secondary and tertiary structure comparison. For RNA secondary structure analysis, we first developed a computational tool, named PhyloRNAalifold, to integrate phylogenetic information into consensus structural folding. The underlying idea of this algorithm is that the importance of a co-varying mutation should be determined by its position on the phylogenetic tree. By assigning high scores to the critical covariances, the prediction of RNA secondary structure can be made more accurate. Besides structure prediction, we also developed a computational tool, named ProbeAlign, to improve the efficiency of genome-wide ncRNA screening by using high-throughput RNA structural probing data. It treats the chemical reactivities embedded in the probing information as pairing attributes of the search targets. This approach avoids the time-consuming base-pair matching in secondary structure alignment. The application of ProbeAlign to the FragSeq datasets shows its capability for genome-wide ncRNA analysis. For RNA tertiary structure analysis, we first developed a computational tool, named STAR3D, to find global conservation in RNA 3D structures. STAR3D aims at finding the consensus of stacks by using 2D topology and 3D geometry together. 
Then, the loop regions can be ordered and aligned according to their relative positions in the consensus. This stack-guided alignment method brings the divide-and-conquer strategy to RNA 3D structural alignment, which has improved its efficiency dramatically. Furthermore, we clustered all loop regions in non-redundant RNA 3D structures to detect plausible RNA structural motifs de novo. The computational pipeline, named RNAMSC, was extended to handle large-scale PDB datasets, and thorough downstream analysis was performed to ensure that the clustering results are valid and easy to apply in further research. The final results contain many interesting variations of known motifs, such as the GNAA tetraloop, kink-turn, sarcin-ricin and T-loop motifs. We also discovered novel functional motifs that are conserved in a wide range of ncRNAs, including ribosomal RNA, sgRNA, SRP RNA, the GlmS riboswitch and the twister ribozyme.

    Origin and higher-level diversification of acariform mites – evidence from nuclear ribosomal genes, extensive taxon sampling, and secondary structure alignment

    Abstract. Background: Acariformes is the most species-rich and morphologically diverse radiation of chelicerate arthropods, known from the oldest terrestrial ecosystems. It is also a key lineage in understanding the evolution of this group, with the most vexing question being whether mites, or Acari (Parasitiformes and Acariformes), are monophyletic. Previous molecular studies recovered Acari either as monophyletic or non-monophyletic, albeit with limited taxon sampling. Similarly, relationships between basal acariform groups (including little-known, deep-soil 'endeostigmatan' mites) and major lineages of Acariformes (Sarcoptiformes, Prostigmata) are virtually unknown. We infer the phylogeny of chelicerate arthropods using a large and representative dataset comprising all main in- and outgroups (228 taxa). Basal diversity of Acariformes is particularly well sampled. With this dataset, we conduct a series of phylogenetically explicit tests of chelicerate and acariform relationships and present a phylogenetic framework for internal relationships of acariform mites. Results: Our molecular data strongly support a diphyletic Acari, with Acariformes as the sister group to Solifugae (PP = 1.0; BP = 100), the so-called Poecilophysidea. Among Acariformes, some representatives of the basal group Endeostigmata (mainly deep-soil mites) were recovered as sister groups to the remaining Acariformes (i.e., Trombidiformes and most of Sarcoptiformes). Desmonomatan oribatid mites (soil and litter mites) were recovered as the monophyletic sister group of Astigmata (e.g., stored product mites, house dust mites, mange mites, feather and fur mites). Trombidiformes (Sphaerolichida + Prostigmata) is strongly supported (PP = 1.0; BP = 98–100). Labidostommatina was inferred as the basal lineage of Prostigmata. Eleutherengona (e.g., spider mites) and Parasitengona (e.g., chiggers, fresh water mites) were recovered as monophyletic. By contrast, Eupodina (e.g., snout mites and relatives) was not. 
Marine mites (Halacaridae) were traditionally regarded as the sister group to Bdelloidea (Eupodina), but our analyses show their close relationship to Parasitengona. Conclusions: Non-trivial relationships recovered by our analyses with high support (i.e., basal arrangement of endeostigmatid lineages, the position of marine mites, polyphyly of Eupodina) had been proposed by previous underappreciated morphological studies. Thus, we update the currently accepted taxonomic classification to reflect these results: the superfamily Halacaroidea Murray, 1877 is moved from the infraorder Eupodina Krantz, 1978 to Anystina van der Hammen, 1972; and the subfamily Erythracarinae Oudemans, 1936 (formerly in Anystidae Oudemans, 1902) is elevated to family rank, Erythracaridae stat. ressur., leaving Anystidae only with the nominal subfamily. Our study also shows that a clade comprising early derivative Endeostigmata (Alycidae, Nanorchestidae, Nematalycidae, and maybe Alicorhagiidae) should be treated as a taxon with the same rank as Sarcoptiformes and Trombidiformes, and the scope of the superfamily Bdelloidea should be changed. Before turning those findings into nomenclatural changes, however, we consider that our study calls for (i) finding shared apomorphies of the early derivative Endeostigmata clade and the clade including the remaining Acariformes; (ii) a well-supported hypothesis for Alicorhagiidae placement; (iii) sampling the families Proterorhagiidae, Proteonematalycidae and Grandjeanicidae, not yet included in molecular analyses; (iv) undertaking a denser sampling of clades traditionally placed in Eupodina, Anystina (Trombidiformes) and Palaeosomata (Sarcoptiformes), since consensus networks and the Internode Certainty (IC) and IC All (ICA) indices indicate high levels of conflict in these tree regions. 
Our study shows that regions of ambiguous alignment may provide useful phylogenetic signal when secondary structure information is used to guide the alignment procedure, and it provides an R implementation of the Bayesian Relative Rates test. http://deepblue.lib.umich.edu/bitstream/2027.42/113097/1/12862_2015_Article_458.pd

    Distinctive mitochondrial genome of Calanoid copepod Calanus sinicus with multiple large non-coding regions and reshuffled gene order: Useful molecular markers for phylogenetic and population studies

    Abstract. Background: Copepods are highly diverse and abundant, resulting in extensive ecological radiation in marine ecosystems. Calanus sinicus dominates continental shelf waters in the northwest Pacific Ocean and plays an important role in the local ecosystem by linking primary production to higher trophic levels. A lack of effective molecular markers has hindered phylogenetic and population genetic studies concerning copepods. As they are genome-level informative, mitochondrial DNA sequences can be used as markers for population genetic studies and phylogenetic studies. Results: The mitochondrial genome of C. sinicus is distinct from other arthropods owing to the concurrence of multiple non-coding regions and a reshuffled gene arrangement. Further particularities in the mitogenome of C. sinicus include low A + T-content, symmetrical nucleotide composition between strands, abbreviated stop codons for several PCGs and extended lengths of the genes atp6 and atp8 relative to other copepods. The monophyletic Copepoda should be placed within the Vericrustacea. The close affinity between Cyclopoida and Poecilostomatoida suggests reassigning the latter as subordinate to the former. Monophyly of Maxillopoda is rejected. Within the alignment of 11 C. sinicus mitogenomes, there are 397 variable sites harbouring three 'hotspot' variable sites and three microsatellite loci. Conclusion: The occurrence of the circular subgenomic fragment during laboratory assays suggests that special caution should be taken when sequencing mitogenomes using long PCR. Such a phenomenon may provide additional evidence of mitochondrial DNA recombination, which appears to have been a prerequisite for shaping the present mitochondrial profile of C. sinicus during its evolution. 
The lack of synapomorphic gene arrangements among copepods has cast doubt on the utility of gene order as a useful molecular marker for deep phylogenetic analysis. However, mitochondrial genomic sequences have been valuable markers for resolving phylogenetic issues concerning copepods. The variable site maps of C. sinicus mitogenomes provide a solid foundation for population genetic studies.

    RNA structure analysis : algorithms and applications

    In this doctoral thesis, efficient algorithms for aligning RNA secondary structures and mining unknown RNA motifs are presented. The major contribution is a structure alignment algorithm that combines primary and secondary structure information to find the optimal alignment between two given structures, where one is either a pattern structure of a known motif or a real query structure and the other is a subject structure. Motivated by widely used algorithms for RNA folding, the proposed algorithm decomposes an RNA secondary structure into a set of atomic structural components that can be further organized in a tree model to capture the structural particularities. The novel structure alignment algorithm is implemented using dynamic programming techniques coupled with position-independent scoring matrices. The algorithm can find the optimal global and local alignments between two RNA secondary structures in quadratic time. When applied to searching a structure database, the algorithm can find similar RNA substructures and can therefore be used to identify functional RNA motifs. The algorithm has also been extended to handle position-dependent scoring matrices for the purpose of aligning multiple structures. All algorithms have been implemented in a package named RSmatch and applied to searching an mRNA UTR structure database and mining RNA motifs. The experimental results showed the high efficiency and effectiveness of the proposed techniques.

    Mining characteristic relations bind to RNA secondary structures

    The identification of RNA secondary structures has been among the most exciting recent developments in biology and medical science. It has been recognized that there is an abundance of functional structures with frameshifting, regulation of translation, and splicing functions. However, the inherent signal for secondary structures is weak and generally not straightforward due to complex interleaving substrings. This makes it difficult to explore their potential functions from various structure data. Our approach, based on a collection of predicted RNA secondary structures, allows us to efficiently capture interesting characteristic relations in RNA and bring out the top-ranked rules for specified association groups. Our results not only point to a number of interesting associations but also include a brief biological interpretation of them. The approach assists biologists in sorting out the most significant characteristic structure patterns and predicting structure-function relationships in RNA.

    Investigating the translation of Cobra1: canonical expression is alternatively initiated from a non-AUG codon

    COBRA1, cofactor of BRCA1, is a transcriptional regulator and a subunit of the negative elongation complex, also known as NELF-B. Although this protein was first designated as a cofactor of BRCA1 and hence acts accordingly, it was later found to elicit a battery of response genes overlapping those regulated by BRCA1 in the absence of BRCA1 itself. Cobra1 deletion is embryonic lethal and results in embryonic stem cell (ESC) differentiation independent of the typical pluripotency machinery. Moreover, it was found to have a role in suppressing tumor growth, and breast cancer patients with poor prognosis had decreased levels of COBRA1. Paradoxically, levels of COBRA1 were found to be elevated in some upper gastrointestinal tract tumors. Our understanding of the regulation of gene expression has been evolving as an important avenue for explaining the diversification of gene products. Alternative initiation of translation has been observed in many important genes and has been shown to produce different phenotypes. In some cases, the discovered protein isoforms are not generated from the classically recognized Kozak/ATG system (i.e., canonical initiation). Instead, their expression is initiated using a non-canonical mechanism resembling the viral internal ribosomal entry site (IRES) pathway. Generation of different protein isoforms has been linked to paradoxes in the functions of the associated genes. Among the different functions observed are resistance to degradation, altered cellular localization and regulation of different cell cycle phases. In this study we have substantiated the hypothesis that Cobra1 has two protein isoforms, which might be one of the possible reasons for the associated paradoxes. We have used in silico prediction analyses to verify that the 5' untranslated region (5'UTR) of Cobra1 has the required sequences and complex RNA structures for non-canonical initiation. 
We could also detect these isoforms in endogenous mouse tissues from different strains and ages. Finally, we were able to induce the expression of the two isoforms ex vivo and could still recognize the isoforms in FLAG-tag-based systems.

    Algorithms for RNA secondary structure analysis : prediction of pseudoknots and the consensus shapes approach

    Reeder J. Algorithms for RNA secondary structure analysis : prediction of pseudoknots and the consensus shapes approach. Bielefeld (Germany): Bielefeld University; 2007. Our understanding of the role of RNA has undergone a major change in the last decade. Once believed to be only a mere carrier of information and structural component of the ribosomal machinery in the advent of the genomic age, it is now clear that RNAs play a much more active role. RNAs can act as regulators and can have catalytic activity - roles previously only attributed to proteins. There is still much speculation in the scientific community as to what extent RNAs are responsible for the complexity in higher organisms which can hardly be explained with only proteins as regulators. In order to investigate the roles of RNA, it is therefore necessary to search for new classes of RNA. For those and already known classes, analyses of their presence in different species of the tree of life will provide further insight about the evolution of biomolecules and especially RNAs. Since RNA function often follows its structure, the need for computer programs for RNA structure prediction is an immanent part of this procedure. The secondary structure of RNA - the level of base pairing - strongly determines the tertiary structure. As the latter is computationally intractable and experimentally expensive to obtain, secondary structure analysis has become an accepted substitute. In this thesis, I present two new algorithms (and a few variations thereof) for the prediction of RNA secondary structures. The first algorithm addresses the problem of predicting a secondary structure from a single sequence including RNA pseudoknots. Pseudoknots have been shown to be functionally relevant in many RNA mediated processes. However, pseudoknots are excluded from considerations by state-of-the-art RNA folding programs for reasons of computational complexity. 
While folding a sequence of length n into unknotted structures requires O(n^3) time and O(n^2) space, finding the best structure including arbitrary pseudoknots has been proven to be NP-complete. Nevertheless, I demonstrate in this work that certain types of pseudoknots can be included in the folding process with only a moderate increase of computational cost. In analogy to protein coding RNA, where a conserved encoded protein hints at a similar metabolic function, structural conservation in RNA may give clues to RNA function and to finding of RNA genes. However, structure conservation is more complex to deal with computationally than sequence conservation. The method considered to be at least conceptually the ideal approach in this situation is the Sankoff algorithm. It simultaneously aligns two sequences and predicts a common secondary structure. Unfortunately, it is computationally rather expensive - O(n^6) time and O(n^4) space for two sequences, and for more than two sequences it becomes exponential in the number of sequences! Therefore, several heuristic implementations emerged in the last decade trying to make the Sankoff approach practical by introducing pragmatic restrictions on the search space. In this thesis, I propose to redefine the consensus structure prediction problem in a way that does not imply a multiple sequence alignment step. For a family of RNA sequences, my method explicitly and independently enumerates the near-optimal abstract shape space and predicts an abstract shape as the consensus for all sequences. For each sequence, it delivers the thermodynamically best structure which has this shape. The technique of abstract shapes analysis is employed here for a synoptic view of the suboptimal folding space. As the shape space is much smaller than the structure space, and identification of common shapes can be done in linear time (in the number of shapes considered), the method is essentially linear in the number of sequences. 
Evaluations show that the new method compares favorably with available alternatives.
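
The O(n^3) time and O(n^2) space quoted above for folding into unknotted structures can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes the simplest possible scoring (maximizing the number of base pairs, in the style of the Nussinov recursion) rather than a thermodynamic energy model, and the function name, the allowed pair set and the `min_loop` parameter are illustrative assumptions.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in seq.

    Illustrative base-pair maximization only; real folding programs
    minimize free energy. min_loop is the assumed minimum number of
    unpaired bases enclosed by a hairpin.
    """
    # Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch).
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j]: O(n^2) space.
    dp = [[0] * n for _ in range(n)]
    # Two loops over (i, j) and one over the split point k: O(n^3) time.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    # case 2: j pairs with k, splitting the interval in two
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

Because every pair (k, j) splits the interval into two independent subproblems, crossing pairs cannot be represented in this table, which is why pseudoknots fall outside the recursion and need the more expensive treatment the abstract describes.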