
    UNCLES: Method for the identification of genes differentially consistently co-expressed in a specific subset of datasets

    Background: Collective analysis of the increasingly emerging gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Results: Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods and, of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets, as it produces focused clusters that conform well to known biological facts. Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. Conclusions: The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies. Funding: The National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0310-1004).

    Modelling co-transcriptional cleavage in the synthesis of yeast pre-rRNA

    In this paper we present a quantified model of the synthesis of pre-rRNAs in yeast. The chemical kinetics simulation software Dizzy has been used as both the modelling and simulation framework of our study. The simulations have been used to investigate the mechanism of co-transcriptional cleavage which can occur during the synthesis of pre-rRNAs. Throughout the paper we emphasise the strong role of experimental data both in shaping the model and in guiding the analysis which is carried out. Parameter estimation procedures have been used to fit the model to the data, and we discuss the validation of the model against the available experimental data. Simulation based on Gillespie’s algorithm is considered to be the reference method for our analysis, and a comparison with other simulators is reported. Finally, we define an extended model that relaxes one of the assumptions of the initial model.
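For readers unfamiliar with the reference method named in this abstract, the following is a minimal sketch of Gillespie's direct method applied to a toy two-reaction system. It is not the Dizzy model from the paper: the species names (`pre`, `cleaved`), the rate constants `k_syn` and `k_cleave`, and the helper `simulate_toy_model` are all hypothetical and chosen only for illustration.

```python
import random


def gillespie(propensities, updates, state, t_end, rng):
    """Gillespie direct method: returns a list of (time, state) snapshots.

    propensities: list of functions mapping the state dict to a reaction rate.
    updates: list of dicts giving the change in each species per reaction.
    """
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        rates = [a(state) for a in propensities]
        total = sum(rates)
        if total <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose which reaction fires, with probability rate / total.
        pick = rng.random() * total
        acc = 0.0
        chosen = len(rates) - 1
        for idx, rate in enumerate(rates):
            acc += rate
            if pick < acc:
                chosen = idx
                break
        for species, change in updates[chosen].items():
            state[species] += change
        trajectory.append((t, dict(state)))
    return trajectory


def simulate_toy_model(seed=0, t_end=50.0, k_syn=2.0, k_cleave=0.1):
    """Toy system (hypothetical): 0 -> pre at k_syn, pre -> cleaved at k_cleave * pre."""
    rng = random.Random(seed)
    propensities = [
        lambda s: k_syn,                # synthesis of a precursor
        lambda s: k_cleave * s["pre"],  # cleavage of a precursor
    ]
    updates = [
        {"pre": +1},
        {"pre": -1, "cleaved": +1},
    ]
    return gillespie(propensities, updates, {"pre": 0, "cleaved": 0}, t_end, rng)
```

Calling `simulate_toy_model(seed=1)` yields one stochastic trajectory; averaging many trajectories with different seeds approximates the mean behaviour that a deterministic simulator would report.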

    PIN domain of Nob1p is required for D-site cleavage in 20S pre-rRNA

    Nob1p (Yor056c) is essential for processing of the 20S pre-rRNA to the mature 18S rRNA. It is part of a pre-40S ribosomal particle that is transported to the cytoplasm and subsequently cleaved at the 3' end of mature 18S rRNA (D-site). Nob1p is also reported to participate in proteasome biogenesis, and it was therefore unclear whether its primary activity is in ribosome synthesis. In this work, we describe a homology model of the PIN domain of Nob1p, which structurally mimics Mg2+-dependent exonucleases despite negligible similarity in primary sequence. Insights gained from this model were used to design a point mutation that was predicted to abolish the postulated enzymatic activity. Cells expressing Nob1p with this mutation failed to cleave the 20S pre-rRNA. This supports both the significance of the structural model and the idea that Nob1p is the long-sought D-site endonuclease.

    Structural basis for the binding of IRES RNAs to the head of the ribosomal 40S subunit

    Some viruses exploit internal initiation for their propagation in the host cell. This type of initiation is facilitated by structured elements (internal ribosome entry sites, IRESs) upstream of the initiator AUG and requires only a reduced number of canonical initiation factors. An important example is the IRESs of the virus family Dicistroviridae, which bind to the inter-subunit side of the small ribosomal 40S subunit and lead to the formation of elongation-competent 80S ribosomes without the help of any initiation factor. Here, we present a comprehensive functional and structural analysis of the eukaryotic-specific ribosomal protein rpS25 in the context of this type of initiation and propose a structural model explaining the essential involvement of rpS25 in hijacking the ribosome.

    Getting Ready to Translate: Cytoplasmic Maturation of Eukaryotic Ribosomes

    The ribosome is the 'universal ribozyme' that is responsible for the final step of decoding genetic information into proteins. While the function of the ribosome is being elucidated at the atomic level, comparatively little is known regarding its assembly in vivo and its intracellular transport. In contrast to prokaryotic ribosomes, the construction of eukaryotic ribosomes, which begins in the nucleolus, requires >200 evolutionarily conserved non-ribosomal trans-acting factors, which transiently associate with pre-ribosomal subunits at distinct assembly stages and perform specific maturation steps. Notably, pre-ribosomal subunits are transported to the cytoplasm in a functionally inactive state, where they undergo maturation prior to entering translation. In this review, I will summarize our current knowledge of the eukaryotic ribosome assembly pathway with emphasis on cytoplasmic maturation events that render pre-ribosomal subunits translation competent.

    Extending bicluster analysis to annotate unclassified ORFs and predict novel functional modules using expression data

    Background: Microarrays have the capacity to measure the expression of thousands of genes in parallel over many experimental samples. The unsupervised classification technique of bicluster analysis has been employed previously to uncover gene expression correlations over subsets of samples with the aim of providing a more accurate model of the natural gene functional classes. This approach also has the potential to aid functional annotation of unclassified open reading frames (ORFs). Until now this aspect of biclustering has been under-explored. In this work we illustrate how bicluster analysis may be extended into a 'semi-supervised' ORF annotation approach referred to as BALBOA. Results: The efficacy of the BALBOA ORF classification technique is first assessed via cross validation and compared to a multi-class k-Nearest Neighbour (kNN) benchmark across three independent gene expression datasets. BALBOA is then used to assign putative functional annotations to unclassified yeast ORFs. These predictions are evaluated using existing experimental and protein sequence information. Lastly, we employ a related semi-supervised method to predict the presence of novel functional modules within yeast. Conclusion: In this paper we demonstrate how unsupervised classification methods, such as bicluster analysis, may be extended using available annotations to form semi-supervised approaches within the gene expression analysis domain. We show that such methods have the potential to improve upon supervised approaches and shed new light on the functions of unclassified ORFs and their co-regulation.

    RNA polymerase I complex structures elucidate mechanisms of transcription initiation and elongation


    In vivo analysis of NHPX reveals a novel nucleolar localization pathway involving a transient accumulation in splicing speckles

    The NHPX protein is a nucleolar factor that binds directly to a conserved RNA target sequence found in nucleolar box C/D snoRNAs and in U4 snRNA. Using enhanced yellow fluorescent protein (EYFP)– and enhanced cyan fluorescent protein–NHPX fusions, we show here that NHPX is specifically accumulated in both nucleoli and Cajal bodies (CBs) in vivo. The fusion proteins display identical localization patterns and RNA binding specificities to the endogenous NHPX. Analysis of a HeLa cell line stably expressing EYFP–NHPX showed that the nucleolar accumulation of NHPX was preceded by its transient accumulation in splicing speckles. Only newly expressed NHPX accumulated in speckles, and the nucleolar pool of NHPX did not interchange with the pool in speckles, consistent with a unidirectional pathway. The transient accumulation of NHPX in speckles prior to nucleoli was observed in multiple cell lines, including primary cells that lack CBs. Inhibitor studies indicated that progression of newly expressed NHPX from speckles to nucleoli was dependent on RNA polymerase II transcription, but not on RNA polymerase I activity. The data show a specific temporal pathway involving the sequential and directed accumulation of NHPX in distinct subnuclear compartments, and define a novel mechanism for nucleolar localization

    Integrative analysis of the Trypanosoma brucei gene expression cascade predicts differential regulation of mRNA processing and unusual control of ribosomal protein expression

    Background: Trypanosoma brucei is a unicellular parasite which multiplies in mammals (bloodstream form) and Tsetse flies (procyclic form). Trypanosome RNA polymerase II transcription is polycistronic, individual mRNAs being excised by trans splicing and polyadenylation. We previously made detailed measurements of mRNA half-lives in bloodstream and procyclic forms, and developed a mathematical model of gene expression for bloodstream forms. At the whole transcriptome level, many bloodstream-form mRNAs were less abundant than was predicted by the model. Results: We refined the published mathematical model and extended it to the procyclic form. We used the model, together with known mRNA half-lives, to predict the abundances of individual mRNAs, assuming rapid, unregulated mRNA processing; then we compared the results with measured mRNA abundances. Remarkably, the abundances of most mRNAs in procyclic forms are predicted quite well by the model, being largely explained by variations in mRNA decay rates and length. In bloodstream forms substantially more mRNAs are less abundant than predicted. We list mRNAs that are likely to show particularly slow or inefficient processing, either in both forms or with developmental regulation. We also measured ribosome occupancies of all mRNAs in trypanosomes grown in the same conditions as were used to measure mRNA turnover. In procyclic forms there was a weak positive correlation between ribosome density and mRNA half-life, suggesting cross-talk between translation and mRNA decay; ribosome density was related to the proportion of the mRNA on polysomes, indicating control of translation initiation. Ribosomal protein mRNAs in procyclics appeared to be exceptionally rapidly processed but poorly translated. Conclusions: Levels of mRNAs in procyclic form trypanosomes are determined mainly by length and mRNA decay, with some control of precursor processing. 
In bloodstream forms, variations in nuclear events play a larger role in transcriptome regulation, suggesting acquisition of new control mechanisms during adaptation to mammalian parasitism.

    The mechanism and regulation of rRNA methylation by the Box C/D sRNP enzyme in solution

    The biogenesis of the ribosome requires a series of essential modifications of ribosomal RNAs (rRNAs) and their precursor pre-rRNAs. The most abundant of such modifications is the methylation of the ribose 2′-OH, which occurs at over 100 rRNA sites in humans. rRNA methylation is known to increase the stability of the ribosome and to be required for accurate and efficient protein translation. While 2′-O-methylation sites are known to cluster around the functional centres of the ribosome, the abundance of methylation at each site is known to vary, which may provide a mechanism to fine-tune ribosomal function, creating specialized ribosome populations. In eukaryotes and archaea, rRNA 2′-O-methylation is mediated by Box C/D ribonucleoprotein particles (RNPs). These particles, referred to as small nucleolar RNPs (snoRNPs) in eukaryotes and small RNPs (sRNPs) in archaea, use a guide RNA to direct the methylation of a specific nucleotide on the substrate rRNA. In archaea, each small guide RNA (sRNA) is responsible for the methylation of two rRNA sites using two separate guide regions. Despite several structures of archaeal Box C/D sRNPs being available, the molecular basis for the regulation of the enzyme and the consequent generation of varying methylation abundances across different rRNA sites remains elusive. In order to understand the mechanism and regulation of the enzyme, I investigated the biochemical properties of archaeal Box C/D sRNPs reconstituted in vitro. Through a combination of biochemical and nuclear magnetic resonance (NMR)-based assays, I could show that archaeal RNPs catalyse the methylation of different substrate rRNA sites with varying degrees of efficiency and cooperativity. 
Furthermore, using low-resolution small angle scattering (SAS) techniques, I could show that addition of substrate RNAs onto some sRNPs is correlated with the complex undergoing a transition between different oligomeric and/or conformational states, thereby contextualising the multiple sRNP structures observed in previous studies. In the second part of my work, I used a combination of distance restraints derived from NMR and low-resolution information from SAS to obtain the structures of an archaeal sRNP bound to either of its two substrate RNAs by an integrative structural biology approach. As this particle contains flexible regions, the work required the development of a novel algorithm capable of dealing with NMR/SAS signals arising from ensembles, rather than single conformers. Using this tool, I could derive the populations of conformers within ensembles of RNPs bound to different substrate RNAs, which provide a structural basis for the varying methylation efficiency of the enzyme. Ultimately, the work presented here provides a model for understanding one of the mechanisms through which specialised ribosome populations are generated in vivo and contributes to the development of novel techniques for integrative structure modelling of flexible systems.