UNCLES: Method for the identification of genes differentially consistently co-expressed in a specific subset of datasets
Background: Collective analysis of the increasingly emerging gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Results: Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth-knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods, and of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets as it produces focused clusters that conform well to known biological facts.
Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. Conclusions: The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies. Funding: The National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0310-1004)
Modelling co-transcriptional cleavage in the synthesis of yeast pre-rRNA
In this paper we present a quantified model of the synthesis of pre-rRNAs in yeast. The chemical kinetics simulation software Dizzy has been used as both the modelling and simulation framework of our study. The simulations have been used to investigate the mechanism of co-transcriptional cleavage which can occur during the synthesis of pre-rRNAs. Throughout the paper we emphasise the strong role of experimental data both in shaping the model and in guiding the analysis which is carried out. Parameter estimation procedures have been used to fit the model to the data and we discuss the validation of the model against the available experimental data. Simulation based on Gillespie’s algorithm is considered to be the reference method for our analysis and a comparison with other simulators is reported. Finally, we define an extended model that relaxes one of the assumptions of the initial model
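As a rough illustration of the kind of stochastic simulation the abstract refers to, the following is a minimal sketch of Gillespie's direct method. It does not reproduce the pre-rRNA model or the Dizzy software: the two-reaction system (constant synthesis of a species X and first-order removal of X), the function name `gillespie_birth_death`, and the rate parameters `k_syn` and `k_deg` are all hypothetical choices made only to show the algorithm.

```python
import math
import random


def gillespie_birth_death(k_syn, k_deg, x0, t_end, seed=0):
    """Gillespie direct method for a toy system (hypothetical, not the paper's model).

    Reactions: 0 -> X with propensity k_syn, and X -> 0 with propensity k_deg * X.
    Returns the jump times and the molecule count after each jump.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        # Propensity of each reaction in the current state.
        a_syn = k_syn
        a_deg = k_deg * x
        a_total = a_syn + a_deg
        if a_total == 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a_total.
        t += -math.log(1.0 - rng.random()) / a_total
        if t > t_end:
            break
        # Pick the reaction with probability proportional to its propensity.
        if rng.random() * a_total < a_syn:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


if __name__ == "__main__":
    times, counts = gillespie_birth_death(k_syn=5.0, k_deg=0.1, x0=0, t_end=50.0, seed=1)
    print(len(times), "events; final count:", counts[-1])
```

For this toy system the long-run mean count is `k_syn / k_deg`, which gives a simple sanity check when comparing the output against a deterministic or approximate simulator.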
PIN domain of Nob1p is required for D-site cleavage in 20S pre-rRNA
Nob1p (Yor056c) is essential for processing of the 20S pre-rRNA to the mature 18S rRNA. It is part of a pre-40S ribosomal particle that is transported to the cytoplasm and subsequently cleaved at the 3' end of mature 18S rRNA (D-site). Nob1p is also reported to participate in proteasome biogenesis, and it was therefore unclear whether its primary activity is in ribosome synthesis. In this work, we describe a homology model of the PIN domain of Nob1p, which structurally mimics Mg(2+)-dependent exonucleases despite negligible similarity in primary sequence. Insights gained from this model were used to design a point mutation that was predicted to abolish the postulated enzymatic activity. Cells expressing Nob1p with this mutation failed to cleave the 20S pre-rRNA. This supports both the significance of the structural model and the idea that Nob1p is the long-sought D-site endonuclease
Structural basis for the binding of IRES RNAs to the head of the ribosomal 40S subunit
Some viruses exploit internal initiation for their propagation in the host cell. This type of initiation is facilitated by structured elements (internal ribosome entry site, IRES) upstream of the initiator AUG and requires only a reduced number of canonical initiation factors. Important examples are the IRESs of the virus family Dicistroviridae, which bind to the inter-subunit side of the small ribosomal 40S subunit and lead to the formation of elongation-competent 80S ribosomes without the help of any initiation factor. Here, we present a comprehensive functional and structural analysis of eukaryotic-specific ribosomal protein rpS25 in the context of this type of initiation and propose a structural model explaining the essential involvement of rpS25 in hijacking the ribosome
Getting Ready to Translate: Cytoplasmic Maturation of Eukaryotic Ribosomes
The ribosome is the 'universal ribozyme' that is responsible for the final step of decoding genetic information into proteins. While the function of the ribosome is being elucidated at the atomic level, in comparison, little is known regarding its assembly in vivo and intracellular transport. In contrast to prokaryotic ribosomes, the construction of eukaryotic ribosomes, which begins in the nucleolus, requires >200 evolutionarily conserved non-ribosomal trans-acting factors, which transiently associate with pre-ribosomal subunits at distinct assembly stages and perform specific maturation steps. Notably, pre-ribosomal subunits are transported to the cytoplasm in a functionally inactive state where they undergo maturation prior to entering translation. In this review, I will summarize our current knowledge of the eukaryotic ribosome assembly pathway with emphasis on cytoplasmic maturation events that render pre-ribosomal subunits translation competent
Extending bicluster analysis to annotate unclassified ORFs and predict novel functional modules using expression data
Background: Microarrays have the capacity to measure the expressions of thousands of genes in parallel over many experimental samples. The unsupervised classification technique of bicluster analysis has been employed previously to uncover gene expression correlations over subsets of samples with the aim of providing a more accurate model of the natural gene functional classes. This approach also has the potential to aid functional annotation of unclassified open reading frames (ORFs). Until now this aspect of biclustering has been under-explored. In this work we illustrate how bicluster analysis may be extended into a 'semi-supervised' ORF annotation approach referred to as BALBOA. Results: The efficacy of the BALBOA ORF classification technique is first assessed via cross validation and compared to a multi-class k-Nearest Neighbour (kNN) benchmark across three independent gene expression datasets. BALBOA is then used to assign putative functional annotations to unclassified yeast ORFs. These predictions are evaluated using existing experimental and protein sequence information. Lastly, we employ a related semi-supervised method to predict the presence of novel functional modules within yeast. Conclusion: In this paper we demonstrate how unsupervised classification methods, such as bicluster analysis, may be extended using available annotations to form semi-supervised approaches within the gene expression analysis domain. We show that such methods have the potential to improve upon supervised approaches and shed new light on the functions of unclassified ORFs and their co-regulation.
In vivo analysis of NHPX reveals a novel nucleolar localization pathway involving a transient accumulation in splicing speckles
The NHPX protein is a nucleolar factor that binds directly to a conserved RNA target sequence found in nucleolar box C/D snoRNAs and in U4 snRNA. Using enhanced yellow fluorescent protein (EYFP)– and enhanced cyan fluorescent protein–NHPX fusions, we show here that NHPX is specifically accumulated in both nucleoli and Cajal bodies (CBs) in vivo. The fusion proteins display identical localization patterns and RNA binding specificities to the endogenous NHPX. Analysis of a HeLa cell line stably expressing EYFP–NHPX showed that the nucleolar accumulation of NHPX was preceded by its transient accumulation in splicing speckles. Only newly expressed NHPX accumulated in speckles, and the nucleolar pool of NHPX did not interchange with the pool in speckles, consistent with a unidirectional pathway. The transient accumulation of NHPX in speckles prior to nucleoli was observed in multiple cell lines, including primary cells that lack CBs. Inhibitor studies indicated that progression of newly expressed NHPX from speckles to nucleoli was dependent on RNA polymerase II transcription, but not on RNA polymerase I activity. The data show a specific temporal pathway involving the sequential and directed accumulation of NHPX in distinct subnuclear compartments, and define a novel mechanism for nucleolar localization
Integrative analysis of the Trypanosoma brucei gene expression cascade predicts differential regulation of mRNA processing and unusual control of ribosomal protein expression
Background: Trypanosoma brucei is a unicellular parasite which multiplies in mammals (bloodstream form) and Tsetse flies (procyclic form). Trypanosome RNA polymerase II transcription is polycistronic, individual mRNAs being excised by trans splicing and polyadenylation. We previously made detailed measurements of mRNA half-lives in bloodstream and procyclic forms, and developed a mathematical model of gene expression for bloodstream forms. At the whole transcriptome level, many bloodstream-form mRNAs were less abundant than was predicted by the model. Results: We refined the published mathematical model and extended it to the procyclic form. We used the model, together with known mRNA half-lives, to predict the abundances of individual mRNAs, assuming rapid, unregulated mRNA processing; then we compared the results with measured mRNA abundances. Remarkably, the abundances of most mRNAs in procyclic forms are predicted quite well by the model, being largely explained by variations in mRNA decay rates and length. In bloodstream forms substantially more mRNAs are less abundant than predicted. We list mRNAs that are likely to show particularly slow or inefficient processing, either in both forms or with developmental regulation. We also measured ribosome occupancies of all mRNAs in trypanosomes grown in the same conditions as were used to measure mRNA turnover. In procyclic forms there was a weak positive correlation between ribosome density and mRNA half-life, suggesting cross-talk between translation and mRNA decay; ribosome density was related to the proportion of the mRNA on polysomes, indicating control of translation initiation. Ribosomal protein mRNAs in procyclics appeared to be exceptionally rapidly processed but poorly translated. Conclusions: Levels of mRNAs in procyclic form trypanosomes are determined mainly by length and mRNA decay, with some control of precursor processing. 
In bloodstream forms variations in nuclear events play a larger role in transcriptome regulation, suggesting acquisition of new control mechanisms during adaptation to mammalian parasitism
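To make concrete how a measured half-life can translate into a predicted abundance, the following is a generic first-order kinetics sketch. It is not the authors' published model: the symbols $M$ (mRNA copies per cell), $k_s$ (rate at which mature mRNA is produced), $k_d$ (decay rate constant) and $t_{1/2}$ (half-life) are assumptions introduced here, and the dependence on mRNA length mentioned in the abstract is not modelled.

```latex
\frac{dM}{dt} = k_s - k_d\,M,
\qquad
k_d = \frac{\ln 2}{t_{1/2}},
\qquad
M_{\mathrm{ss}} = \frac{k_s}{k_d} = \frac{k_s\,t_{1/2}}{\ln 2}
```

Under these assumptions, two mRNAs produced at the same rate differ in steady-state abundance in proportion to their half-lives, so an mRNA that is measurably less abundant than $M_{\mathrm{ss}}$ is a candidate for slow or inefficient processing.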
The mechanism and regulation of rRNA methylation by the Box C/D sRNP enzyme in solution
The biogenesis of the ribosome requires a series of essential modifications of ribosomal RNAs (rRNAs) and their precursor pre-rRNAs. The most abundant of such modifications is the methylation of the ribose 2′-OH, which occurs at over 100 rRNA sites in humans. rRNA methylation is known to increase the stability of the ribosome and to be required for accurate and efficient protein translation. While 2′-O methylation sites are known to cluster around the functional centres of the ribosome, the abundance of methylation at each site is known to vary, which may provide a mechanism to fine tune ribosomal function, creating specialized ribosome populations.
In eukaryotes and archaea, rRNA 2′-O methylation is mediated by Box C/D ribonucleoprotein particles (RNPs). These particles, referred to as small nucleolar RNPs (snoRNPs) in eukaryotes and small RNPs (sRNPs) in archaea, use a guide RNA to direct the methylation of a specific nucleotide on the substrate rRNA. In archaea, each small guide RNA (sRNA) is responsible for the methylation of two rRNA sites using two separate guide regions.
Despite several structures of archaeal Box C/D sRNPs being available, the molecular basis for the regulation of the enzyme and the consequent generation of varying methylation abundances across different rRNA sites remains elusive.
In order to understand the mechanism and regulation of the enzyme, I investigated the biochemical properties of archaeal Box C/D sRNPs reconstituted in vitro. Through a combination of biochemical and nuclear magnetic resonance (NMR)-based assays, I could show that archaeal RNPs catalyse the methylation of different substrate rRNA sites with varying degrees of efficiency and cooperativity. Furthermore, using low-resolution small angle scattering (SAS) techniques, I could show that addition of substrate RNAs onto some sRNPs is correlated with the complex undergoing a transition between different oligomeric and/or conformational states, thereby contextualising the multiple sRNP structures observed in previous studies.
In the second part of my work, I used a combination of distance restraints derived from NMR and low-resolution information from SAS to obtain the structures of an archaeal sRNP bound to either of its two substrate RNAs by an integrative structural biology approach. As this particle contains flexible regions, the work required the development of a novel algorithm capable of dealing with NMR/SAS signals arising from ensembles, rather than single conformers. Using this tool, I could derive the populations of conformers within ensembles of RNPs bound to different substrate RNAs, which provide a structural basis for the varying methylation efficiency of the enzyme.
Ultimately, the work presented here provides a model for understanding one of the mechanisms through which specialised ribosome populations are generated in vivo and contributes to the development of novel techniques for integrative structure modelling of flexible systems