3,344 research outputs found

    Representative transcript sets for evaluating a translational initiation sites predictor

    Background: Translational initiation site (TIS) prediction is an important and actively studied topic in bioinformatics. A comparative analysis requires several benchmark data sets against which the effectiveness of different algorithms can be tested. An ideal benchmark data set should be reliable, representative and readily available. Preferably, the proteins encoded by members of the data set should also be representative of the protein population actually expressed in cellular specimens.
    Results: In this paper, we report a general algorithm for constructing a reliable sequence collection that only includes mRNA sequences whose corresponding protein products present an average profile of the general protein population of a given organism, with respect to three major structural parameters. Four representative transcript collections, each derived from a model organism, have been obtained following the algorithm we propose. Evaluation of these data sets shows that they are reasonable representations of the spectrum of proteins obtained from cellular proteomic studies. Six state-of-the-art predictors have been used to test the usefulness of the proposed construction algorithm. A comparative study of the predictors' performance on our data sets and on three existing benchmark collections demonstrates the merits of our data sets as benchmark testing collections.
    Conclusion: The proposed data set construction algorithm has proved to be a general and widely applicable scheme. Our comparison with published proteomic studies shows that expression of the transcripts in our data set generates a polypeptide population that is representative of that obtained from evaluation of biological specimens. Our data set thus represents "real world" transcripts that will allow more accurate evaluation of algorithms dedicated to identification of TISs, as well as other translational regulatory motifs within mRNA sequences. The proposed algorithm aims to compile a redundancy-free data set by removing redundant copies of homologous proteins. Such data sets may be useful for conducting statistical analyses of protein sequence-structure relations. At the current stage, our approach focuses on obtaining an "average" protein data set for any particular organism without introducing much selection bias. However, with the three major protein structural parameters deeply integrated into the scheme, it would be a trivial task to extend the current method to obtain a more selective protein data set, which may facilitate the study of a particular protein structure.

    Impact of Nonsense-Mediated mRNA Decay on the Global Expression Profile of Budding Yeast

    Nonsense-mediated mRNA decay (NMD) is a eukaryotic mechanism of RNA surveillance that selectively eliminates aberrant transcripts coding for potentially deleterious proteins. NMD also functions in the normal repertoire of gene expression. In Saccharomyces cerevisiae, hundreds of endogenous RNA Polymerase II transcripts achieve steady-state levels that depend on NMD. For some, the decay rate is directly influenced by NMD (direct targets). For others, abundance is NMD-sensitive but without any effect on the decay rate (indirect targets). To distinguish between direct and indirect targets, total RNA from wild-type (Nmd(+)) and mutant (Nmd(−)) strains was probed with high-density arrays across a 1-h time window following transcription inhibition. Statistical models were developed to describe the kinetics of RNA decay. 45% ± 5% of RNAs targeted by NMD were predicted to be direct targets with altered decay rates in Nmd(−) strains. Parallel experiments using conventional methods were conducted to empirically test predictions from the global experiment. The results show that the global assay reliably distinguished direct versus indirect targets. Different types of targets were investigated, including transcripts containing adjacent, disabled open reading frames, upstream open reading frames, and those prone to out-of-frame initiation of translation. Known targeting mechanisms fail to account for all of the direct targets of NMD, suggesting that additional targeting mechanisms remain to be elucidated. 30% of the protein-coding targets of NMD fell into two broadly defined functional themes: those affecting chromosome structure and behavior and those affecting cell surface dynamics. Overall, the results provide a preview for how expression profiles in multi-cellular eukaryotes might be impacted by NMD. 
Furthermore, the methods for analyzing decay rates on a global scale offer a blueprint for new ways to study mRNA decay pathways in any organism where cultured cell lines are available.
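The abstract above does not specify the statistical models used for decay kinetics. As a minimal sketch, assuming simple first-order decay after transcription shut-off (an assumption made here for illustration, not the authors' stated model), a decay rate and half-life can be estimated from a time course by fitting a line to log abundance. The function names are hypothetical.

```python
import math

def fit_decay_rate(times_min, abundances):
    """Estimate a first-order decay constant k (per minute).

    Assumes abundance(t) = A0 * exp(-k * t), so ln(abundance) is linear
    in t with slope -k. Uses an ordinary least-squares slope.
    """
    logs = [math.log(a) for a in abundances]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    return -sxy / sxx

def half_life(k):
    """Half-life in minutes for a first-order decay constant k."""
    return math.log(2) / k

# Synthetic time course over a 1-h window (values are made up for illustration).
times = [0, 10, 20, 30, 40, 50, 60]
wild_type = [100 * math.exp(-0.05 * t) for t in times]
mutant = [100 * math.exp(-0.02 * t) for t in times]

k_wt = fit_decay_rate(times, wild_type)
k_mut = fit_decay_rate(times, mutant)
```

Under this simplification, a direct target would show a lower decay constant in the Nmd(−) strain than in the Nmd(+) strain, while an indirect target would show similar constants in both.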

    Translational regulation of rhythmic and constitutive gene expression


    Development and Application of Next-Generation Sequencing Methods to Profile Cellular Translational Dynamics

    The transmission of genetic information from the transcription of DNA to RNA and the subsequent translation of RNA into protein is often abstracted into a linear process. However, as methods and technologies to measure the genomic, transcriptomic, and proteomic content of cells have advanced, so too has our understanding that the transmission of genetic information does not always flow in a lossless manner. For instance, changes observed in messenger RNA (mRNA) abundance are not always retained at the proteomic level. Indeed, a diverse array of mechanisms have been identified that exert regulatory control over this transmission of information. Next-generation short read sequencing has driven many of these insights and provided increasingly nuanced understanding of these regulatory mechanisms. However, the continued development and application of sequencing methodologies and analytics are required to properly contextualize many of these insights on a more global scale. Ribosome profiling is one such recent advancement which enriches for ribosome-protected fragments of mRNA; sequencing and analysis of these ribosome-protected mRNA fragments enables profiling of the translational content of a sample. The aim of this dissertation is to address the need for the development and application of statistical and analytical algorithms to profile the regulatory factors that contribute to the translational dynamics in cells. In the first chapter, I survey the development and application of next-generation sequencing methods for the profiling and computational analysis of translation and translational dynamics. In the second chapter of this thesis, I present SPECtre, a software package that identifies regions of active translation through measurement of the translational engagement of ribosomes over a transcript. 
SPECtre achieves high sensitivity and specificity in its classification of regions undergoing translation by leveraging the codon-dependent elongation of peptides; this tri-nucleotide periodicity is evident in the alignment of ribosome profiling sequence reads to a reference transcriptome. SPECtre classifies actively translated transcripts according to their coherence in read coverage over a region to an optimal tri-nucleotide signal. In the third chapter, I describe the application of SPECtre to identify the translation of upstream-initiated open-reading frames that may regulate differentiation in a neuron-like cell model. uORFs are transcripts that result from the initiation of translation from AUG, and under certain biological constraints, from non-AUG sequences localized in the 5’ untranslated regions of annotated protein-coding genes. Subsets of these uORFs have been implicated in the regulation of their downstream protein-coding genes in yeast, mice and humans. In this chapter, I provide further evidence for this regulation as well as the spatial context for the functional consequences of uORF translation on downstream protein-coding genes in a neuron-like cell line model of differentiation. Finally, in the fourth chapter, I outline a strategy using our coherence-based translational scoring algorithm to profile ribosomal engagement over chimeric gene fusion breakpoints in prostate cancer. Here, known breakpoints from current annotation databases are integrated with novel junctions nominated by existing whole genome and transcriptomic gene fusion detection algorithms, and the translational profile over these chimeric junctions using SPECtre is measured. This provides an additional layer of translational evidence to known and novel gene fusion breakpoints in prostate cancer. 
Ongoing development of a database and visualization platform based on these results will enable integrative insights into the transcriptional and translational topology of these breakpoints.
    PhD, Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/144106/1/stonyc_1.pd

    The RNA-binding protein LARP1 as potential biomarker and therapeutic target in ovarian cancer

    Ovarian cancer is the most lethal gynaecological malignancy, responsible for over 4,000 deaths each year in the UK. There is growing evidence that mRNA-binding proteins (RBPs) can be post-transcriptional drivers of cancer progression. Here, I investigated the expression of the RBP LARP1 in ovarian malignancies and the role of the protein in ovarian cancer cell biology. LARP1 is highly expressed at both the mRNA and protein level in ovarian cancers compared with benign tumours and normal ovarian tissue. I show that higher levels of LARP1 in tumour tissue are predictive of poor patient survival. Consistent with this clinical finding, in xenograft studies knockdown of LARP1 expression causes a dramatic reduction in tumour growth. In vitro, LARP1 knockdown is associated with increased apoptosis, and is sufficient to restore platinum sensitivity in chemotherapy-resistant cell lines. Furthermore, LARP1 is required to maintain cancer stem cell marker-positive populations, and knockdown decreases tumour-initiating potential, as demonstrated by in vivo limiting dilution assays. Transcriptome deep-sequencing following LARP1 knockdown revealed altered expression of multiple genes linked to survival and evasion of apoptosis, including BCL2 and BIK. Transcripts of both genes are in complex with LARP1 protein, and LARP1 maintains the stability of BCL2 mRNA, whilst actively destabilising BIK transcripts. This effect is mediated at the level of the 3’ untranslated region. I therefore conclude that by differentially regulating mRNA stability, LARP1 is a key post-transcriptional driver of tumourigenicity and cell survival in ovarian cancer.

    Loss of Nmp4 optimizes osteogenic metabolism and secretion to enhance bone quality

    A goal of osteoporosis therapy is to restore lost bone with structurally sound tissue. Mice lacking the transcription factor Nuclear Matrix Protein 4 (Nmp4, Zfp384, Ciz, ZNF384) respond to several classes of osteoporosis drugs with enhanced bone formation compared to wild type (WT) animals. Nmp4-/- mesenchymal stem/progenitor cells (MSPCs) exhibit accelerated and enhanced mineralization during osteoblast differentiation. To address the mechanisms underlying this hyper-anabolic phenotype, we carried out RNA-sequencing and molecular and cellular analyses of WT and Nmp4-/- MSPCs during osteogenesis to define pathways and mechanisms associated with elevated matrix production. We determined that Nmp4 has a broad impact on the transcriptome during osteogenic differentiation, contributing to the expression of over 5,000 genes. Phenotypic anchoring of transcriptional data was performed for the hypothesis-testing arm through analysis of cell metabolism, protein synthesis and secretion, and bone material properties. Mechanistic studies confirmed that Nmp4-/- MSPCs exhibited an enhanced capacity for glycolytic conversion, a key step in bone anabolism. Nmp4-/- cells showed elevated collagen translation and secretion. Expression of matrix genes that contribute to bone material-level mechanical properties was elevated in Nmp4-/- cells, an observation that was supported by biomechanical testing of bone samples from Nmp4-/- and WT mice. We conclude that loss of Nmp4 increases the magnitude of glycolysis upon the metabolic switch, which fuels the conversion of the osteoblast into a super-secretor of matrix, resulting in more bone with improvements in intrinsic quality.

    Protein-coding gene promoters in Methanocaldococcus (Methanococcus) jannaschii

    Although Methanocaldococcus (Methanococcus) jannaschii was the first archaeon to have its genome sequenced, little is known about the promoters of its protein-coding genes. To expand our knowledge, we have experimentally identified 131 promoters for 107 protein-coding genes in this genome by mapping their transcription start sites. Compared to previously identified promoters, more than half of which are from genes for stable RNAs, the protein-coding gene promoters are qualitatively similar in overall sequence pattern, but statistically different at several positions due to greater variation among their sequences. Relative binding affinity for general transcription factors was measured for 12 of these promoters by competition electrophoretic mobility shift assays. These promoters bind the factors less tightly than do most tRNA gene promoters. When a position weight matrix (PWM) was constructed from the protein gene promoters, factor binding affinities correlated with corresponding promoter PWM scores. We show that the PWM based on our data predicts promoters and transcription start sites in the genome more accurately than was possible with the previously available data. We also introduce a PWM logo, which visually displays the implications of observing a given base at a position in a sequence.
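For readers unfamiliar with PWM scoring, the following is a minimal sketch of the general technique: a log-odds matrix is built from aligned sites and used to scan a sequence. It is not the authors' implementation; the example sites, the uniform background of 0.25, the pseudocount, and all function names are assumptions chosen for illustration.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned DNA sites.

    Each column maps a base to log2(frequency / background), with a
    pseudocount added so unseen bases do not produce log(0).
    """
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((c / total) / background)
                    for b, c in counts.items()})
    return pwm

def score(pwm, seq):
    """Sum the per-position log-odds scores for a window of PWM width."""
    return sum(col[base] for col, base in zip(pwm, seq))

def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(pwm)
    hits = [(score(pwm, sequence[i:i + width]), i)
            for i in range(len(sequence) - width + 1)]
    return max(hits)

# Hypothetical aligned TATA-like sites, invented for this example.
sites = ["TTTATATA", "TTTATATA", "ATTATATA", "TTTAAATA"]
pwm = build_pwm(sites)
top_score, top_start = best_hit(pwm, "GGCGCTTTATATAGGCG")
```

Here `top_start` is the offset of the window that best matches the matrix. Higher scores indicate closer agreement with the aligned sites, which is the quantity the abstract reports as correlating with factor binding affinity.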