RBPSponge: genome-wide identification of lncRNAs that sponge RBPs
Long non-coding RNAs (lncRNAs) can act as molecular sponges or decoys for an RNA-binding protein (RBP) through their RBP-binding sites, thereby modulating the expression of all target genes of the corresponding RBP of interest. Here, we present a web tool named RBPSponge to explore lncRNAs based on their potential to act as a sponge for an RBP of interest. RBPSponge identifies the occurrences of RBP-binding sites and CLIP peaks on lncRNAs, and enables users to run statistical analyses to investigate the regulatory network between lncRNAs, RBPs and targets of RBPs.
Predicting clinical outcomes in neuroblastoma with genomic data integration
Background: Neuroblastoma is a heterogeneous disease with diverse clinical outcomes. Current risk group models require improvement as patients within the same risk group can still show variable prognosis. Recently collected genome-wide datasets provide opportunities to infer neuroblastoma subtypes in a more unified way. Within this context, data integration is critical as different molecular characteristics can contain complementary signals. To this end, we utilized the genomic datasets available for the SEQC cohort patients to develop supervised and unsupervised models that can predict disease prognosis.
Results: Our supervised model trained on the SEQC cohort can accurately predict overall survival and event-free survival profiles of patients in two independent cohorts. We also performed extensive experiments to assess the prediction accuracy for high-risk patients and patients without MYCN amplification. These results suggest that clinical endpoints can be predicted accurately across multiple cohorts. To explore the data in an unsupervised manner, we used an integrative clustering strategy named multi-view kernel k-means (MVKKM) that can effectively integrate multiple high-dimensional datasets with varying weights. We observed that integrating different gene expression datasets results in a better patient stratification compared to using these datasets individually. Also, our identified subgroups provide a better Cox regression model fit compared to the existing risk group definitions.
Conclusion: Altogether, our results indicate that integration of multiple genomic characterizations enables the discovery of subtypes that improve over existing definitions of risk groups. Effective prediction of survival times will have a direct impact on choosing the right therapies for patients.
Modeling the combined effect of RNA-binding proteins and microRNAs in post-transcriptional regulation
Recent studies show that RNA-binding proteins (RBPs) and microRNAs (miRNAs) function in coordination with each other to control post-transcriptional regulation (PTR). Despite this, the majority of research to date has focused on the regulatory effect of individual RBPs or miRNAs. Here, we mapped both RBP and miRNA binding sites on human 3′UTRs and utilized this collection to better understand PTR. We show that the transcripts that lack competition for HuR binding are destabilized more after HuR depletion. We also confirm this finding for PUM1(2) by measuring genome-wide expression changes following the knockdown of PUM1(2) in HEK293 cells. Next, to find potential cooperative interactions, we identified the pairs of factors whose sites co-localize more often than expected by random chance. Upon examining these results for PUM1(2), we found that transcripts where the sites of PUM1(2) and its interacting miRNA form a stem-loop are more stabilized upon PUM1(2) depletion. Finally, using dinucleotide frequency and counts of regulatory sites as features in a regression model, we achieved an AU-ROC of 0.86 in predicting mRNA half-life in BEAS-2B cells. Altogether, our results suggest that future studies of PTR must consider the combined effects of RBPs and miRNAs, as well as their interactions.
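As an illustration of the dinucleotide-frequency features mentioned in the abstract above, here is a minimal Python sketch. The function name `dinucleotide_frequencies`, the overlapping-window counting, and the handling of ambiguous bases are assumptions made for this example; the abstract does not describe how the authors computed their features.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA input is converted (T -> U)


def dinucleotide_frequencies(seq):
    """Return the relative frequency of each of the 16 dinucleotides.

    Hypothetical helper: counts overlapping dinucleotides and skips any
    pair containing a base outside A/C/G/U (e.g. N).
    """
    seq = seq.upper().replace("T", "U")
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if set(p) <= set(ALPHABET)]
    counts = Counter(pairs)
    total = len(pairs)
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(ALPHABET, repeat=2)
    }


if __name__ == "__main__":
    features = dinucleotide_frequencies("AUAU")
    print(features["AU"], features["UA"])
```

The resulting 16-value dictionary could be concatenated with per-factor site counts to form one feature vector per 3′UTR, which is the kind of input a regression model of the sort described would take.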
Systematic assessment of long-read RNA-seq methods for transcript identification and quantification
The Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. The consortium generated over 427 million long-read sequences from cDNA and direct RNA datasets, encompassing human, mouse, and manatee species, using different protocols and sequencing platforms. These data were utilized by developers to address challenges in transcript isoform detection and quantification, as well as de novo transcript isoform identification. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. When aiming to detect rare and novel transcripts or when using reference-free approaches, incorporating additional orthogonal data and replicate samples is advised. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.
Systematic assessment of long-read RNA-seq methods for transcript identification and quantification
The Long-read RNA-Seq Genome Annotation Assessment Project Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. Using different protocols and sequencing platforms, the consortium generated over 427 million long-read sequences from complementary DNA and direct RNA datasets, encompassing human, mouse and manatee species. Developers utilized these data to address challenges in transcript isoform detection, quantification and de novo transcript detection. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. Incorporating additional orthogonal data and replicate samples is advised when aiming to detect rare and novel transcripts or using reference-free approaches. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.
Modeling the combined effect of RNA-binding proteins and microRNAs in post-transcriptional control
Post-transcriptional regulation (PTR) controls gene expression between transcription and translation. Regulation at this level is carried out by the interactions of trans-acting RNA-binding proteins (RBPs) and microRNAs (miRNAs) with cis-regulatory elements in mRNA. The majority of previous work has focused on the effect of a single factor independent of other co-factors bound to the same mRNA. However, recent studies have shown that RBPs and miRNAs can act in cooperation or competition with each other. In this thesis, we mapped the binding sites of both RBPs and miRNAs on human 3′UTRs, and utilized this collection of binding sites to better understand PTR networks. We first focused on several RBPs and assessed how accessibility and conservation differ between experimentally supported sites and other sites that are only computationally predicted. We then investigated the competitive effects of other factors on HuR binding and the resulting transcript abundance change upon HuR depletion. Next, we characterized the potential interactions between the factors by finding those pairs of factors with co-occurrence of motifs higher than expected by chance. Our results show that PUM1 and PUM2 have potential cooperative interactions with miRNAs. Finally, we used logistic regression with features compiled from the counts of sites of factors and dinucleotide frequency to accurately predict the stability and steady-state abundance of mRNAs. Altogether, the results of this thesis suggest that studies of PTR must consider the effect of both RBPs and miRNAs, and their interactions. M.S. - Master of Science
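The thesis abstract above refers to motif co-occurrence "higher than expected by chance" without naming a statistical test. One common way to quantify such enrichment is a one-sided hypergeometric test, sketched below in Python. The choice of test, the function name `cooccurrence_pvalue`, and its parameters are assumptions made for illustration, not the method used in the thesis.

```python
from math import comb


def cooccurrence_pvalue(n_total, n_a, n_b, n_both):
    """One-sided hypergeometric p-value for co-occurrence of two factors.

    Hypothetical helper. Arguments:
      n_total: number of transcripts considered
      n_a:     transcripts with at least one site of factor A
      n_b:     transcripts with at least one site of factor B
      n_both:  transcripts with sites of both factors
    Returns P(X >= n_both) if the n_b transcripts were drawn at random.
    """
    upper = min(n_a, n_b)
    denom = comb(n_total, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_both, upper + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Toy numbers only, not data from the thesis.
    print(cooccurrence_pvalue(n_total=10, n_a=5, n_b=5, n_both=5))
```

A small p-value suggests that the two factors share transcripts more often than random placement would predict. In a screen across many factor pairs, the p-values would also need correction for multiple testing.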