43 research outputs found
Recommended from our members
Systematic Curation of miRBase Annotation Using Integrated Small RNA High-Throughput Sequencing Data for C. elegans and Drosophila
MicroRNAs (miRNAs) are a class of 20–23 nucleotide small RNAs that regulate gene expression post-transcriptionally in animals and plants. Annotation of miRNAs by the miRNA database (miRBase) has largely relied on computational approaches. As a result, many miRBase entries lack experimental validation, and discrepancies between miRBase annotation and actual miRNA sequences are often observed. In this study, we integrated the small RNA sequencing (smRNA-seq) datasets in Caenorhabditis elegans and Drosophila melanogaster and devised an analytical pipeline coupled with detailed manual inspection to curate miRNA annotation systematically in miRBase. Our analysis reveals that 19 (17.0%) and 51 (31.3%) miRNA entries with detectable smRNA-seq reads have mature sequence discrepancies in C. elegans and D. melanogaster, respectively. These discrepancies frequently occur either for conserved miRNA families whose mature sequences were predicted according to their homologous counterparts in other species or for miRNAs whose precursor miRNA (pre-miRNA) hairpins produce an abundance of multiple miRNA isoforms or variants. Our analysis shows that while Drosophila pre-miRNAs, on average, produce less than 60% accurate mature miRNA reads in addition to their 5′ and 3′ variant isoforms, the precision of miRNA processing in C. elegans is much higher, at over 90%. Based on the revised miRNA sequences, we analyzed expression patterns of the more conserved (MC) and less conserved (LC) miRNAs and found that, whereas MC miRNAs are often co-expressed at multiple developmental stages, LC miRNAs tend to be expressed specifically at fewer stages.
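The processing-precision figures quoted in this abstract can be read as the share of reads from a hairpin that match the annotated mature sequence exactly. The sketch below illustrates that reading only; it is not the authors' pipeline. The function names, the toy sequences, and the rule of flagging an entry when the most abundant isoform differs from the annotation are all assumptions made for illustration.

```python
from collections import Counter


def processing_precision(reads, annotated_mature):
    """Fraction of reads that match the annotated mature sequence exactly.

    `reads` is a list of read sequences already assigned to one hairpin arm
    (an assumption; read mapping is outside the scope of this sketch).
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return counts[annotated_mature] / total


def dominant_isoform(reads):
    """Most abundant read sequence, or None when there are no reads."""
    counts = Counter(reads)
    return counts.most_common(1)[0][0] if counts else None


# Toy data: a made-up 22-nt "mature" sequence and one 3' variant isoform.
annotated = "ACGUACGUACGUACGUACGUAC"
reads = [annotated] * 9 + [annotated[:-1]]

precision = processing_precision(reads, annotated)
# Hypothetical flagging rule: annotation disagrees with the dominant isoform.
has_discrepancy = dominant_isoform(reads) != annotated
print(f"precision={precision:.2f} discrepancy={has_discrepancy}")
```

In this toy case nine of ten reads match the annotation, so the entry would not be flagged under the assumed rule.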
Inference of transcriptional regulation in cancers
We developed an efficient and accurate computational framework, RABIT (regression analysis with background integration), and comprehensively integrated public transcription factor (TF)-binding profiles with TCGA tumor-profiling datasets in 18 cancer types. To systematically search for cancer-associated TFs, RABIT controls the effect of tumor-confounding factors on transcriptional regulation, such as copy number alteration, DNA methylation, and TF somatic mutation. Our predicted TF regulatory activity in tumors is highly consistent with the knowledge from cancer gene databases and reveals many previously unidentified cancer-associated TFs. We also analyzed RNA-binding protein regulation in cancer and demonstrated that RABIT is a general platform for predicting oncogenic gene expression regulators.
DiNuP: a systematic approach to identify regions of differential nucleosome positioning
Motivation: With the rapid development of high-throughput sequencing technologies, the genome-wide profiling of nucleosome positioning has become increasingly affordable. Many future studies will investigate the dynamic behaviour of nucleosome positioning in cells that have different states or that are exposed to different conditions. However, a robust method to effectively identify the regions of differential nucleosome positioning (RDNPs) has not been previously available. Results: We describe a novel computational approach, DiNuP, that compares nucleosome profiles generated by high-throughput sequencing under various conditions. DiNuP provides a statistical P-value for each identified RDNP based on the difference of read distributions. DiNuP also empirically estimates the false discovery rate as a cutoff when two samples have different sequencing depths and differentiates reliable RDNPs from the background noise. Evaluation of DiNuP showed it to be both sensitive and specific for the detection of changes in nucleosome location, occupancy and fuzziness. RDNPs that were identified using publicly available datasets revealed that nucleosome positioning dynamics are closely related to the epigenetic regulation of transcription. Availability and implementation: DiNuP is implemented in Python and is freely available at http://www.tongji.edu.cn/~zhanglab/DiNuP
MM-ChIP enables integrative analysis of cross-platform and between-laboratory ChIP-chip or ChIP-seq data
The ChIP-chip and ChIP-seq techniques enable genome-wide mapping of in vivo protein-DNA interactions and chromatin states. The cross-platform and between-laboratory variation poses a challenge to the comparison and integration of results from different ChIP experiments. We describe a novel method, MM-ChIP, which integrates information from cross-platform and between-laboratory ChIP-chip or ChIP-seq datasets. It improves both the sensitivity and the specificity of detecting ChIP-enriched regions, and is a useful meta-analysis tool for driving discoveries from multiple data sources.
Computational inference of mRNA stability from histone modification and transcriptome profiles
Histone modifications play important roles in regulating eukaryotic gene expression and have been used to model expression levels. Here, we present a regression model to systematically infer mRNA stability by comparing transcriptome profiles with ChIP-seq of H3K4me3, H3K27me3 and H3K36me3. The results from multiple human and mouse cell lines show that the inferred unstable mRNAs have significantly longer 3′ untranslated regions (UTRs) and more microRNA binding sites within the 3′ UTR than the inferred stable mRNAs. Regression residuals derived from RNA-seq, but not from GRO-seq, are highly correlated with the half-lives measured by pulse-labeling experiments, supporting the rationale of our inference. Whereas the functions enriched in the inferred stable and unstable mRNAs are consistent with those from pulse-labeling experiments, we found that the unstable mRNAs have higher cell-type specificity under functional constraint. We conclude that the systematic use of histone modifications can differentiate non-expressed mRNAs from unstable mRNAs, and distinguish stable mRNAs from highly expressed ones. In summary, we present the first computational model of mRNA stability inference that compares transcriptome and epigenome profiles, and provides an alternative strategy for directing experimental measurements.
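The residual-based idea in this abstract can be sketched as an ordinary least-squares fit of expression on histone-mark signals, with the residual per gene read as a stability hint. This is a minimal illustration, not the authors' model. The toy signal values, the helper names, the use of plain least squares, and the reading that a lower-than-predicted RNA level hints at faster decay are all assumptions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def regression_residuals(marks, expression):
    """Residuals of a least-squares fit: expression ~ intercept + marks."""
    design = [[1.0] + list(row) for row in marks]
    p = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, expression)) for i in range(p)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    return [y - f for y, f in zip(expression, fitted)]


# Toy data: columns are made-up H3K4me3, H3K27me3, H3K36me3 signals.
genes = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
marks = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
expression = [1 + 2 * a - b + 0.5 * c for a, b, c in marks]
expression[6] -= 1.2  # g7: RNA level lower than its chromatin marks suggest

residuals = regression_residuals(marks, expression)
candidate = genes[residuals.index(min(residuals))]
print("most negative residual:", candidate)
```

Solving the normal equations directly keeps the sketch dependency-free; for real data a numerically more stable least-squares routine from a linear algebra library would be a safer choice.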
Intestinal Master Transcription Factor CDX2 Controls Chromatin Access for Partner Transcription Factor Binding
Tissue-specific gene expression requires modulation of nucleosomes, allowing transcription factors to occupy cis elements that are accessible only in selected tissues. Master transcription factors control cell-specific genes and define cellular identities, but it is unclear if they possess special abilities to regulate cell-specific chromatin and if such abilities might underlie lineage determination and maintenance. One prevailing view is that several transcription factors enable chromatin access in combination. The homeodomain protein CDX2 specifies the embryonic intestinal epithelium, through unknown mechanisms, and partners with transcription factors such as HNF4A in the adult intestine. We examined enhancer chromatin and gene expression following Cdx2 or Hnf4a excision in mouse intestines. HNF4A loss did not affect CDX2 binding or chromatin, whereas CDX2 depletion modified chromatin significantly at CDX2-bound enhancers, disrupted HNF4A occupancy, and abrogated expression of neighboring genes. Thus, CDX2 maintains transcription-permissive chromatin, illustrating a powerful and dominant effect on enhancer configuration in an adult tissue. Similar, hierarchical control of cell-specific chromatin states is probably a general property of master transcription factors.
Sequence determinants of improved CRISPR sgRNA design
The CRISPR/Cas9 system has revolutionized mammalian somatic cell genetics. Genome-wide functional screens using CRISPR/Cas9-mediated knockout or dCas9 fusion-mediated inhibition/activation (CRISPRi/a) are powerful techniques for discovering phenotype-associated gene function. We systematically assessed the DNA sequence features that contribute to single guide RNA (sgRNA) efficiency in CRISPR-based screens. Leveraging the information from multiple designs, we derived a new sequence model for predicting sgRNA efficiency in CRISPR/Cas9 knockout experiments. Our model confirmed known features and suggested new features including a preference for cytosine at the cleavage site. The model was experimentally validated for sgRNA-mediated mutation rate and protein knockout efficiency. Tested on independent data sets, the model achieved significant results in both positive and negative selection conditions and outperformed existing models. We also found that the sequence preference for CRISPRi/a is substantially different from that for CRISPR/Cas9 knockout and propose a new model for predicting sgRNA efficiency in CRISPRi/a experiments. These results facilitate the genome-wide design of improved sgRNA for both knockout and CRISPRi/a studies.
Protein Kinase C α Is a Central Signaling Node and Therapeutic Target for Breast Cancer Stem Cells
The epithelial-mesenchymal transition program becomes activated during malignant progression and can enrich for cancer stem cells (CSCs). We report that inhibition of protein kinase C α (PKCα) specifically targets CSCs but has little effect on non-CSCs. The formation of CSCs from non-stem cells involves a shift from EGFR to PDGFR signaling and results in the PKCα-dependent activation of FRA1. We identified an AP-1 molecular switch in which c-FOS and FRA1 are preferentially utilized in non-CSCs and CSCs, respectively. PKCα and FRA1 expression is associated with the aggressive triple-negative breast cancers, and the depletion of FRA1 results in a mesenchymal-epithelial transition. Hence, identifying molecular features that shift between cell states can be exploited to target signaling components critical to CSCs.
National Cancer Institute (U.S.) (Grant P01-CA080111); National Institutes of Health (U.S.) (Grant R01-CA078461)