29 research outputs found

    GUIDEseq: a bioconductor package to analyze GUIDE-Seq datasets for CRISPR-Cas nucleases

    BACKGROUND: Genome editing technologies developed around the CRISPR-Cas9 nuclease system have facilitated the investigation of a broad range of biological questions. These nucleases also hold tremendous promise for treating a variety of genetic disorders. In the context of their therapeutic application, it is important to identify the spectrum of genomic sequences that are cleaved by a candidate nuclease when programmed with a particular guide RNA, as well as the cleavage efficiency of these sites. Powerful new experimental approaches, such as GUIDE-seq, facilitate the sensitive, unbiased genome-wide detection of nuclease cleavage sites within the genome. Flexible bioinformatics analysis tools for processing GUIDE-seq data are needed. RESULTS: Here, we describe an open source, open development software suite, GUIDEseq, for GUIDE-seq data analysis and annotation as a Bioconductor package in R. The GUIDEseq package provides a flexible platform with more than 60 adjustable parameters for the analysis of datasets associated with custom nuclease applications. These parameters allow data analysis to be tailored to different nuclease platforms with different length and complexity in their guide and PAM recognition sequences or their DNA cleavage position. They also enable users to customize sequence aggregation criteria, and vary peak calling thresholds that can influence the number of potential off-target sites recovered. GUIDEseq also annotates potential off-target sites that overlap with genes based on genome annotation information, as these may be the most important off-target sites for further characterization. In addition, GUIDEseq enables the comparison and visualization of off-target site overlap between different datasets for a rapid comparison of different nuclease configurations or experimental conditions. 
For each identified off-target, the GUIDEseq package outputs the mapped GUIDE-seq read count as well as a cleavage score from a user-specified off-target cleavage score prediction algorithm, permitting the identification of genomic sequences with unexpected cleavage activity. CONCLUSION: The GUIDEseq package enables analysis of GUIDE-seq data from various nuclease platforms for any species with a defined genomic sequence. This software package has been used successfully to analyze several GUIDE-seq datasets. The software, source code and documentation are freely available at http://www.bioconductor.org/packages/release/bioc/html/GUIDEseq.html

    An improved zebrafish transcriptome annotation for sensitive and comprehensive detection of cell type-specific genes

    The zebrafish is ideal for studying embryogenesis and is increasingly applied to model human disease. In these contexts, RNA-sequencing (RNA-seq) provides mechanistic insights by identifying transcriptome changes between experimental conditions. Application of RNA-seq relies on accurate transcript annotation for a genome of interest. Here, we find discrepancies in analysis from RNA-seq datasets quantified using Ensembl and RefSeq zebrafish annotations. These issues were due, in part, to variably annotated 3' untranslated regions and thousands of gene models missing from each annotation. Since these discrepancies could compromise downstream analyses and biological reproducibility, we built a more comprehensive zebrafish transcriptome annotation that addresses these deficiencies. Our annotation improves detection of cell type-specific genes in both bulk and single cell RNA-seq datasets, where it also improves resolution of cell clustering. Thus, we demonstrate that our new transcriptome annotation can outperform existing annotations, providing an important resource for zebrafish researchers

    Simultaneous generation of many RNA-seq libraries in a single reaction

    Although RNA-seq is a powerful tool, the considerable time and cost associated with library construction has limited its utilization for various applications. RNAtag-Seq, an approach to generate multiple RNA-seq libraries in a single reaction, lowers time and cost per sample, and it produces data on prokaryotic and eukaryotic samples that are comparable to those generated by traditional strand-specific RNA-seq approaches

    Dietary suppression of MHC-II expression in intestinal stem cells enhances intestinal tumorigenesis

    Little is known about how interactions between diet, immune recognition, and intestinal stem cells (ISCs) impact the early steps of intestinal tumorigenesis. Here, we show that a high fat diet (HFD) reduces the expression of the major histocompatibility complex II (MHC-II) genes in ISCs. This decline in ISC MHC-II expression in a HFD correlates with an altered intestinal microbiome composition and is recapitulated in antibiotic treated and germ-free mice on a control diet. Mechanistically, pattern recognition receptor and IFNg signaling regulate MHC-II expression in ISCs. Although MHC-II expression on ISCs is dispensable for stem cell function in organoid cultures in vitro, upon loss of the tumor suppressor gene Apc in a HFD, MHC-II- ISCs harbor greater in vivo tumor-initiating capacity than their MHC-II+ counterparts, thus implicating a role for epithelial MHC-II in suppressing tumorigenesis. Finally, ISC-specific genetic ablation of MHC-II in engineered Apc-mediated intestinal tumor models increases tumor burden in a cell autonomous manner. These findings highlight how a HFD alters the immune recognition properties of ISCs through the regulation of MHC-II expression in a manner that could contribute to intestinal tumorigenesis

    The Cellular EJC Interactome Reveals Higher-Order mRNP Structure and an EJC-SR Protein Nexus

    In addition to sculpting eukaryotic transcripts by removing introns, pre-mRNA splicing greatly impacts protein composition of the emerging mRNP. The exon junction complex (EJC), deposited upstream of exon-exon junctions after splicing, is a major constituent of spliced mRNPs. Here, we report comprehensive analysis of the endogenous human EJC protein and RNA interactomes. We confirm that the major canonical EJC occupancy site in vivo lies 24 nucleotides upstream of exon junctions and that the majority of exon junctions carry an EJC. Unexpectedly, we find that endogenous EJCs multimerize with one another and with numerous SR proteins to form megadalton sized complexes in which SR proteins are super-stoichiometric to EJC core factors. This tight physical association may explain known functional parallels between EJCs and SR proteins. Further, their protection of long mRNA stretches from nuclease digestion suggests that endogenous EJCs and SR proteins cooperate to promote mRNA packaging and compaction

    A Translational Model for Venous Thromboembolism: MicroRNA Expression in Hibernating Black Bears

    BACKGROUND: Hibernating American black bears have significantly different clotting parameters than their summer active counterparts, affording them protection against venous thromboembolism during prolonged periods of immobility. We sought to evaluate if significant differences exist between the expression of microRNAs in the plasma of hibernating black bears compared with their summer active counterparts, potentially contributing to differences in hemostasis during hibernation. MATERIALS AND METHODS: MicroRNA sequencing was assessed in plasma from 21 American black bears in summer active (n = 11) and hibernating states (n = 10), and microRNA signatures during hibernating and active state were established using both bear and human genome. MicroRNA targets were predicted using messenger RNA (mRNA) transcripts from black bear kidney cells. In vitro studies were performed to confirm the relationship between identified microRNAs and mRNA expression, using artificial microRNA and human liver cells. RESULTS: Using the bear genome, we identified 15 microRNAs differentially expressed in the plasma of hibernating black bears. Of these microRNAs, three were significantly downregulated (miR-141-3p, miR-200a-3p, and miR-200c-3p), were predicted to target SERPINC1, the gene for antithrombin, and demonstrated regulatory control of the gene mRNA expression in cell studies. CONCLUSIONS: Our findings suggest that the hibernating black bears' ability to maintain hemostasis and achieve protection from venous thromboembolism during prolonged periods of immobility may be due to changes in microRNA signatures and possible upregulation of antithrombin expression

    Bioinformatics Core Survey Highlights the Challenges Facing Data Analysis Facilities.

    Over the last decade, the cost of -omics data creation has decreased 10-fold, whereas the need for analytical support for those data has increased exponentially. Consequently, bioinformaticians face a second wave of challenges: novel applications of existing approaches (e.g., single-cell RNA sequencing), integration of -omics data sets of differing size and scale (e.g., spatial transcriptomics), as well as novel computational and statistical methods, all of which require more sophisticated pipelines and data management. Nonetheless, bioinformatics cores are often asked to operate under primarily a cost-recovery model, with limited institutional support. Seeing the need to assess bioinformatics core operations, the Association of Biomolecular Resource Facilities Genomics Bioinformatics Research Group conducted a survey on staffing, services, financial models, and challenges to better understand the difficulties bioinformatics core facilities currently face and will need to address going forward. Of the respondent groups, we chose to focus on the survey data from smaller cores, which made up the majority. Although all cores indicated similar challenges in terms of changing technologies and analysis needs, small cores tended to have the added challenge of funding their operations largely through cost-recovery models with heavy administrative burdens