130 research outputs found

    Integrating motif, DNA accessibility and gene expression data to build regulatory maps in an organism

    Characterization of cell type specific regulatory networks and elements is a major challenge in genomics, and emerging strategies frequently employ high-throughput genome-wide assays of transcription factor (TF) to DNA binding, histone modifications or chromatin state. However, these experiments remain too difficult or expensive for many laboratories to apply comprehensively to their system of interest. Here, we explore the potential of elucidating regulatory systems in varied cell types using computational techniques that rely only on gene expression data, low-resolution chromatin accessibility data, and TF-DNA binding specificities ('motifs'). We show that static computational motif scans overlaid with chromatin accessibility data reasonably approximate experimentally measured TF-DNA binding. We demonstrate that predicted binding profiles and expression patterns of hundreds of TFs are sufficient to identify major regulators of approximately 200 spatiotemporal expression domains in the Drosophila embryo. We are then able to learn reliable statistical models of enhancer activity for over 70 expression domains and apply those models to annotate domain specific enhancers genome-wide. Throughout this work, we apply our motif and accessibility based approach to comprehensively characterize the regulatory network of fruitfly embryonic development and show that the accuracy of our computational method compares favorably to approaches that rely on data from many experimental assays. Acids Research

    Widespread evidence of cooperative DNA binding by transcription factors in Drosophila development

    Regulation of eukaryotic gene transcription is often combinatorial in nature, with multiple transcription factors (TFs) regulating common target genes, often through direct or indirect mutual interactions. Many individual examples of cooperative binding by directly interacting TFs have been identified, but it remains unclear how pervasive this mechanism is during animal development. Cooperative TF binding should be manifest in genomic sequences as biased arrangements of TF-binding sites. Here, we explore the extent and diversity of such arrangements related to gene regulation during Drosophila embryogenesis. We used the DNA-binding specificities of 322 TFs along with chromatin accessibility information to identify enriched spacing and orientation patterns of TF-binding site pairs. We developed a new statistical approach for this task, specifically designed to accurately assess inter-site spacing biases while accounting for the phenomenon of homotypic site clustering commonly observed in developmental regulatory regions. We observed a large number of short-range distance preferences between TF-binding site pairs, including examples where the preference depends on the relative orientation of the binding sites. To test whether these binding site patterns reflect physical interactions between the corresponding TFs, we analyzed 27 TF pairs whose binding sites exhibited short distance preferences. In vitro protein-protein binding experiments revealed that >65% of these TF pairs can directly interact with each other. For five pairs, we further demonstrate that they bind cooperatively to DNA if both sites are present with the preferred spacing. This study demonstrates how DNA-binding motifs can be used to produce a comprehensive map of sequence signatures for different mechanisms of combinatorial TF action.

    Targeted germ line disruptions reveal general and species-specific roles for paralog group 1 hox genes in zebrafish

    BACKGROUND: The developing vertebrate hindbrain is transiently segmented into rhombomeres by a process requiring Hox activity. Hox genes control specification of rhombomere fates, as well as the stereotypic differentiation of rhombomere-specific neuronal populations. Accordingly, germ line disruption of the paralog group 1 (PG1) Hox genes Hoxa1 and Hoxb1 causes defects in hindbrain segmentation and neuron formation in mice. However, antisense-mediated interference with zebrafish hoxb1a and hoxb1b (analogous to murine Hoxb1 and Hoxa1, respectively) produces phenotypes that are qualitatively and quantitatively distinct from those observed in the mouse. This suggests that PG1 Hox genes may have species-specific functions, or that antisense-mediated interference may not completely inactivate Hox function in zebrafish. RESULTS: Using zinc finger and TALEN technologies, we disrupted hoxb1a and hoxb1b in the zebrafish germ line to establish mutant lines for each gene. We find that zebrafish hoxb1a germ line mutants have a more severe phenotype than reported for hoxb1a antisense treatment. This phenotype is similar to that observed in Hoxb1 knockout mice, suggesting that Hoxb1/hoxb1a have the same function in both species. Zebrafish hoxb1b germ line mutants also have a more severe phenotype than reported for hoxb1b antisense treatment (e.g. in the effect on Mauthner neuron differentiation), but this phenotype differs from that observed in Hoxa1 knockout mice (e.g. in the specification of rhombomere 5 (r5) and r6), suggesting that Hoxa1/hoxb1b have species-specific activities. We also demonstrate that Hoxb1b regulates nucleosome organization at the hoxb1a promoter and that retinoic acid acts independently of hoxb1b to activate hoxb1a expression. CONCLUSIONS: We generated several novel germ line mutants for zebrafish hoxb1a and hoxb1b. Our analyses indicate that Hoxb1 and hoxb1a have comparable functions in zebrafish and mouse, suggesting a conserved function for these genes. In contrast, while Hoxa1 and hoxb1b share functions in the formation of r3 and r4, they differ with regard to r5 and r6, where Hoxa1 appears to control formation of r5, but not r6, in the mouse, whereas hoxb1b regulates formation of r6, but not r5, in zebrafish. Lastly, our data reveal independent regulation of hoxb1a expression by retinoic acid and Hoxb1b in zebrafish.

    Efficient targeted mutagenesis in the monarch butterfly using zinc finger nucleases

    The development of reverse-genetic tools in non-model insect species with distinct biology is critical to establish them as viable model systems. The eastern North American monarch butterfly (Danaus plexippus), whose genome is sequenced, has emerged as a model to study animal clocks, navigational mechanisms and the genetic basis of long-distance migration. Here, we developed a highly efficient gene-targeting approach in the monarch using zinc-finger nucleases (ZFNs), engineered nucleases that generate mutations at targeted genomic sequences. We focused our ZFN approach on targeting the type 2 vertebrate-like cryptochrome gene of the monarch (designated cry2), which encodes a putative transcriptional repressor of the monarch circadian clockwork. Co-injections of mRNAs encoding ZFNs targeting the second exon of monarch cry2 into one nucleus stage embryos led to high frequency non-homologous end-joining-mediated, mutagenic lesions in the germline (up to 50%). Heritable ZFN-induced lesions in two independent lines produced truncated, nonfunctional CRY2 proteins, resulting in the in vivo disruption of circadian behavior and the molecular clock mechanism. Our work genetically defines CRY2 as an essential transcriptional repressor of the monarch circadian clock and provides a proof of concept for the use of ZFNs for manipulating genes in the monarch butterfly genome. Importantly, this approach could be used in other lepidopterans and non-model insects, thus opening new avenues to decipher the molecular underpinnings of a variety of biological processes

    Genome editing of HBG1 and HBG2 to induce fetal hemoglobin

    Induction of fetal hemoglobin (HbF) via clustered regularly interspaced short palindromic repeats/Cas9-mediated disruption of DNA regulatory elements that repress gamma-globin gene (HBG1 and HBG2) expression is a promising therapeutic strategy for sickle cell disease (SCD) and beta-thalassemia, although the optimal technical approaches and limiting toxicities are not yet fully defined. We disrupted an HBG1/HBG2 gene promoter motif that is bound by the transcriptional repressor BCL11A. Electroporation of Cas9 single guide RNA ribonucleoprotein complex into normal and SCD donor CD34+ hematopoietic stem and progenitor cells resulted in high frequencies of on-target mutations and the induction of HbF to potentially therapeutic levels in erythroid progeny generated in vitro and in vivo after transplantation of hematopoietic stem and progenitor cells into nonobese diabetic/severe combined immunodeficiency/Il2rgamma-/-/KitW41/W41 immunodeficient mice. On-target editing did not impair CD34+ cell regeneration or differentiation into erythroid, T, B, or myeloid cell lineages at 16 to 17 weeks after xenotransplantation. No off-target mutations were detected by targeted sequencing of candidate sites identified by circularization for in vitro reporting of cleavage effects by sequencing (CIRCLE-seq), an in vitro genome-scale method for detecting Cas9 activity. Engineered Cas9 containing 3 nuclear localization sequences edited human hematopoietic stem and progenitor cells more efficiently and consistently than conventional Cas9 with 2 nuclear localization sequences. Our studies provide novel and essential preclinical evidence supporting the safety, feasibility, and efficacy of a mechanism-based approach to induce HbF for treating hemoglobinopathies

    Multicolor CRISPR labeling of chromosomal loci in human cells

    The intranuclear location of genomic loci and the dynamics of these loci are important parameters for understanding the spatial and temporal regulation of gene expression. Recently it has proven possible to visualize endogenous genomic loci in live cells by the use of transcription activator-like effectors (TALEs), as well as modified versions of the bacterial immunity clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system. Here we report the design of multicolor versions of CRISPR using catalytically inactive Cas9 endonuclease (dCas9) from three bacterial orthologs. Each pair of dCas9-fluorescent proteins and cognate single-guide RNAs (sgRNAs) efficiently labeled several target loci in live human cells. Using pairs of differently colored dCas9-sgRNAs, it was possible to determine the intranuclear distance between loci on different chromosomes. In addition, the fluorescence spatial resolution between two loci on the same chromosome could be determined and related to the linear distance between them on the chromosome's physical map, thereby permitting assessment of the DNA compaction of such regions in a live cell.

    Counter-selectable marker for bacterial-based interaction trap systems

    Counter-selectable markers can be used in two-hybrid systems to search libraries for a protein or compound that interferes with a macromolecular interaction or to identify macromolecules from a population that cannot mediate a particular interaction. In this report, we describe the adaptation of the yeast URA3/5-FOA counter-selection system for use in bacterial interaction trap experiments. Two different URA3 reporter systems were developed that allow robust counter-selection: (i) a single copy F' episome reporter and (ii) a co-cistronic HIS3-URA3 reporter vector. The HIS3-URA3 reporter can be used for either positive or negative selections in appropriate bacterial strains. These reagents extend the utility of the bacterial two-hybrid system as an alternative to its yeast-based counterpart.

    GUIDEseq: a bioconductor package to analyze GUIDE-Seq datasets for CRISPR-Cas nucleases

    BACKGROUND: Genome editing technologies developed around the CRISPR-Cas9 nuclease system have facilitated the investigation of a broad range of biological questions. These nucleases also hold tremendous promise for treating a variety of genetic disorders. In the context of their therapeutic application, it is important to identify the spectrum of genomic sequences that are cleaved by a candidate nuclease when programmed with a particular guide RNA, as well as the cleavage efficiency of these sites. Powerful new experimental approaches, such as GUIDE-seq, facilitate the sensitive, unbiased genome-wide detection of nuclease cleavage sites within the genome. Flexible bioinformatics analysis tools for processing GUIDE-seq data are needed. RESULTS: Here, we describe an open source, open development software suite, GUIDEseq, for GUIDE-seq data analysis and annotation as a Bioconductor package in R. The GUIDEseq package provides a flexible platform with more than 60 adjustable parameters for the analysis of datasets associated with custom nuclease applications. These parameters allow data analysis to be tailored to different nuclease platforms with different length and complexity in their guide and PAM recognition sequences or their DNA cleavage position. They also enable users to customize sequence aggregation criteria, and vary peak calling thresholds that can influence the number of potential off-target sites recovered. GUIDEseq also annotates potential off-target sites that overlap with genes based on genome annotation information, as these may be the most important off-target sites for further characterization. In addition, GUIDEseq enables the comparison and visualization of off-target site overlap between different datasets for a rapid comparison of different nuclease configurations or experimental conditions. For each identified off-target, the GUIDEseq package outputs the mapped GUIDE-seq read count as well as the cleavage score from a user-specified off-target cleavage score prediction algorithm, permitting the identification of genomic sequences with unexpected cleavage activity. CONCLUSION: The GUIDEseq package enables analysis of GUIDE-seq data from various nuclease platforms for any species with a defined genomic sequence. This software package has been used successfully to analyze several GUIDE-seq datasets. The software, source code and documentation are freely available at http://www.bioconductor.org/packages/release/bioc/html/GUIDEseq.html

    Targeted chromosomal deletions and inversions in zebrafish

    Zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) provide powerful platforms for genome editing in plants and animals. Typically, a single nuclease is sufficient to disrupt the function of protein-coding genes through the introduction of microdeletions or insertions that cause frameshifts within an early coding exon. However, interrogating the function of cis-regulatory modules or noncoding RNAs in many instances requires the excision of this element from the genome. In human cell lines and invertebrates, two nucleases targeting the same chromosome can promote the deletion of intervening genomic segments with modest efficiencies. We have examined the feasibility of using this approach to delete chromosomal segments within the zebrafish genome, which would facilitate the functional study of large noncoding sequences in a vertebrate model of development. Herein, we demonstrate that segmental deletions within the zebrafish genome can be generated at multiple loci and are efficiently transmitted through the germline. Using two nucleases, we have successfully generated deletions of up to 69 kb at rates sufficient for germline transmission (1%-15%) and have excised an entire lincRNA gene and enhancer element. Larger deletions (5.5 Mb) can be generated in somatic cells, but at lower frequency (0.7%). Segmental inversions have also been generated, but the efficiency of these events is lower than the corresponding deletions. The ability to efficiently delete genomic segments in a vertebrate developmental system will facilitate the study of functional noncoding elements on an organismic level

    Exploring the DNA-recognition potential of homeodomains

    The recognition potential of most families of DNA-binding domains (DBDs) remains relatively unexplored. Homeodomains (HDs), like many other families of DBDs, display limited diversity in their preferred recognition sequences. To explore the recognition potential of HDs, we utilized a bacterial selection system to isolate HD variants, from a randomized library, that are compatible with each of the 64 possible 3′ triplet sites (i.e., TAANNN). The majority of these selections yielded sets of HDs with overrepresented residues at specific recognition positions, implying the selection of specific binders. The DNA-binding specificity of 151 representative HD variants was subsequently characterized, identifying HDs that preferentially recognize 44 of these target sites. Many of these variants contain novel combinations of specificity determinants that are uncommon or absent in extant HDs. These novel determinants, when grafted into different HD backbones, produce a corresponding alteration in specificity. This information was used to create more explicit HD recognition models, which can inform the prediction of transcriptional regulatory networks for extant HDs or the engineering of HDs with novel DNA-recognition potential. The diversity of recovered HD recognition sequences raises important questions about the fitness barrier that restricts the evolution of alternate recognition modalities in natural systems