
    Large-scale analysis of transcriptional cis-regulatory modules reveals both common features and distinct subclasses

    Analysis of 280 experimentally verified cis-regulatory modules from Drosophila reveals features common to all modules as well as features unique to distinct subclasses of modules

    Computational Characterization of Genome-wide DNA-binding Profiles

    The work and data presented in this thesis are part of a collaborative project funded by the Berlin Center for Regenerative Therapies. A number of people have contributed to this work, and for clarity I will now list the individual contributions. Stefan Mundlos, Peter N. Robinson and Jochen Hecht designed this project with the purpose of studying bone development using ChIP-seq in a chicken model. Jochen Hecht and Asita Stiege established the ChIP-seq protocol and, together with Daniel Ibrahim, Hendrikje Hein, and Catrin Janetzky, carried out the immunoprecipitations and sequencing. Peter Krawitz was responsible for the data processing, which involved base calling and basic quality control. Daniel Ibrahim contributed to the analysis of the Hox proteins, identifying the Q317K mutant to be related to Pitx1 and Obox family members. Sebastian Kohler and Sebastian Bauer carried out the computation of the Gene Ontology similarity data and random walk distances that I used for the target gene assignments in chapter 5. The EMSA experiments shown in chapter 3 were carried out by Asita Stiege. The work on target gene assignment presented in chapter 5 has been published in Nucleic Acids Research [1]. All the remaining methods, data and experimental results will be partially included in future publications by Ibrahim et al. and Hein et al.

    Genomic Variation and Its Impact on Gene Expression in Drosophila melanogaster

    Understanding the relationship between genetic and phenotypic variation is one of the great outstanding challenges in biology. To meet this challenge, comprehensive genomic variation maps of human as well as of model organism populations are required. Here, we present a nucleotide resolution catalog of single-nucleotide, multi-nucleotide, and structural variants in 39 Drosophila melanogaster Genetic Reference Panel inbred lines. Using an integrative, local assembly-based approach for variant discovery, we identify more than 3.6 million distinct variants, among which were more than 800,000 unique insertions, deletions (indels), and complex variants (1 to 6,000 bp). While the SNP density is higher near other variants, we find that variants themselves are not mutagenic, nor are regions with high variant density particularly mutation-prone. Rather, our data suggest that the elevated SNP density around variants is mainly due to population-level processes. We also provide insights into the regulatory architecture of gene expression variation in adult flies by mapping cis-expression quantitative trait loci (cis-eQTLs) for more than 2,000 genes. Indels comprise around 10% of all cis-eQTLs and show larger effects than SNP cis-eQTLs. In addition, we identified two-fold more gene associations in males as compared to females and found that most cis-eQTLs are sex-specific, revealing a partial decoupling of the genomic architecture between the sexes as well as the importance of genetic factors in mediating sex-biased gene expression. Finally, we performed RNA-seq-based allelic expression imbalance analyses in the offspring of crosses between sequenced lines, which revealed that the majority of strong cis-eQTLs can be validated in heterozygous individuals

    A neural network based model effectively predicts enhancers from clinical ATAC-seq samples.

    Enhancers are cis-acting sequences that regulate transcription rates of their target genes in a cell-specific manner and harbor disease-associated sequence variants in cognate cell types. Many complex diseases are associated with enhancer malfunction, necessitating the discovery and study of enhancers from clinical samples. Assay for Transposase Accessible Chromatin (ATAC-seq) technology can interrogate chromatin accessibility from small cell numbers and facilitate studying enhancers in pathologies. However, on average, ~35% of open chromatin regions (OCRs) from ATAC-seq samples map to enhancers. We developed a neural network-based model, Predicting Enhancers from ATAC-Seq data (PEAS), to effectively infer enhancers from clinical ATAC-seq samples by extracting ATAC-seq data features and integrating these with sequence-related features (e.g., GC ratio). PEAS recapitulated ChromHMM-defined enhancers in CD14+ monocytes, CD4+ T cells, GM12878, peripheral blood mononuclear cells, and pancreatic islets. PEAS models trained on these 5 cell types effectively predicted enhancers in four cell types that are not used in model training (EndoC-βH1, naïve CD8+ T, MCF7, and K562 cells). Finally, PEAS inferred individual-specific enhancers from 19 islet ATAC-seq samples and revealed variability in enhancer activity across individuals, including those driven by genetic differences. PEAS is an easy-to-use tool developed to study enhancers in pathologies by taking advantage of the increasing number of clinical epigenomes
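    The GC ratio named above as an example of a sequence-related feature can be illustrated with a minimal sketch. The function name `gc_ratio` and the choice to ignore ambiguous bases such as N are assumptions made for illustration; the abstract does not describe how PEAS computes this feature.

```python
def gc_ratio(sequence: str) -> float:
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C.

    Hypothetical helper for illustration only; not taken from PEAS.
    Ambiguous symbols such as N are ignored (an assumption).
    """
    seq = sequence.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return gc / total
```

    In this sketch a region with equal numbers of A, T, G, and C bases yields 0.5, and lower-case (soft-masked) input is treated the same as upper-case input.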

    Genetic Analysis of the Neurosteroid Deoxycorticosterone and Its Relation to Alcohol Phenotypes: Identification of QTLs and Downstream Gene Regulation

    Deoxycorticosterone (DOC) is an endogenous neurosteroid found in brain and serum and a precursor of the GABAergic neuroactive steroid (3α,5α)-3,21-dihydroxypregnan-20-one (tetrahydrodeoxycorticosterone, THDOC) and the glucocorticoid corticosterone. These steroids are elevated following stress or ethanol administration, contribute to ethanol sensitivity, and their elevation is blunted in ethanol dependence. To systematically define the genetic basis, regulation, and behavioral significance of DOC levels in plasma and cerebral cortex, we examined such levels across 47 young adult males from C57BL/6J (B6)×DBA/2J (D2) (BXD) mouse strains for quantitative trait loci (QTL) and bioinformatics analyses of behavior and gene regulation. Mice were injected with saline or 0.075 mg/kg dexamethasone sodium salt at 8:00 am and were sacrificed 6 hours later. DOC levels were measured by radioimmunoassay. Basal cerebral cortical DOC levels ranged between 1.4 and 12.2 ng/g (8.7-fold variation, p<0.0001) with a heritability of ∼0.37. Basal plasma DOC levels ranged between 2.8 and 12.1 ng/ml (4.3-fold variation, p<0.0001) with a heritability of ∼0.32. QTLs for basal DOC levels were identified on chromosomes 4 (cerebral cortex) and 14 (plasma). Dexamethasone-induced changes in DOC levels showed a 4.4-fold variation in cerebral cortex and a 4.1-fold variation in plasma, but no QTLs were identified. DOC levels across BXD strains were further shown to be co-regulated with networks of genes linked to neuronal, immune, and endocrine function. DOC levels and their responses to dexamethasone were associated with several behavioral measures of ethanol sensitivity previously determined across the BXD strains by multiple laboratories. Both basal and dexamethasone-suppressed DOC levels are positively correlated with ethanol sensitivity, suggesting that the neurosteroid DOC may be a putative biomarker of alcohol phenotypes. DOC levels were also strongly correlated with networks of genes associated with neuronal function, innate immune pathways, and steroid metabolism, likely linked to behavioral phenotypes
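    The fold-variation figures quoted for basal DOC levels are consistent with a simple ratio of the highest to the lowest reported value. The sketch below reproduces that arithmetic under the assumption that fold variation was computed this way; the abstract does not state the formula, and the function name `fold_variation` is hypothetical.

```python
def fold_variation(lowest: float, highest: float) -> float:
    """Ratio of the highest to the lowest value (assumed definition)."""
    if lowest <= 0:
        raise ValueError("lowest value must be positive")
    return highest / lowest


# Ranges quoted in the abstract above.
cortex_fold = fold_variation(1.4, 12.2)   # cerebral cortex, ng/g
plasma_fold = fold_variation(2.8, 12.1)   # plasma, ng/ml

print(round(cortex_fold, 1))  # 8.7
print(round(plasma_fold, 1))  # 4.3
```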

    Computational identification of tissue-specific alternative splicing elements in mouse genes from RNA-Seq

    Tissue-specific alternative splicing is a key mechanism for generating tissue-specific proteomic diversity in eukaryotes. Splicing regulatory elements (SREs) in precursor messenger RNA play a very important role in regulating alternative splicing. In this article, we use mouse RNA-Seq data to determine, for a specific tissue, a positive data set in which SREs are over-represented and a reliable negative data set in which the same SREs are most likely under-represented, and then employ a powerful discriminative approach to identify SREs. We identified 456 putative splicing enhancers or silencers, of which 221 were predicted to be tissue-specific. Most of our tissue-specific SREs are likely different from constitutive SREs, since only 18% of our exonic splicing enhancers (ESEs) are contained in constitutive RESCUE-ESEs. A relatively small portion (20%) of our SREs is included among the human tissue-specific SREs identified in two recent studies. In the analysis of the position distribution of SREs, we found that a dozen SREs were biased toward a specific region. We also identified two very interesting SREs that can function as an enhancer in one tissue but as a silencer in another tissue from the same intronic region. These findings provide insight into the mechanism of tissue-specific alternative splicing and give a set of valuable putative SREs for further experimental investigations

    Gene expression profiling in C57BL/6J and A/J mouse inbred strains reveals gene networks specific for brain regions independent of genetic background

    Background: We performed gene expression profiling of the amygdala and hippocampus taken from inbred mouse strains C57BL/6J and A/J. The selected brain areas are implicated in neurobehavioral traits while these mouse strains are known to differ widely in behavior. Consequently, we hypothesized that comparing gene expression profiles for specific brain regions in these strains might provide insight into the molecular mechanisms of human neuropsychiatric traits. We performed a whole-genome gene expression experiment and applied a systems biology approach using weighted gene co-expression network analysis. Results: We were able to identify modules of co-expressed genes that distinguish a strain or brain region. Analysis of the networks that are most informative for hippocampus and amygdala revealed enrichment in neurologically, genetically and psychologically related pathways. Close examination of the strain-specific gene expression profiles, however, revealed no functional relevance but a significant enrichment of single nucleotide polymorphisms in the probe sequences used for array hybridization. This artifact was not observed for the modules of co-expressed genes that distinguish amygdala and hippocampus. Conclusions: The brain-region specific modules were found to be independent of genetic background and are therefore likely to represent biologically relevant molecular networks that can be studied to complement our knowledge about pathways in neuropsychiatric disease

    Identifying noncoding risk variants using disease-relevant gene regulatory networks.

    Identifying noncoding risk variants remains a challenging task. Because noncoding variants exert their effects in the context of a gene regulatory network (GRN), we hypothesize that explicit use of disease-relevant GRNs can significantly improve the inference accuracy of noncoding risk variants. We describe Annotation of Regulatory Variants using Integrated Networks (ARVIN), a general computational framework for predicting causal noncoding variants. It employs a set of novel regulatory network-based features, combined with sequence-based features, to infer noncoding risk variants. Using known causal variants in gene promoters and enhancers in a number of diseases, we show that ARVIN outperforms state-of-the-art methods that use sequence-based features alone. Additional experimental validation using a reporter assay further demonstrates the accuracy of ARVIN. Application of ARVIN to seven autoimmune diseases provides a holistic view of the gene subnetwork perturbed by the combinatorial action of the entire set of noncoding risk mutations. Nat Commun 2018 Feb 16; 9(1):702