Structural features based genome-wide characterization and prediction of nucleosome organization
Background: Nucleosome distribution along chromatin dictates genomic DNA accessibility and thus profoundly influences gene expression. However, the underlying mechanism of nucleosome formation remains elusive. Here, taking a structural perspective, we systematically explored the nucleosome formation potential of genomic sequences and its effect on chromatin organization and gene expression in S. cerevisiae. Results: We analyzed twelve structural features related to the flexibility, curvature and energy of DNA sequences. The results showed that some structural features, such as DNA denaturation, DNA-bending stiffness, stacking energy, Z-DNA, propeller twist and free energy, were highly correlated with in vitro and in vivo nucleosome occupancy. Specifically, they can be classified into two classes, one positively and the other negatively correlated with nucleosome occupancy. These two kinds of structural features facilitated nucleosome binding in centromere regions and repressed nucleosome formation in the promoter regions of protein-coding genes to mediate transcriptional regulation. Based on these analyses, we integrated all twelve structural features in a model that predicts nucleosome occupancy in vivo more accurately than existing methods, which mainly depend on sequence compositional features. Furthermore, we developed a novel approach, named DLaNe, that locates nucleosomes by detecting peaks of structural profiles, and built a meta predictor to integrate information from different structural features. As a comparison, we also constructed a hidden Markov model (HMM) to locate nucleosomes based on the profiles of these structural features. The results showed that the meta DLaNe and HMM-based methods performed better than the existing methods, demonstrating the power of these structural features in predicting nucleosome positions. Conclusions: Our analysis revealed that DNA structures significantly contribute to nucleosome organization and influence chromatin structure and gene expression regulation. The results indicated that our proposed methods are effective in predicting nucleosome occupancy and positions and that these structural features are highly predictive of nucleosome organization. The implementation of our DLaNe method based on structural features is available online.
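The abstract above describes locating nucleosomes by detecting peaks in structural profiles. The following is a minimal sketch of that general idea only, not the DLaNe implementation: the function names `smooth` and `call_peaks`, the moving-average smoothing, and the greedy spacing rule are all assumptions made for illustration. The 147 bp default reflects the approximate length of DNA wrapped around a nucleosome core; whether the published method uses that spacing is not stated in the abstract.

```python
from typing import List


def smooth(values: List[float], window: int) -> List[float]:
    """Centered moving average; the window is truncated at both ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def call_peaks(values: List[float], min_spacing: int = 147) -> List[int]:
    """Return indices of local maxima, keeping the highest peaks first and
    discarding any peak closer than min_spacing to one already kept.

    Hypothetical rule for illustration; not taken from the paper.
    """
    # Interior positions that rise from the left and do not rise to the right.
    candidates = [
        i for i in range(1, len(values) - 1)
        if values[i] > values[i - 1] and values[i] >= values[i + 1]
    ]
    # Greedy selection: tallest candidates are considered first.
    candidates.sort(key=lambda i: values[i], reverse=True)
    kept: List[int] = []
    for i in candidates:
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    return sorted(kept)


if __name__ == "__main__":
    # Toy profile (made-up numbers), with a small spacing so the effect is visible.
    profile = [0, 1, 0, 5, 0, 2, 0, 0, 4, 0]
    print(call_peaks(profile, min_spacing=3))
```

In this sketch a per-base structural profile would first be passed through `smooth` and then through `call_peaks`; the spacing constraint stands in for the fact that two nucleosomes cannot occupy overlapping DNA.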
The Sorghum bicolor reference genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of genome organization.
Sorghum bicolor is a drought tolerant C4 grass used for the production of grain, forage, sugar, and lignocellulosic biomass and a genetic model for C4 grasses due to its relatively small genome (approximately 800 Mbp), diploid genetics, diverse germplasm, and colinearity with other C4 grass genomes. In this study, deep sequencing, genetic linkage analysis, and transcriptome data were used to produce and annotate a high-quality reference genome sequence. Reference genome sequence order was improved, 29.6 Mbp of additional sequence was incorporated, the number of genes annotated increased 24% to 34,211, average gene length and N50 increased, and error frequency was reduced 10-fold to 1 per 100 kbp. Subtelomeric repeats with characteristics of Tandem Repeats in Miniature (TRIM) elements were identified at the termini of most chromosomes. Nucleosome occupancy predictions identified nucleosomes positioned immediately downstream of transcription start sites and at different densities across chromosomes. Alignment of more than 50 resequenced genomes from diverse sorghum genotypes to the reference genome identified approximately 7.4 M single nucleotide polymorphisms (SNPs) and 1.9 M indels. Large-scale variant features in euchromatin were identified with periodicities of approximately 25 kbp. A transcriptome atlas of gene expression was constructed from 47 RNA-seq profiles of growing and developed tissues of the major plant organs (roots, leaves, stems, panicles, and seed) collected during the juvenile, vegetative and reproductive phases. Analysis of the transcriptome data indicated that tissue type and protein kinase expression had large influences on transcriptional profile clustering. The updated assembly, annotation, and transcriptome data represent a resource for C4 grass research and crop improvement.
Analysis of nucleosome positioning landscapes enables gene discovery in the human malaria parasite Plasmodium falciparum.
Background: Plasmodium falciparum, the deadliest malaria-causing parasite, has an extremely AT-rich (80.7%) genome. Because of high AT-content, sequence-based annotation of genes and functional elements remains challenging. In order to better understand the regulatory network controlling gene expression in the parasite, a more complete genome annotation as well as analysis tools adapted for AT-rich genomes are needed. Recent studies on genome-wide nucleosome positioning in eukaryotes have shown that nucleosome landscapes exhibit regular characteristic patterns at the 5'- and 3'-end of protein and non-protein coding genes. In addition, nucleosome depleted regions can be found near transcription start sites. These unique nucleosome landscape patterns may be exploited for the identification of novel genes. In this paper, we propose a computational approach to discover novel putative genes based exclusively on nucleosome positioning data in the AT-rich genome of P. falciparum. Results: Using binary classifiers trained on nucleosome landscapes at the gene boundaries from two independent nucleosome positioning data sets, we were able to detect a total of 231 regions containing putative genes in the genome of Plasmodium falciparum, of which 67 highly confident genes were found in both data sets. Eighty-eight of these 231 newly predicted genes exhibited transcription signal in RNA-Seq data, indicative of active transcription. In addition, 20 out of 21 selected gene candidates were further validated by RT-PCR, and 28 out of the 231 genes showed significant matches using BLASTN against an expressed sequence tag (EST) database. Furthermore, 108 (47%) out of the 231 putative novel genes overlapped with previously identified but unannotated long non-coding RNAs. Collectively, these results provide experimental validation for 163 predicted genes (70.6%). Finally, 73 out of 231 genes were found to be potentially translated based on their signal in polysome-associated RNA-Seq representing transcripts that are actively being translated. Conclusion: Our results clearly indicate that nucleosome positioning data contains sufficient information for novel gene discovery. As distinct nucleosome landscapes around genes are found in many other eukaryotic organisms, this methodology could be used to characterize the transcriptome of any organism, especially when coupled with other DNA-based gene finding and experimental methods (e.g., RNA-Seq).
Epigenomes in Cardiovascular Disease.
If unifying principles could be revealed for how the same genome encodes different eukaryotic cells and for how genetic variability and environmental input are integrated to impact cardiovascular health, grand challenges in basic cell biology and translational medicine may succumb to experimental dissection. A rich body of work in model systems has implicated chromatin-modifying enzymes, DNA methylation, noncoding RNAs, and other transcriptome-shaping factors in adult health and in the development, progression, and mitigation of cardiovascular disease. Meanwhile, deployment of epigenomic tools, powered by next-generation sequencing technologies in cardiovascular models and human populations, has enabled description of epigenomic landscapes underpinning cellular function in the cardiovascular system. This essay aims to unpack the conceptual framework in which epigenomes are studied and to stimulate discussion on how principles of chromatin function may inform investigations of cardiovascular disease and the development of new therapies
Architecture of the chromatin remodeler RSC and insights into its nucleosome engagement.
Eukaryotic DNA is packaged into nucleosome arrays, which are repositioned by chromatin remodeling complexes to control DNA accessibility. The Saccharomyces cerevisiae RSC (Remodeling the Structure of Chromatin) complex, a member of the SWI/SNF chromatin remodeler family, plays critical roles in genome maintenance, transcription, and DNA repair. Here, we report cryo-electron microscopy (cryo-EM) and crosslinking mass spectrometry (CLMS) studies of yeast RSC complex and show that RSC is composed of a rigid tripartite core and two flexible lobes. The core structure is scaffolded by an asymmetric Rsc8 dimer and built with the evolutionarily conserved subunits Sfh1, Rsc6, Rsc9 and Sth1. The flexible ATPase lobe, composed of helicase subunit Sth1, Arp7, Arp9 and Rtt102, is anchored to this core by the N-terminus of Sth1. Our cryo-EM analysis of RSC bound to a nucleosome core particle shows that in addition to the expected nucleosome-Sth1 interactions, RSC engages histones and nucleosomal DNA through one arm of the core structure, composed of the Rsc8 SWIRM domains, Sfh1 and Npl6. Our findings provide structural insights into the conserved assembly process for all members of the SWI/SNF family of remodelers, and illustrate how RSC selects, engages, and remodels nucleosomes
Predicting gene expression in the human malaria parasite Plasmodium falciparum using histone modification, nucleosome positioning, and 3D localization features.
Empirical evidence suggests that the malaria parasite Plasmodium falciparum employs a broad range of mechanisms to regulate gene transcription throughout the organism's complex life cycle. To better understand this regulatory machinery, we assembled a rich collection of genomic and epigenomic data sets, including information about transcription factor (TF) binding motifs, patterns of covalent histone modifications, nucleosome occupancy, GC content, and global 3D genome architecture. We used these data to train machine learning models to discriminate between high-expression and low-expression genes, focusing on three distinct stages of the red blood cell phase of the Plasmodium life cycle. Our results highlight the importance of histone modifications and 3D chromatin architecture in Plasmodium transcriptional regulation and suggest that AP2 transcription factors may play a limited regulatory role, perhaps operating in conjunction with epigenetic factors
Modulation of Gene Expression by Gene Architecture and Promoter Structure
Regulation of gene expression is achieved through cis regulatory elements; these signatures are interspersed in the noncoding region and are also situated in the coding region of the genome. These elements orchestrate the gene expression process by regulating the different steps involved in the flow of genetic information. Transcription (DNA to RNA) and translation (RNA to protein) are controlled at different levels by different regulatory elements present in the genome. The current chapter describes the structural and functional elements present in the coding and noncoding regions of the genome. Further, we discuss the role of regulatory elements in the regulation of gene expression in prokaryotes and eukaryotes. Finally, we also discuss the DNA structural properties of regulatory regions and their role in gene expression. Identification and characterization of cis regulatory elements would be useful for engineering the regulation of gene expression.