16,837 research outputs found
Common DNA sequence variation influences 3-dimensional conformation of the human genome.
BACKGROUND: The 3-dimensional (3D) conformation of chromatin inside the nucleus is integral to a variety of nuclear processes including transcriptional regulation, DNA replication, and DNA damage repair. Aberrations in 3D chromatin conformation have been implicated in developmental abnormalities and cancer. Despite the importance of 3D chromatin conformation to cellular function and human health, little is known about how 3D chromatin conformation varies in the human population, or whether DNA sequence variation between individuals influences 3D chromatin conformation. RESULTS: To address these questions, we perform Hi-C on lymphoblastoid cell lines from 20 individuals. We identify thousands of regions across the genome where 3D chromatin conformation varies between individuals and find that this variation is often accompanied by variation in gene expression, histone modifications, and transcription factor binding. Moreover, we find that DNA sequence variation influences several features of 3D chromatin conformation including loop strength, contact insulation, contact directionality, and density of local cis contacts. We map hundreds of quantitative trait loci associated with 3D chromatin features and find evidence that some of these same variants are associated at modest levels with other molecular phenotypes as well as complex disease risk. CONCLUSION: Our results demonstrate that common DNA sequence variants can influence 3D chromatin conformation, pointing to a more pervasive role for 3D chromatin conformation in human phenotypic variation than previously recognized.
Quantitative test of the barrier nucleosome model for statistical positioning of nucleosomes up- and downstream of transcription start sites
The positions of nucleosomes in eukaryotic genomes determine which parts of
the DNA sequence are readily accessible for regulatory proteins and which are
not. Genome-wide maps of nucleosome positions have revealed a salient pattern
around transcription start sites, involving a nucleosome-free region (NFR)
flanked by a pronounced periodic pattern in the average nucleosome density.
While the periodic pattern clearly reflects well-positioned nucleosomes, the
positioning mechanism is less clear. A recent experimental study by Mavrich et
al. argued that the pattern observed in S. cerevisiae is qualitatively
consistent with a 'barrier nucleosome model', in which the oscillatory pattern
is created by the statistical positioning mechanism of Kornberg and Stryer. On
the other hand, there is clear evidence for intrinsic sequence preferences of
nucleosomes, and it is unclear to what extent these sequence preferences affect
the observed pattern. To test the barrier nucleosome model, we quantitatively
analyze yeast nucleosome positioning data both up- and downstream from NFRs.
Our analysis is based on the Tonks model of statistical physics which
quantifies the interplay between the excluded-volume interaction of nucleosomes
and their positional entropy. We find that although the typical patterns on the
two sides of the NFR are different, they are both quantitatively described by
the same physical model, with the same parameters, but different boundary
conditions. The inferred boundary conditions suggest that the first nucleosome
downstream from the NFR (the +1 nucleosome) is typically directly positioned
while the first nucleosome upstream is statistically positioned via a
nucleosome-repelling DNA region. These boundary conditions, which can be
locally encoded into the genome sequence, significantly shape the statistical
distribution of nucleosomes over a range of up to ~1000 bp to each side.
Comment: includes supporting material
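The statistical positioning idea in the abstract above can be illustrated with a small lattice analogue of a hard-rod (Tonks) gas. This is only a sketch under stated assumptions, not the authors' continuum analysis: the function name, the lattice discretization, the two hard walls, and the fugacity value are all hypothetical choices made for illustration. It computes the exact probability that a rod (a nucleosome footprint) starts at each site, and the resulting profile oscillates near the walls purely from excluded volume.

```python
def rod_start_probabilities(n_sites, rod_len, fugacity):
    """Exact start-site probabilities for hard rods on a lattice between two walls.

    Assumption: rods of fixed length `rod_len` cannot overlap, and each rod
    contributes a statistical weight `fugacity` (a hypothetical parameter).
    """
    # z[n] is the partition function of n consecutive free sites.
    # Either the last site is empty (z[n - 1]) or a rod ends on it
    # (fugacity * z[n - rod_len]).
    z = [1.0] * (n_sites + 1)
    for n in range(1, n_sites + 1):
        z[n] = z[n - 1]
        if n >= rod_len:
            z[n] += fugacity * z[n - rod_len]
    total = z[n_sites]
    # A rod starting at site i splits the system into i free sites on the
    # left and n_sites - i - rod_len free sites on the right.
    return [
        z[i] * fugacity * z[n_sites - i - rod_len] / total
        for i in range(n_sites - rod_len + 1)
    ]


# Illustrative call: 147 sites per rod mimics a nucleosome footprint in bp.
profile = rod_start_probabilities(n_sites=1000, rod_len=147, fugacity=0.5)
```

Changing how the first rod is treated at the wall (for example, pinning it versus letting it fluctuate) is the lattice counterpart of the different boundary conditions the abstract describes for the two sides of the NFR.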
Genome resequencing reveals multiscale geographic structure and extensive linkage disequilibrium in the forest tree Populus trichocarpa
This is the publisher’s final pdf. The article is copyrighted by the New Phytologist Trust and published by John Wiley & Sons, Inc. It can be found at: http://onlinelibrary.wiley.com/journal/10.1111/%28ISSN%291469-8137. To the best of our knowledge, one or more authors of this paper were federal employees when contributing to this work.
• Plant population genomics informs evolutionary biology, breeding, conservation and bioenergy feedstock development. For example, the detection of reliable phenotype–genotype associations and molecular signatures of selection requires a detailed knowledge about genome-wide patterns of allele frequency variation, linkage disequilibrium and recombination.
• We resequenced 16 genomes of the model tree Populus trichocarpa and genotyped 120 trees from 10 subpopulations using 29 213 single-nucleotide polymorphisms.
• Significant geographic differentiation was present at multiple spatial scales, and range-wide latitudinal allele frequency gradients were strikingly common across the genome. The decay of linkage disequilibrium with physical distance was slower than expected from previous studies in Populus, with r² dropping below 0.2 within 3–6 kb. Consistent with this, estimates of recent effective population size from linkage disequilibrium (Nₑ ≈ 4000–6000) were remarkably low relative to the large census sizes of P. trichocarpa stands. Fine-scale rates of recombination varied widely across the genome, but were largely predictable on the basis of DNA sequence and methylation features.
• Our results suggest that genetic drift has played a significant role in the recent evolutionary history of P. trichocarpa. Most importantly, the extensive linkage disequilibrium detected suggests that genome-wide association studies and genomic selection in undomesticated populations may be more feasible in Populus than previously assumed.
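The r² statistic quoted in the abstract above is the standard squared correlation between alleles at two loci. The sketch below shows how it is computed for one pair of biallelic sites. It assumes phased haplotypes coded as 0/1, which is a simplification; the function name and input layout are hypothetical and are not taken from the study.

```python
def ld_r_squared(haplotypes):
    """Squared allelic correlation r^2 between two biallelic loci.

    `haplotypes` is a list of (a, b) pairs, where a and b are 0/1 alleles
    carried on the same chromosome at locus A and locus B (assumed phased).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D measures the excess of the 1-1 haplotype over independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either locus is monomorphic")
    return d * d / denom
```

Computing this value for many SNP pairs and plotting it against their physical distance gives the kind of decay curve the abstract summarizes as r² falling below 0.2 within 3–6 kb.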
Formation of regulatory modules by local sequence duplication
Turnover of regulatory sequence and function is an important part of
molecular evolution. But what are the modes of sequence evolution leading to
rapid formation and loss of regulatory sites? Here, we show that a large
fraction of neighboring transcription factor binding sites in the fly genome
have formed from a common sequence origin by local duplications. This mode of
evolution is found to produce regulatory information: duplications can seed new
sites in the neighborhood of existing sites. Duplicate seeds evolve
subsequently by point mutations, often towards binding a different factor than
their ancestral neighbor sites. These results are based on a statistical
analysis of 346 cis-regulatory modules in the Drosophila melanogaster genome,
and a comparison set of intergenic regulatory sequence in Saccharomyces
cerevisiae. In fly regulatory modules, pairs of binding sites show
significantly enhanced sequence similarity up to distances of about 50 bp. We
analyze these data in terms of an evolutionary model with two distinct modes of
site formation: (i) evolution from independent sequence origin and (ii)
divergent evolution following duplication of a common ancestor sequence. Our
results suggest that pervasive formation of binding sites by local sequence
duplications distinguishes the complex regulatory architecture of higher
eukaryotes from the simpler architecture of unicellular organisms.
Functional Bias and Spatial Organization of Genes in Mutational Hot and Cold Regions in the Human Genome
The neutral mutation rate is known to vary widely along human chromosomes,
leading to mutational hot and cold regions. We provide evidence that categories
of functionally-related genes reside preferentially in mutationally hot or cold
regions, the size of which we have measured. Genes in hot regions are biased
toward extra-cellular communication (surface receptors, cell adhesion, immune
response, etc.) while those in cold regions are biased toward essential
cellular processes (gene regulation, RNA processing, protein modification,
etc.). From a selective perspective, this organization of genes could minimize
the mutational load on genes that need to be conserved and allow fast evolution
for genes that must frequently adapt. We also analyze the effect of gene
duplication and chromosomal recombination, which contribute significantly to
these biases for certain categories of hot genes. Overall, our results show
that genes are located non-randomly with respect to hot and cold regions,
offering the possibility that selection acts at the level of gene location in
the human genome.
Comment: 17 pages, 6 figures, 2 tables. Accepted to PLOS Biology, Feb. 2004 issue
Multi-omics of the gut microbial ecosystem in inflammatory bowel diseases.
Inflammatory bowel diseases, which include Crohn's disease and ulcerative colitis, affect several million individuals worldwide. Crohn's disease and ulcerative colitis are complex diseases that are heterogeneous at the clinical, immunological, molecular, genetic, and microbial levels. Individual contributing factors have been the focus of extensive research. As part of the Integrative Human Microbiome Project (HMP2 or iHMP), we followed 132 subjects for one year each to generate integrated longitudinal molecular profiles of host and microbial activity during disease (up to 24 time points each; in total 2,965 stool, biopsy, and blood specimens). Here we present the results, which provide a comprehensive view of functional dysbiosis in the gut microbiome during inflammatory bowel disease activity. We demonstrate a characteristic increase in facultative anaerobes at the expense of obligate anaerobes, as well as molecular disruptions in microbial transcription (for example, among clostridia), metabolite pools (acylcarnitines, bile acids, and short-chain fatty acids), and levels of antibodies in host serum. Periods of disease activity were also marked by increases in temporal variability, with characteristic taxonomic, functional, and biochemical shifts. Finally, integrative analysis identified microbial, biochemical, and host factors central to this dysregulation. The study's infrastructure resources, results, and data, which are available through the Inflammatory Bowel Disease Multi'omics Database (http://ibdmdb.org), provide the most comprehensive description to date of host and microbial activities in inflammatory bowel diseases.
Information content based model for the topological properties of the gene regulatory network of Escherichia coli
Gene regulatory networks (GRN) are being studied with increasingly precise
quantitative tools and can provide a testing ground for ideas regarding the
emergence and evolution of complex biological networks. We analyze the global
statistical properties of the transcriptional regulatory network of the
prokaryote Escherichia coli, identifying each operon with a node of the
network. We propose a null model for this network using the content-based
approach applied earlier to the eukaryote Saccharomyces cerevisiae (Balcan et
al., 2007). Random sequences that represent promoter regions and binding
sequences are associated with the nodes. The length distributions of these
sequences are extracted from the relevant databases. The network is constructed
by testing for the occurrence of binding sequences within the promoter regions.
The ensemble of emergent networks yields an exponentially decaying in-degree
distribution and a putative power law dependence for the out-degree
distribution with a flat tail, in agreement with the data. The clustering
coefficient, degree-degree correlation, rich club coefficient and k-core
visualization all agree qualitatively with the empirical network to an extent
not yet achieved by any other computational model, to our knowledge. The
significant statistical differences can point the way to further research into
non-adaptive and adaptive processes in the evolution of the E. coli GRN.
Comment: 58 pages, 3 tables, 22 figures. In press, Journal of Theoretical
Biology (2009)
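The construction described in the abstract above (random promoter and binding sequences attached to nodes, with an edge whenever a binding sequence occurs inside a promoter) can be sketched in a few lines. This is a toy illustration under loud assumptions: the four-letter alphabet, the fixed sequence lengths, the node count, the edge direction, and all function names are hypothetical, whereas the paper draws its length distributions from databases.

```python
import random


def random_sequence(length, rng, alphabet="ACGT"):
    """Draw a uniformly random string (the alphabet is an assumption)."""
    return "".join(rng.choice(alphabet) for _ in range(length))


def build_network(promoters, binding_seqs):
    """Return directed edges (i, j): node i's binding sequence occurs
    somewhere in node j's promoter, read here as 'i regulates j'."""
    return {
        (i, j)
        for i, site in enumerate(binding_seqs)
        for j, promoter in enumerate(promoters)
        if site in promoter
    }


# Hypothetical sizes; a faithful null model would sample lengths from data.
rng = random.Random(0)
n_nodes = 50
promoters = [random_sequence(100, rng) for _ in range(n_nodes)]
binding_seqs = [random_sequence(4, rng) for _ in range(n_nodes)]
edges = build_network(promoters, binding_seqs)

# Degree sequences of one realization; the abstract's statistics are
# computed over an ensemble of such networks.
in_degree = [sum(1 for _, j in edges if j == k) for k in range(n_nodes)]
out_degree = [sum(1 for i, _ in edges if i == k) for k in range(n_nodes)]
```

Repeating the construction with different seeds gives an ensemble whose in-degree and out-degree distributions can then be compared with an empirical network.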