176 research outputs found

    Globin genes on the move

    Recent data published in BMC Biology from the globin gene clusters in platypus, together with data from other species, show that β-globin genes transposed from one chromosomal location to another. This resolves some controversies about vertebrate globin gene evolution but ignites new ones.

    Human-macaque comparisons illuminate variation in neutral substitution rates

    The evolutionary distance between human and macaque is particularly attractive for investigating neutral substitution rates, which were calculated as a function of a number of genomic parameters.

    Improvements to GALA and dbERGE II: databases featuring genomic sequence alignment, annotation and experimental results

    We describe improvements to two databases that give access to information on genomic sequence similarities, functional elements in DNA and experimental results that demonstrate those functions. GALA, the database of Genome ALignments and Annotations, is now a set of interlinked relational databases for five vertebrate species: human, chimpanzee, mouse, rat and chicken. For each species, GALA records pairwise and multiple sequence alignments, scores derived from those alignments that reflect the likelihood of being under purifying selection or being a regulatory element, and extensive annotations such as genes, gene expression patterns and transcription factor binding sites. The user interface supports simple and complex queries, including operations such as subtraction and intersection as well as clustering and finding elements in proximity to features. dbERGE II, the database of Experimental Results on Gene Expression, contains experimental data from a variety of functional assays. Both databases are now run on the DB2 database management system. Improved hardware and tuning have reduced response times and increased querying capacity, while simplified query interfaces will help direct new users through the querying process. Links are available at http://www.bx.psu.edu/
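
    The intersection and subtraction operations mentioned in this abstract can be illustrated with a small sketch over genomic intervals. This is not GALA's implementation or query interface; the half-open `(start, end)` representation and the function names `merge`, `intersect` and `subtract` are assumptions chosen only for illustration.

```python
from typing import List, Tuple

# Assumption: an interval is half-open, [start, end), on a single chromosome.
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and fuse any that overlap or touch."""
    out: List[Interval] = []
    for start, end in sorted(intervals):
        if start >= end:
            continue  # skip empty intervals
        if out and start <= out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], end))
        else:
            out.append((start, end))
    return out


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the regions covered by both interval sets."""
    a, b = merge(a), merge(b)
    i = j = 0
    out: List[Interval] = []
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def subtract(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the regions covered by a but not by b."""
    a, b = merge(a), merge(b)
    out: List[Interval] = []
    j = 0
    for start, end in a:
        cur = start
        # Skip intervals of b that end before this interval of a begins.
        while j < len(b) and b[j][1] <= cur:
            j += 1
        k = j
        while k < len(b) and b[k][0] < end:
            if b[k][0] > cur:
                out.append((cur, b[k][0]))
            cur = max(cur, b[k][1])
            k += 1
        if cur < end:
            out.append((cur, end))
    return out
```

    For example, with hypothetical conserved regions `[(100, 200), (300, 400)]` and a hypothetical exon at `[(150, 320)]`, `intersect` yields the conserved exonic segments and `subtract` yields the conserved non-exonic segments.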

    Integrating and mining the chromatin landscape of cell-type specificity using self-organizing maps

    We tested whether self-organizing maps (SOMs) could be used to effectively integrate, visualize, and mine diverse genomics data types, including complex chromatin signatures. A fine-grained SOM was trained on 72 ChIP-seq histone modifications and DNase-seq data sets from six biologically diverse cell lines studied by The ENCODE Project Consortium. We mined the resulting SOM to identify chromatin signatures related to sequence-specific transcription factor occupancy, sequence motif enrichment, and biological functions. To highlight clusters enriched for specific functions such as transcriptional promoters or enhancers, we overlaid onto the map additional data sets not used during training, such as ChIP-seq, RNA-seq, CAGE, and information on cis-acting regulatory modules from the literature. We used the SOM to parse known transcriptional enhancers according to the cell-type-specific chromatin signature, and we further corroborated this pattern on the map by EP300 (also known as p300) occupancy. New candidate cell-type-specific enhancers were identified for multiple ENCODE cell types in this way, along with new candidates for ubiquitous enhancer activity. An interactive web interface was developed to allow users to visualize and custom-mine the ENCODE SOM. We conclude that large SOMs trained on chromatin data from multiple cell types provide a powerful way to identify complex relationships in genomic data at user-selected levels of granularity.
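
    For readers unfamiliar with self-organizing maps, the following toy sketch shows the generic training loop: find the best-matching unit for each input vector, then pull that unit and its grid neighbours toward the input. It is not the authors' pipeline; the grid size, the linear decay schedules, and the names `train_som` and `best_matching_unit` are assumptions made for illustration.

```python
import math
import random
from typing import Dict, List, Tuple

Unit = Tuple[int, int]  # (row, col) position on the map grid


def best_matching_unit(weights: Dict[Unit, List[float]], x: List[float]) -> Unit:
    """Return the grid unit whose weight vector is closest to x (squared Euclidean)."""
    return min(
        weights,
        key=lambda u: sum((w - xi) ** 2 for w, xi in zip(weights[u], x)),
    )


def train_som(
    data: List[List[float]],
    rows: int,
    cols: int,
    epochs: int = 20,
    lr0: float = 0.5,
    seed: int = 0,
) -> Dict[Unit, List[float]]:
    """Train a small rectangular SOM; learning rate and radius decay linearly (assumed schedule)."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac  # never reaches zero
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                # Gaussian neighbourhood on the grid, centred on the winner.
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights
```

    In a setting like the one described, each input vector would hold the signal of one genomic segment across the data sets; after training, segments that share a best-matching unit have similar signatures, and data not used in training can be overlaid by mapping it onto the same units.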

    Revealing mammalian evolutionary relationships by comparative analysis of gene clusters

    Many software tools for comparative analysis of genomic sequence data have been released in recent decades. Despite this, it remains challenging to determine evolutionary relationships in gene clusters due to their complex histories involving duplications, deletions, inversions, and conversions. One concept describing these relationships is orthology. Orthologs derive from a common ancestor by speciation, in contrast to paralogs, which derive from duplication. Discriminating orthologs from paralogs is a necessary step in most multispecies sequence analyses, but doing so accurately is impeded by the occurrence of gene conversion events. We propose a refined method of orthology assignment based on two paradigms for interpreting its definition: by genomic context or by sequence content. X-orthology (based on context) traces orthology resulting from speciation and duplication only, while N-orthology (based on content) includes the influence of conversion events.

    Defining functional DNA elements in the human genome

    With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease.

    A comprehensive and high-resolution genome-wide response of p53 to stress

    Tumor suppressor p53 regulates transcription of stress-response genes. Many p53 targets remain undiscovered because of uncertainty as to where p53 binds in the genome and the fact that few genes reside near p53-bound recognition elements (REs). Using chromatin immunoprecipitation followed by exonuclease treatment (ChIP-exo), we associated p53 with 2,183 unsplit REs. REs were positionally constrained with other REs and other regulatory elements, which may reflect structurally organized p53 interactions. Surprisingly, stress resulted in increased occupancy of transcription factor IIB (TFIIB) and RNA polymerase (Pol) II near REs, which was reduced when p53 was present. A subset associated with antisense RNA near stress-response genes. The combination of high-confidence locations for p53/REs, TFIIB/Pol II, and their changes in response to stress allowed us to identify 151 high-confidence p53-regulated genes, substantially increasing the number of p53 targets. These genes composed a large portion of a predefined DNA-damage stress-response network. Thus, p53 plays a comprehensive role in regulating the stress-response network, including regulating noncoding transcription.

    Pluripotent stem cells reveal erythroid-specific activities of the GATA1 N-terminus

    Germline GATA1 mutations that result in the production of an amino-truncated protein termed GATA1s (where s indicates short) cause congenital hypoplastic anemia. In patients with trisomy 21, similar somatic GATA1s-producing mutations promote transient myeloproliferative disease and acute megakaryoblastic leukemia. Here, we demonstrate that induced pluripotent stem cells (iPSCs) from patients with GATA1-truncating mutations exhibit impaired erythroid potential, but enhanced megakaryopoiesis and myelopoiesis, recapitulating the major phenotypes of the associated diseases. Similarly, in developmentally arrested GATA1-deficient murine megakaryocyte-erythroid progenitors derived from murine embryonic stem cells (ESCs), expression of GATA1s promoted megakaryopoiesis, but not erythropoiesis. Transcriptome analysis revealed a selective deficiency in the ability of GATA1s to activate erythroid-specific genes within populations of hematopoietic progenitors. Although its DNA-binding domain was intact, chromatin immunoprecipitation studies showed that GATA1s binding at specific erythroid regulatory regions was impaired, while binding at many nonerythroid sites, including megakaryocytic and myeloid target genes, was normal. Together, these observations indicate that lineage-specific GATA1 cofactor associations are essential for normal chromatin occupancy and provide mechanistic insights into how GATA1s mutations cause human disease. More broadly, our studies underscore the value of ESCs and iPSCs to recapitulate and study disease phenotypes. Funding: United States Department of Health & Human Services; National Institutes of Health (NIH) - USA; American Society of Hematology Scholar Award; Alex's Lemonade Stand Foundation Springboard Grant; NIH Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD); NIH National Heart Lung & Blood Institute (NHLBI); NIH National Institute of Diabetes & Digestive & Kidney Diseases (NIDDK)

    Conversion events in gene clusters

    Background: Gene clusters containing multiple similar genomic regions in close proximity are of great interest for biomedical studies because of their associations with inherited diseases. However, such regions are difficult to analyze due to their structural complexity and their complicated evolutionary histories, reflecting a variety of large-scale mutational events. In particular, conversion events can mislead inferences about the relationships among these regions, as traced by traditional methods such as construction of phylogenetic trees or multi-species alignments. Results: To correct the distorted information generated by such methods, we have developed an automated pipeline called CHAP (Cluster History Analysis Package) for detecting conversion events. We used this pipeline to analyze the conversion events that affected two well-studied gene clusters (α-globin and β-globin) and three gene clusters for which comparative sequence data were generated from seven primate species: CCL (chemokine ligand), IFN (interferon), and CYP2abf (part of cytochrome P450 family 2). CHAP is freely available at http://www.bx.psu.edu/miller_lab. Conclusions: These studies reveal the value of characterizing conversion events in the context of studying gene clusters in complex genomes.