76 research outputs found

    Complex exon-intron marking by histone modifications is not determined solely by nucleosome distribution

    It has recently been shown that nucleosome distribution, histone modifications and RNA polymerase II (Pol II) occupancy show preferential association with exons (“exon-intron marking”), linking chromatin structure and function to co-transcriptional splicing in a variety of eukaryotes. Previous ChIP-sequencing studies suggested that these marking patterns reflect the nucleosomal landscape. By analyzing ChIP-chip datasets across the human genome in three cell types, we have found that this marking system is far more complex than previously observed. We show here that a range of histone modifications and Pol II are preferentially associated with exons. However, there is noticeable cell-type specificity in the degree of exon marking by histone modifications and, surprisingly, this is also reflected in some histone modification patterns showing biases towards introns. Exon-intron marking is laid down in the absence of transcription on silent genes, with some marking biases changing or becoming reversed for genes expressed at different levels. Furthermore, the relationship of this marking system with splicing is not simple, with only some histone modifications reflecting exon usage/inclusion, while others mirror patterns of exon exclusion. By examining nucleosomal distributions in all three cell types, we demonstrate that these histone modification patterns cannot solely be accounted for by differences in nucleosome levels between exons and introns. In addition, because of inherent differences between ChIP-chip array and ChIP-sequencing approaches, these platforms report different nucleosome distribution patterns across the human genome. Our findings confound existing views and point to active cellular mechanisms which dynamically regulate histone modification levels and account for exon-intron marking. We believe that these histone modification patterns provide links between chromatin accessibility, Pol II movement and co-transcriptional splicing

    Allelic Gene Structure Variations in Anopheles gambiae Mosquitoes

    Allelic gene structure variations and alternative splicing are responsible for transcript structure variations. More than 75% of human genes have structural isoforms of transcripts, but to date few studies have been conducted to verify the alternative splicing systematically. The present study used expressed sequence tags (ESTs) and EST-tagged SNP patterns to examine the transcript structure variations resulting from allelic gene structure variations in the major human malaria vector, Anopheles gambiae. About 80% of 236,004 available A. gambiae ESTs were successfully aligned to A. gambiae reference genomes. More than 2,340 transcript structure variation events were detected. Because the current A. gambiae annotation is incomplete, we re-annotated the A. gambiae genome with an A. gambiae-specific gene model so that the effect of variations on gene coding could be better evaluated. A total of 15,962 genes were predicted. Among them, 3,873 were novel genes and 12,089 were previously identified genes. The gene completion rate improved from 60% to 84%. Based on EST support, 82.5% of gene structures were predicted correctly. In light of the new annotation, we found that approximately 78% of transcript structure variations were located within the coding sequence (CDS) regions, and >65% of variations in the CDS regions have the same open reading frame. The association between transcript structure isoforms and SNPs indicated that more than 28% of transcript structure variation events were contributed by different gene alleles in A. gambiae. We successfully expanded the A. gambiae genome annotation. We predicted and analyzed transcript structure variations in A. gambiae and found that allelic gene structure variation plays a major role in transcript diversity in this important human malaria vector

    Large-Scale Discovery and Characterization of Protein Regulatory Motifs in Eukaryotes

    The increasing ability to generate large-scale, quantitative proteomic data has brought with it the challenge of analyzing such data to discover the sequence elements that underlie systems-level protein behavior. Here we show that short, linear protein motifs can be efficiently recovered from proteome-scale datasets such as sub-cellular localization, molecular function, half-life, and protein abundance data using an information theoretic approach. Using this approach, we have identified many known protein motifs, such as phosphorylation sites and localization signals, and discovered a large number of candidate elements. We estimate that ∼80% of these are novel predictions in that they do not match a known motif in both sequence and biological context, suggesting that post-translational regulation of protein behavior is still largely unexplored. These predicted motifs, many of which display preferential association with specific biological pathways and non-random positioning in the linear protein sequence, provide focused hypotheses for experimental validation
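    The abstract does not say which information-theoretic measure was used. As a minimal sketch, assuming mutual information between motif presence and a protein-level label is the score of interest, the idea can be illustrated as follows. The function names, the plain substring match, and the toy sequences and labels are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Mutual information in bits between two discrete variables.

    `pairs` is a list of (x, y) observations, for example
    (motif_present, localization_label) for each protein.
    """
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to count * n / (px * py).
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def score_motif(proteins, motif):
    """Score a candidate motif against labels (assumed scoring scheme).

    `proteins` is a list of (sequence, label) tuples; presence is a plain
    substring match, which is a simplification for illustration only.
    """
    return mutual_information([(motif in seq, label) for seq, label in proteins])


# Hypothetical toy data: sequences and labels are made up.
toy_proteins = [
    ("MAKRKLS", "nuclear"),
    ("GGKRKAA", "nuclear"),
    ("MASTLLV", "cytoplasmic"),
    ("GGSTAAV", "cytoplasmic"),
]
informative = score_motif(toy_proteins, "KRK")   # tracks the label exactly
uninformative = score_motif(toy_proteins, "GG")  # independent of the label
```

    A motif whose presence tracks the label scores high, while one distributed independently of the label scores zero. A real analysis would also need a significance test against shuffled labels, which this sketch omits.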

    Integrated Expression Profiling and ChIP-seq Analyses of the Growth Inhibition Response Program of the Androgen Receptor

    Background: The androgen receptor (AR) plays important roles in the development of male phenotype and in different human diseases including prostate cancers. The AR can act either as a promoter or a tumor suppressor depending on cell type. The AR proliferative response program has been well studied, but its prohibitive response program has not yet been thoroughly studied. Methodology/Principal Findings: Previous studies found that PC3 cells expressing the wild-type AR inhibit growth and suppress invasion. We applied expression profiling to identify the response program of PC3 cells expressing the AR (PC3-AR) under different growth conditions (i.e. with or without androgens and at different concentrations of androgens) and then applied the newly developed ChIP-seq technology to identify the AR binding regions in the PC3 cancer genome. A surprising finding was that the comparison of MOCK-transfected PC3 cells with AR-transfected cells identified 3,452 differentially expressed genes (two-fold cutoff) even without the addition of androgens (i.e. in ethanol control), suggesting either ligand-independent activation or extremely low-level androgen activation of the AR. ChIP-Seq analysis revealed 6,629 AR binding regions in the cancer genome of PC3 cells with an FDR (false discovery rate) cutoff of 0.05. About 22.4% (638 o

    Genome-Wide Interrogation of Mammalian Stem Cell Fate Determinants by Nested Chromosome Deletions

    Understanding the function of important DNA elements in mammalian stem cell genomes would be enhanced by the availability of deletion collections in which segmental haploidies are precisely characterized. Using a modified Cre-loxP–based system, we now report the creation and characterization of a collection of ∼1,300 independent embryonic stem cell (ESC) clones enriched for nested chromosomal deletions. Mapping experiments indicate that this collection spans over 25% of the mouse genome with good representative coverage of protein-coding genes, regulatory RNAs, and other non-coding sequences. This collection of clones was screened for in vitro defects in differentiation of ESC into embryoid bodies (EB). Several putative novel haploinsufficient regions, critical for EB development, were identified. Functional characterization of one of these regions, through BAC complementation, identified the ribosomal gene Rps14 as a novel haploinsufficient determinant of embryoid body formation. This new library of chromosomal deletions in ESC (DelES: http://bioinfo.iric.ca/deles) will serve as a unique resource for elucidation of novel protein-coding and non-coding regulators of ESC activity

    Variation analysis and gene annotation of eight MHC haplotypes: The MHC Haplotype Project

    The human major histocompatibility complex (MHC) is contained within about 4 Mb on the short arm of chromosome 6 and is recognised as the most variable region in the human genome. The primary aim of the MHC Haplotype Project was to provide a comprehensively annotated reference sequence of a single, human leukocyte antigen-homozygous MHC haplotype and to use it as a basis against which variations could be assessed from seven other similarly homozygous cell lines, representative of the most common MHC haplotypes in the European population. Comparison of the haplotype sequences, including four haplotypes not previously analysed, resulted in the identification of >44,000 variations, both substitutions and indels (insertions and deletions), which have been submitted to the dbSNP database. The gene annotation uncovered haplotype-specific differences and confirmed the presence of more than 300 loci, including over 160 protein-coding genes. Combined analysis of the variation and annotation datasets revealed 122 gene loci with coding substitutions of which 97 were non-synonymous. The haplotype (A3-B7-DR15; PGF cell line) designated as the new MHC reference sequence, has been incorporated into the human genome assembly (NCBI35 and subsequent builds), and constitutes the largest single-haplotype sequence of the human genome to date. The extensive variation and annotation data derived from the analysis of seven further haplotypes have been made publicly available and provide a framework and resource for future association studies of all MHC-associated diseases and transplant medicine

    Construction and Analysis of an Integrated Regulatory Network Derived from High-Throughput Sequencing Data

    We present a network framework for analyzing multi-level regulation in higher eukaryotes based on systematic integration of various high-throughput datasets. The network, namely the integrated regulatory network, consists of three major types of regulation: TF→gene, TF→miRNA and miRNA→gene. We identified the target genes and target miRNAs for a set of TFs based on the ChIP-Seq binding profiles, and the predicted targets of miRNAs using annotated 3′UTR sequences and conservation information. Making use of the system-wide RNA-Seq profiles, we classified transcription factors into positive and negative regulators and assigned a sign for each regulatory interaction. Other types of edges such as protein-protein interactions and potential intra-regulations between miRNAs based on the embedding of miRNAs in their host genes were further incorporated. We examined the topological structures of the network, including its hierarchical organization and motif enrichment. We found that transcription factors downstream of the hierarchy distinguish themselves by being expressed more uniformly across tissues, having more interacting partners, and being more likely to be essential. We found an over-representation of notable network motifs, including a feed-forward loop (FFL) in which a miRNA cost-effectively shuts down a transcription factor and its target. We used data of C. elegans from the modENCODE project as a primary model to illustrate our framework, but further verified the results using two other data sets. As more and more genome-wide ChIP-Seq and RNA-Seq data become available in the near future, our methods of data integration have various potential applications
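    The three edge types and the feed-forward loop (FFL) motif described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the data layout, the function names, and the node names (`mir-x`, `tf-a`, `gene-b`, `mir-y`) are hypothetical, and the sketch assumes that a transcription factor targeted by a miRNA is recorded through a miRNA→gene edge.

```python
from collections import defaultdict

# Edge types named in the abstract; everything else here is hypothetical.
EDGE_TYPES = {"TF->gene", "TF->miRNA", "miRNA->gene"}


def build_network(edges):
    """Map each regulator to the set of nodes it targets."""
    targets = defaultdict(set)
    for source, target, edge_type in edges:
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        targets[source].add(target)
    return targets


def feed_forward_loops(targets):
    """Return sorted (a, b, c) triples with edges a->b, b->c and a->c."""
    loops = []
    for a, a_targets in targets.items():
        for b in a_targets:
            # .get avoids inserting new keys into the defaultdict mid-iteration.
            for c in targets.get(b, ()):
                if c in a_targets and len({a, b, c}) == 3:
                    loops.append((a, b, c))
    return sorted(loops)


# Toy network: a miRNA represses a TF and also the TF's target gene.
example_edges = [
    ("mir-x", "tf-a", "miRNA->gene"),
    ("tf-a", "gene-b", "TF->gene"),
    ("mir-x", "gene-b", "miRNA->gene"),
    ("tf-a", "mir-y", "TF->miRNA"),  # extra edge that closes no loop
]
example_loops = feed_forward_loops(build_network(example_edges))
# example_loops == [("mir-x", "tf-a", "gene-b")]
```

    Testing for over-representation would additionally require comparing the loop count against randomized networks, which is outside the scope of this sketch.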

    Common Variants at 9p21 and 8q22 Are Associated with Increased Susceptibility to Optic Nerve Degeneration in Glaucoma

    Optic nerve degeneration caused by glaucoma is a leading cause of blindness worldwide. Patients affected by the normal-pressure form of glaucoma are more likely to harbor risk alleles for glaucoma-related optic nerve disease. We have performed a meta-analysis of two independent genome-wide association studies for primary open angle glaucoma (POAG) followed by a normal-pressure glaucoma (NPG, defined by intraocular pressure (IOP) less than 22 mmHg) subgroup analysis. The single-nucleotide polymorphisms that showed the most significant associations were tested for association with a second form of glaucoma, exfoliation-syndrome glaucoma. The overall meta-analysis of the GLAUGEN and NEIGHBOR dataset results (3,146 cases and 3,487 controls) identified significant associations between two loci and POAG: the CDKN2BAS region on 9p21 (rs2157719 [G], OR = 0.69 [95% CI 0.63–0.75], p = 1.86×10⁻¹⁸), and the SIX1/SIX6 region on chromosome 14q23 (rs10483727 [A], OR = 1.32 [95% CI 1.21–1.43], p = 3.87×10⁻¹¹). In sub-group analysis two loci were significantly associated with NPG: 9p21 containing the CDKN2BAS gene (rs2157719 [G], OR = 0.58 [95% CI 0.50–0.67], p = 1.17×10⁻¹²) and a probable regulatory region on 8q22 (rs284489 [G], OR = 0.62 [95% CI 0.53–0.72], p = 8.88×10⁻¹⁰). Both NPG loci were also nominally associated with a second type of glaucoma, exfoliation syndrome glaucoma (rs2157719 [G], OR = 0.59 [95% CI 0.41–0.87], p = 0.004 and rs284489 [G], OR = 0.76 [95% CI 0.54–1.06], p = 0.021), suggesting that these loci might contribute more generally to optic nerve degeneration in glaucoma. Because both loci influence transforming growth factor beta (TGF-beta) signaling, we performed a genomic pathway analysis that showed an association between the TGF-beta pathway and NPG (permuted p = 0.009). These results suggest that neuro-protective therapies targeting TGF-beta signaling could be effective for multiple forms of glaucoma

    Multi-messenger observations of a binary neutron star merger

    On 2017 August 17 a binary neutron star coalescence candidate (later designated GW170817) with merger time 12:41:04 UTC was observed through gravitational waves by the Advanced LIGO and Advanced Virgo detectors. The Fermi Gamma-ray Burst Monitor independently detected a gamma-ray burst (GRB 170817A) with a time delay of ~1.7 s with respect to the merger time. From the gravitational-wave signal, the source was initially localized to a sky region of 31 deg² at a luminosity distance of 40 (+8/−8) Mpc and with component masses consistent with neutron stars. The component masses were later measured to be in the range 0.86 to 2.26 M☉. An extensive observing campaign was launched across the electromagnetic spectrum leading to the discovery of a bright optical transient (SSS17a, now with the IAU identification of AT 2017gfo) in NGC 4993 (at ~40 Mpc) less than 11 hours after the merger by the One-Meter, Two Hemisphere (1M2H) team using the 1 m Swope Telescope. The optical transient was independently detected by multiple teams within an hour. Subsequent observations targeted the object and its environment. Early ultraviolet observations revealed a blue transient that faded within 48 hours. Optical and infrared observations showed a redward evolution over ~10 days. Following early non-detections, X-ray and radio emission were discovered at the transient’s position ~9 and ~16 days, respectively, after the merger. Both the X-ray and radio emission likely arise from a physical process that is distinct from the one that generates the UV/optical/near-infrared emission. No ultra-high-energy gamma-rays and no neutrino candidates consistent with the source were found in follow-up searches. These observations support the hypothesis that GW170817 was produced by the merger of two neutron stars in NGC 4993 followed by a short gamma-ray burst (GRB 170817A) and a kilonova/macronova powered by the radioactive decay of r-process nuclei synthesized in the ejecta

    Localization and broadband follow-up of the gravitational-wave transient GW150914

    A gravitational-wave (GW) transient was identified in data recorded by the Advanced Laser Interferometer Gravitational-wave Observatory (LIGO) detectors on 2015 September 14. The event, initially designated G184098 and later given the name GW150914, is described in detail elsewhere. By prior arrangement, preliminary estimates of the time, significance, and sky location of the event were shared with 63 teams of observers covering radio, optical, near-infrared, X-ray, and gamma-ray wavelengths with ground- and space-based facilities. In this Letter we describe the low-latency analysis of the GW data and present the sky localization of the first observed compact binary merger. We summarize the follow-up observations reported by 25 teams via private Gamma-ray Coordinates Network circulars, giving an overview of the participating facilities, the GW sky localization coverage, the timeline, and depth of the observations. As this event turned out to be a binary black hole merger, there is little expectation of a detectable electromagnetic (EM) signature. Nevertheless, this first broadband campaign to search for a counterpart of an Advanced LIGO source represents a milestone and highlights the broad capabilities of the transient astronomy community and the observing strategies that have been developed to pursue neutron star binary merger events. Detailed investigations of the EM data and results of the EM follow-up campaign are being disseminated in papers by the individual teams