98 research outputs found

    A Downstream CpG Island Controls Transcript Initiation and Elongation and the Methylation State of the Imprinted Airn Macro ncRNA Promoter

    A CpG island (CGI) lies at the 5′ end of the Airn macro non-protein-coding (nc) RNA that represses the flanking Igf2r promoter in cis on paternally inherited chromosomes. In addition to being modified on maternally inherited chromosomes by a DNA methylation imprint, the Airn CGI shows two unusual organization features: its position immediately downstream of the Airn promoter and transcription start site and a series of tandem direct repeats (TDRs) occupying its second half. The physical separation of the Airn promoter from the CGI provides a model to investigate if the CGI plays distinct transcriptional and epigenetic roles. We used homologous recombination to generate embryonic stem cells carrying deletions at the endogenous locus of the entire CGI or just the TDRs. The deleted Airn alleles were analyzed by using an ES cell imprinting model that recapitulates the onset of Igf2r imprinted expression in embryonic development or by using knock-out mice. The results show that the CGI is required for efficient Airn initiation and to maintain the unmethylated state of the Airn promoter, which are both necessary for Igf2r repression on the paternal chromosome. The TDRs occupying the second half of the CGI play a minor role in Airn transcriptional elongation or processivity, but are essential for methylation on the maternal Airn promoter that is necessary for Igf2r to be expressed from this chromosome. Together the data indicate the existence of a class of regulatory CGIs in the mammalian genome that act downstream of the promoter and transcription start

    Comparative Analysis of DNA Replication Timing Reveals Conserved Large-Scale Chromosomal Architecture

    Recent evidence suggests that the timing of DNA replication is coordinated across megabase-scale domains in metazoan genomes, yet the importance of this aspect of genome organization is unclear. Here we show that replication timing is remarkably conserved between human and mouse, uncovering large regions that may have been governed by similar replication dynamics since these species have diverged. This conservation is both tissue-specific and independent of the genomic G+C content conservation. Moreover, we show that time of replication is globally conserved despite numerous large-scale genome rearrangements. We systematically identify rearrangement fusion points and demonstrate that replication time can be locally diverged at these loci. Conversely, rearrangements are shown to be correlated with early replication and physical chromosomal proximity. These results suggest that large chromosomal domains of coordinated replication are shuffled by evolution while conserving the large-scale nuclear architecture of the genome

    ChIP-seq analysis reveals distinct H3K27me3 profiles that correlate with transcriptional activity

    Transcriptional control is dependent on a vast network of epigenetic modifications. One epigenetic mark of particular interest is tri-methylation of lysine 27 on histone H3 (H3K27me3), which is catalysed and maintained by Polycomb Repressive Complex 2 (PRC2). Although this histone mark is studied widely, the precise relationship between its local pattern of enrichment and regulation of gene expression is currently unclear. We have used ChIP-seq to generate genome-wide maps of H3K27me3 enrichment, and have identified three enrichment profiles with distinct regulatory consequences. First, a broad domain of H3K27me3 enrichment across the body of genes corresponds to the canonical view of H3K27me3 as inhibitory to transcription. Second, a peak of enrichment around the transcription start site (TSS) is commonly associated with ‘bivalent’ genes, where H3K4me3 also marks the TSS. Finally and most surprisingly, we identified an enrichment profile with a peak in the promoter of genes that is associated with active transcription. Genes with each of these three profiles were found in different proportions in each of the cell types studied. The data analysis techniques developed here will be useful for the identification of common enrichment profiles for other histone modifications that have important consequences for transcriptional regulation

    Effects of Blood Collection Conditions on Ovarian Cancer Serum Markers

    Evaluating diagnostic and early detection biomarkers requires comparing serum protein concentrations among biosamples ascertained from subjects with and without cancer. Efforts are generally made to standardize blood processing and storage conditions for cases and controls, but blood sample collection conditions cannot be completely controlled. For example, blood samples from cases are often obtained from persons aware of their diagnoses, and collected after fasting or in surgery, whereas blood samples from some controls may be obtained in different conditions, such as a clinic visit. By measuring the effects of differences in collection conditions on three different markers, we investigated the potential of these effects to bias validation studies. We analyzed serum concentrations of three previously studied putative ovarian cancer serum biomarkers (CA 125, prolactin and MIF) in healthy women, women with ovarian cancer undergoing gynecologic surgery, women undergoing surgery for benign ovary pathology, and women undergoing surgery with pathologically normal ovaries. For women undergoing surgery, a blood sample was collected either in the clinic 1 to 39 days prior to surgery, or on the day of surgery after anesthesia was administered but prior to the surgical procedure, or both. We found that one marker, prolactin, was dramatically affected by collection conditions, while CA 125 and MIF were unaffected. Prolactin levels were not different between case and control groups after accounting for the conditions of sample collection, suggesting that sample ascertainment could explain some or all of the previously reported results about its potential as a biomarker for ovarian cancer. Biomarker validation studies should use standardized collection conditions, use multiple control groups, and/or collect samples from cases prior to influence of diagnosis whenever feasible to detect and correct for potential biases associated with sample collection

    The Characterisation of Three Types of Genes that Overlie Copy Number Variable Regions

    Background: Due to the increased accuracy of Copy Number Variable region (CNV) break point mapping, it is now possible to say with a reasonable degree of confidence whether a gene (i) falls entirely within a CNV; (ii) overlaps the CNV or (iii) actually contains the CNV. We classify these as type I, II and III CNV genes respectively. Principal Findings: Here we show that although type I genes vary in copy number along with the CNV, most of these type I genes have the same expression levels as wild type copy numbers of the gene. These genes must, therefore, be under homeostatic dosage compensation control. Looking into possible mechanisms for the regulation of gene expression we found that type I genes have a significant paucity of genes regulated by miRNAs and are not significantly enriched for monoallelically expressed genes. Type III genes, on the other hand, have a significant excess of genes regulated by miRNAs and are enriched for genes that are monoallelically expressed. Significance: Many diseases and genomic disorders are associated with CNVs so a better understanding of the different ways genes are associated with normal CNVs will help focus on candidate genes in genome wide association studies

    Mixture of latent trait analyzers for model-based clustering of categorical data

    Model-based clustering methods for continuous data are well established and commonly used in a wide range of applications. However, model-based clustering methods for categorical data are less standard. Latent class analysis is a commonly used method for model-based clustering of binary data and/or categorical data, but due to an assumed local independence structure there may not be a correspondence between the estimated latent classes and groups in the population of interest. The mixture of latent trait analyzers model extends latent class analysis by assuming a model for the categorical response variables that depends on both a categorical latent class and a continuous latent trait variable; the discrete latent class accommodates group structure and the continuous latent trait accommodates dependence within these groups. Fitting the mixture of latent trait analyzers model is potentially difficult because the likelihood function involves an integral that cannot be evaluated analytically. We develop a variational approach for fitting the mixture of latent trait models and this provides an efficient model fitting strategy. The mixture of latent trait analyzers model is demonstrated on the analysis of data from the National Long Term Care Survey (NLTCS) and voting in the U.S. Congress. The model is shown to yield intuitive clustering results and it gives a much better fit than either latent class analysis or latent trait analysis alone

    Regulatory RNAs and chromatin modification in dosage compensation: A continuous path from flies to humans?

    Chromosomal sex determination is a widely distributed strategy in nature. In the most classic scenario, one sex is characterized by a homologue pair of sex chromosomes, while the other includes two morphologically and functionally distinct gonosomes. In mammalian diploid cells, the female is characterized by the presence of two identical X chromosomes, while the male features an XY pair, with the Y bearing the major genetic determinant of sex, i.e. the SRY gene. In other species, such as the fruitfly, sex is determined by the ratio of autosomes to X chromosomes. Regardless of the exact mechanism, however, all these animals would exhibit a sex-specific gene expression inequality, due to the different number of X chromosomes, a phenomenon inhibited by a series of genetic and epigenetic regulatory events described as "dosage compensation". Since adequate available data is currently restricted to worms, flies and mammals, while for other groups of animals, such as reptiles, fish and birds it is very limited, it is not yet clear whether this is an evolutionarily conserved mechanism. However, certain striking similarities have already been observed among evolutionarily distant species, such as Drosophila melanogaster and Mus musculus. These mainly refer to a) the need for a counting mechanism, to determine the chromosomal content of the cell, i.e. the ratio of autosomes to gonosomes (a process well understood in flies, but still hypothesized in mammals), b) the implication of non-translated, sex-specific, regulatory RNAs (roX and Xist, respectively) as key elements in this process and the location of similar mediators in the Z chromosome of chicken, and c) the inclusion of a chromatin modification epigenetic final step, which ensures that gene expression remains stably regulated throughout the affected area of the gonosome. This review summarizes these points and proposes a possible role for comparative genetics, as they seem to constitute proof of maintained cell economy (by using the same basic regulatory elements in various different scenarios) throughout numerous centuries of evolutionary history

    DNaseI Hypersensitivity and Ultraconservation Reveal Novel, Interdependent Long-Range Enhancers at the Complex Pax6 Cis-Regulatory Region

    The PAX6 gene plays a crucial role in development of the eye, brain, olfactory system and endocrine pancreas. Consistent with its pleiotropic role the gene exhibits a complex developmental expression pattern which is subject to strict spatial, temporal and quantitative regulation. Control of expression depends on a large array of cis-elements residing in an extended genomic domain around the coding region of the gene. The minimal essential region required for proper regulation of this complex locus has been defined through analysis of human aniridia-associated breakpoints and YAC transgenic rescue studies of the mouse smalleye mutant. We have carried out a systematic DNase I hypersensitive site (HS) analysis across 200 kb of this critical region of mouse chromosome 2E3 to identify putative regulatory elements. Mapping the identified HSs onto a percent identity plot (PIP) shows many HSs correspond to recognisable genomic features such as evolutionarily conserved sequences, CpG islands and retrotransposon derived repeats. We then focussed on a region previously shown to contain essential long range cis-regulatory information, the Pax6 downstream regulatory region (DRR), allowing comparison of mouse HS data with previous human HS data for this region. Reporter transgenic mice for two of the HS sites, HS5 and HS6, show that they function as tissue specific regulatory elements. In addition we have characterised enhancer activity of an ultra-conserved cis-regulatory region located near Pax6, termed E60. All three cis-elements exhibit multiple spatio-temporal activities in the embryo that overlap between themselves and other elements in the locus. Using a deletion set of YAC reporter transgenic mice we demonstrate functional interdependence of the elements. Finally, we use the HS6 enhancer as a marker for the migration of precerebellar neuro-epithelium cells to the hindbrain precerebellar nuclei along the posterior and anterior extramural streams allowing visualisation of migratory defects in both pathways in Pax6(Sey/Sey) mice

    Genomic and Transcriptional Co-Localization of Protein-Coding and Long Non-Coding RNA Pairs in the Developing Brain

    Besides protein-coding mRNAs, eukaryotic transcriptomes include many long non-protein-coding RNAs (ncRNAs) of unknown function that are transcribed away from protein-coding loci. Here, we have identified 659 intergenic long ncRNAs whose genomic sequences individually exhibit evolutionary constraint, a hallmark of functionality. Of this set, those expressed in the brain are more frequently conserved and are significantly enriched with predicted RNA secondary structures. Furthermore, brain-expressed long ncRNAs are preferentially located adjacent to protein-coding genes that are (1) also expressed in the brain and (2) involved in transcriptional regulation or in nervous system development. This led us to the hypothesis that spatiotemporal co-expression of ncRNAs and nearby protein-coding genes represents a general phenomenon, a prediction that was confirmed subsequently by in situ hybridisation in developing and adult mouse brain. We provide the full set of constrained long ncRNAs as an important experimental resource and present, for the first time, substantive and predictive criteria for prioritising long ncRNA and mRNA transcript pairs when investigating their biological functions and contributions to development and disease