8 research outputs found
A haplotype information theory method reveals genes of evolutionary interest in European vs. Asian pigs
Asian and European wild boars were independently domesticated ca. 10,000 years ago. Since the 17th century, Chinese breeds have been imported to Europe to improve the genetics of European animals by introgression of favourable alleles, resulting in a complex mosaic of haplotypes. To interrogate the structure of these haplotypes further, we have run a new haplotype segregation analysis based on information theory, namely compression efficiency (CE). We applied the approach to sequence data from individuals from each phylogeographic region (n = 23 from Asia and Europe) including a number of major pig breeds. Our genome-wide CE is able to discriminate the breeds in a manner reflecting phylogeography. Furthermore, 24,956 non-overlapping sliding windows (each comprising 1,000 consecutive SNPs) were quantified for extent of haplotype sharing within and between Asia and Europe. The genome-wide distribution of extent of haplotype sharing was quite different between groups. Unlike in European pigs, haplotype sharing in Asian pigs approximates a normal distribution. In line with this, we found the European breeds possessed a number of genomic windows of dramatically higher haplotype sharing than the Asian breeds. Our CE analysis of sliding windows captures some of the genomic regions reported to contain signatures of selection in domestic pigs. Prominent among these regions, we highlight the role of a gene encoding the mitochondrial enzyme LACTB, which has been associated with obesity, and the gene encoding MYOG, a fundamental transcriptional regulator of myogenesis. The origin of these regions likely reflects either a population bottleneck in European animals, or selective targets on commercial phenotypes reducing allelic diversity in particular genes and/or regulatory regions.
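The idea of scoring haplotype sharing by compressibility can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define CE, so the sketch assumes CE is the fraction of bytes saved by a general-purpose compressor (`zlib`), and the names `compression_efficiency` and `window_ce` and the toy 0/1 haplotype strings are hypothetical.

```python
import random
import zlib


def compression_efficiency(text: str) -> float:
    """Fraction of bytes saved by zlib (an assumed stand-in for CE)."""
    raw = text.encode("ascii")
    return 1.0 - len(zlib.compress(raw, 9)) / len(raw)


def window_ce(haplotypes: list[str], window: int) -> list[float]:
    """CE of each non-overlapping window, pooling all haplotypes in it."""
    n_sites = min(len(h) for h in haplotypes)
    scores = []
    for start in range(0, n_sites - window + 1, window):
        pooled = "".join(h[start:start + window] for h in haplotypes)
        scores.append(compression_efficiency(pooled))
    return scores


rng = random.Random(0)
shared = "".join(rng.choice("01") for _ in range(1000))
# Group A: four copies of one haplotype (high sharing).
group_a = [shared] * 4
# Group B: four unrelated random haplotypes (low sharing).
group_b = ["".join(rng.choice("01") for _ in range(1000)) for _ in range(4)]

ce_a = window_ce(group_a, 1000)
ce_b = window_ce(group_b, 1000)
print("high sharing:", ce_a, "low sharing:", ce_b)
```

In this toy setting the group that shares a haplotype compresses better, so its window CE is higher; real data would need phased genotypes and a compressor choice validated against the published method.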
Identification of Lineage-Specific Cis-Regulatory Modules Associated with Variation in Transcription Factor Binding and Chromatin Activity Using Ornstein-Uhlenbeck Models
Scoring the impact of noncoding variation on the function of cis-regulatory regions, on their chromatin state, and on the qualitative and quantitative expression levels of target genes is a fundamental problem in evolutionary genomics. A particular challenge is how to model the divergence of quantitative traits and to identify relationships between the changes across the different levels of the genome, the chromatin activity landscape, and the transcriptome. Here, we examine the use of the Ornstein-Uhlenbeck (OU) model to infer selection at the level of predicted cis-regulatory modules (CRMs), and link these with changes in transcription factor binding and chromatin activity. Using publicly available cross-species ChIP-Seq and STARR-Seq data, we show how OU can be applied genome-wide to identify candidate transcription factors for which binding site and CRM turnover is correlated with changes in regulatory activity. Next, we profile open chromatin in the developing eye across three Drosophila species. We identify the recognition motifs of the chromatin remodelers Trithorax-like and Grainyhead as mostly correlating with species-specific changes in open chromatin. In conclusion, we show in this study that CRM scores can be used as quantitative traits and that motif discovery approaches can be extended towards more complex models of divergence.
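To make the OU model concrete, the following minimal sketch simulates a single trait (for instance a CRM score) evolving along one lineage under the standard OU process dX = alpha (theta - X) dt + sigma dW. It only illustrates the process itself, not the inference procedure used in the study; the function name `simulate_ou` and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def simulate_ou(x0, theta, alpha, sigma, dt, steps, seed=0):
    """Simulate an OU path using its exact one-step transition.

    theta: optimum the trait is pulled towards
    alpha: strength of the pull (selection), must be > 0
    sigma: intensity of random drift
    """
    rng = random.Random(seed)
    decay = math.exp(-alpha * dt)
    # Standard deviation of the noise accumulated over one step of length dt.
    step_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * alpha))
    path = [x0]
    for _ in range(steps):
        mean = theta + (path[-1] - theta) * decay
        path.append(mean + step_sd * rng.gauss(0.0, 1.0))
    return path


# Hypothetical values: a trait starting far from its optimum.
path = simulate_ou(x0=5.0, theta=1.0, alpha=2.0, sigma=0.5, dt=0.1, steps=200)
print(path[0], path[-1])
```

With a large `alpha` the trait stays close to `theta` and its long-run variance is sigma^2 / (2 alpha); as `alpha` approaches zero the process behaves like unconstrained Brownian motion, which is the contrast that OU-based tests of selection typically exploit.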
Micrococcal nuclease sequencing of pig sperm suggests a relationship between nucleosome retention and both semen quality and early embryo development
Abstract of the poster presented at the 38th International Conference on Animal Genetics (ISAG), held virtually from 26 to 30 July 2021. In animals, the chromatin structure of the mature spermatozoon is ultra-compacted due to the replacement of histones by protamines during spermatogenesis. However, a small fraction of nucleosomes remains bound to DNA at specific sites of the genome, and this fraction has been linked to sperm biology and embryogenesis. The genomic characterization of nucleosome occupancy in the sperm chromatin could help identify molecular markers for sperm quality and fertility traits. Nonetheless, these maps are not yet available for most livestock species, including swine. In this study, we performed micrococcal nuclease digestion followed by high-throughput sequencing on pig ejaculated spermatozoa and mapped the mono-nucleosomal and sub-nucleosomal chromatin fractions. We found 25,293 mono-nucleosomal and 4,239 sub-nucleosomal peaks covering 0.3% and 0.02% of the porcine genome, respectively. We detected positional conservation of the nucleosome-associated DNAs in sperm between human and pig. We also carried out gene ontology analysis of the genes mapping near the mono-nucleosomal peaks and searched for putative transcription factor binding motifs within those peaks, finding an enrichment for sperm function and embryo development-related processes. Remarkably, we detected enrichment for the canonical binding site of Znf263. In humans, this transcription factor has been suggested as a key regulator of the genes with paternal preferential expression during early embryo development. In addition, we observed co-occupancy of the mono-nucleosomal peaks with the RNAs present in pig sperm and with the RNAs related to sperm quality. We also found a co-location trend between GWAS hits for semen quality in swine and the mono-nucleosomal sites.
The results obtained in this study indicate that there is a relationship between nucleosome positioning in sperm and both sperm phenotypes and embryo development.
Nuclear receptors connect progenitor transcription factors to cell cycle control
The specification and growth of organs are controlled simultaneously by networks of transcription factors. While the connection between these transcription factors and fate determinants is increasingly clear, how they establish the link with the cell cycle is far less understood. Here we investigate this link in the developing Drosophila eye, where two transcription factors, the MEIS1 homologue hth and the Zn-finger tsh, synergize to stimulate the proliferation of naïve eye progenitors. Experiments combining transcriptomics, open-chromatin profiling, motif analysis and functional assays indicate that these progenitor transcription factors exert a global regulation of the proliferation program. Rather than directly regulating cell cycle genes, they control proliferation through an intermediary layer of nuclear receptors of the ecdysone/estrogen-signaling pathway. This regulatory subnetwork between hth, tsh and nuclear receptors might be conserved from Drosophila to mammals, as we find a significant co-overexpression of their human homologues in specific cancer types.
Genome-wide co-expression distributions as a metric to prioritize genes of functional importance
Genome-wide gene expression analyses are routinely used to gain a systems-level understanding of complex processes, including network connectivity. Network connectivity tends to be built on a small subset of extremely high co-expression signals that are deemed significant, but this overlooks the vast majority of pairwise signals. Here, we developed a computational pipeline to assign every gene's pairwise genome-wide co-expression distribution to one of 8 template distribution shapes varying between unimodal, bimodal, skewed, or symmetrical, representing different proportions of positive and negative correlations. We then used a hypergeometric test to determine if specific genes (regulators versus non-regulators) and properties (differentially expressed or not) are associated with a particular distribution shape. We applied our methodology to five publicly available RNA sequencing (RNA-seq) datasets from four organisms in different physiological conditions and tissues. Our results suggest that genes can be assigned consistently to pre-defined distribution shapes, regarding the enrichment of differential expression and regulatory genes, in situations involving contrasting phenotypes, time-series, or physiological baseline data. There is indeed a striking additional biological signal present in the genome-wide distribution of co-expression values which would be overlooked by currently adopted approaches. Our method can be applied to extract further information from transcriptomic data and help uncover the molecular mechanisms involved in the regulation of complex biological processes and phenotypes.
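The hypergeometric enrichment step can be sketched as follows. This is a generic one-sided test written with the standard library only, not the authors' pipeline; the function name `enrichment_pvalue` and the gene counts in the example are hypothetical.

```python
from math import comb


def enrichment_pvalue(k: int, population: int, successes: int, draws: int) -> float:
    """P(X >= k) for a hypergeometric X.

    population: all genes considered
    successes:  genes carrying the property (e.g. regulators)
    draws:      genes assigned to one distribution shape
    k:          genes in that shape that carry the property
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 genes, 1,500 regulators, and a shape holding
# 400 genes of which 60 are regulators (about 30 expected by chance).
p = enrichment_pvalue(k=60, population=20_000, successes=1_500, draws=400)
print(f"enrichment p-value: {p:.3g}")
```

Running the test once per shape and per property would call for a multiple-testing correction, which this sketch leaves out.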
Association analysis of loci implied in "buffering" epistasis
The existence of buffering mechanisms is an emerging property of biological networks, and this results in the buildup of robustness through evolution. So far, there are no explicit methods to find loci implied in buffering mechanisms. However, buffering can be seen as interaction with genetic background. Here we develop this idea into a tractable model for quantitative genetics, in which the buffering effect of one locus with many other loci is condensed into a single statistical effect, multiplicative on the total additive genetic effect. This allows easier interpretation of the results and simplifies the problem of detecting epistasis from quadratic to linear in the number of loci. Using this formulation, we construct a linear model for genome-wide association studies that estimates and declares the significance of multiplicative epistatic effects at single loci. The model has the form of a variance components, norm reaction model, and likelihood ratio tests are used to assess significance. This model is a generalization and explanation of previous ones. We test our model using bovine data: Brahman and Tropical Composite animals, phenotyped for yearling body weight and genotyped at high density. After association analysis, we find a number of loci with buffering action in one, the other, or both breeds; these loci do not have a significant statistical additive effect. Most of these loci have been reported in previous studies, either with an additive effect or as footprints of selection. We identify buffering epistatic SNPs present in or near genes reported in the context of signatures of selection in multi-breed cattle population studies. Prominent among these genes are those associated with fertility (INHBA, TSHR, ESRRG, PRLR, and PPARG), growth (MSTN, GHR), coat characteristics (KIT, MITF, PRLR), and heat resistance (HSPA6 and HSPA1A). In these populations, we found loci that have a nonsignificant statistical additive effect but a significant epistatic effect.
We argue that the discovery and study of loci associated with buffering effects allow us to attack difficult problems, among them the release of maintenance variance in artificial and natural selection, quick adaptation to the environment, and opposite signs of marker effects in different backgrounds. We conclude that our method and our results generate promising new perspectives for research in evolutionary and quantitative genetics based on the study of loci that buffer the effects of other loci.
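One way to write down a single-locus effect that is multiplicative on the total additive genetic effect is sketched below. The abstract does not give the model equation, so this parameterization and every symbol in it are assumptions made for illustration only.

```latex
% Assumed notation (not taken from the source):
%   y_i    phenotype of animal i
%   \mu    overall mean
%   u_i    total additive genetic value of animal i
%   z_{ij} genotype code of animal i at locus j
%   b_j    multiplicative (buffering) effect of locus j
%   e_i    residual
y_i = \mu + u_i \,\bigl(1 + z_{ij}\, b_j\bigr) + e_i
```

Under this form, a test of b_j = 0 is run once per locus, so m loci require m tests instead of the m(m-1)/2 tests needed to scan all pairwise interactions, which is the quadratic-to-linear reduction the abstract describes.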