51 research outputs found
A study of the distribution of phylogenetically conserved blocks within clusters of mammalian homeobox genes
Genome sequencing efforts of the last decade have produced a large amount of data, enabling whole-genome comparative analyses that locate potentially functional elements and reveal the overall patterns of phylogenetic conservation. In this paper we present a statistically based method for the characterization of these patterns in mammalian DNA sequences. We have applied this approach to the study of exceptionally well conserved homeobox gene clusters (Hox), based on an alignment of six species, and we have constructed a map of Hox cataloguing the conserved fragments, along with their locations in relation to the genes and other landmarks, sometimes showing unexpected layouts.
Association between the ACCN1 Gene and Multiple Sclerosis in Central East Sardinia
Multiple genome screens have been performed to identify regions in linkage or association with Multiple Sclerosis (MS, OMIM 126200), but little overlap has been found among them. This may be, in part, due to a low statistical power to detect small genetic effects and to genetic heterogeneity within and among the studied populations. Motivated by these considerations, we studied a very special population, namely that of Nuoro, Sardinia, Italy. This is an isolated, old, and genetically homogeneous population with high prevalence of MS. Our study sample includes both nuclear families and unrelated cases and controls. A multi-stage study design was adopted. In the first stage, microsatellites were typed in the 17q11.2 region, previously independently found to be in linkage with MS. One significant association was found at microsatellite D17S798. Next, a bioinformatic screening of the region surrounding this marker highlighted an interesting candidate MS susceptibility gene: the Amiloride-sensitive Cation Channel Neuronal 1 (ACCN1) gene. In the second stage of the study, we resequenced the exons and the 3′ untranslated (UTR) region of ACCN1, and investigated the MS association of Single Nucleotide Polymorphisms (SNPs) identified in that region. For this purpose, we developed a method of analysis where complete, phase-solved, posterior-weighted haplotype assignments are imputed for each study individual from incomplete, multi-locus, genotyping data. The imputed assignments provide an input to a number of proposed procedures for testing association at a microsatellite level or of a sequence of SNPs. These include a Mantel-Haenszel type test based on expected frequencies of pseudocase/pseudocontrol haplotypes, as well as permutation based tests, including a combination of permutation and weighted logistic regression analysis. 
Application of these methods allowed us to find a significant association between MS and the SNP rs28936 located in the 3′ UTR segment of ACCN1 with p = 0.0004 (p = 0.002, after adjusting for multiple testing). This result is consistent with several recent experimental findings which suggest that ACCN1 may play an important role in the pathogenesis of MS.
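The permutation-based tests mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' haplotype-weighted procedure: it assumes a simple case/control comparison of a 0/1 allele indicator, and the function name `permutation_test` and its parameters are hypothetical.

```python
import random


def permutation_test(cases, controls, n_perm=10000, seed=0):
    """Two-sided permutation p-value for a difference in mean allele carriage.

    `cases` and `controls` are sequences of 0/1 indicators (hypothetical
    encoding: 1 = carries the allele). This is a generic label-shuffling
    test, not the posterior-weighted method described in the abstract.
    """
    rng = random.Random(seed)
    n_cases = len(cases)
    pooled = list(cases) + list(controls)

    def stat(values):
        # Absolute difference between the two group means.
        a, b = values[:n_cases], values[n_cases:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign case/control labels at random
        if stat(pooled) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Calling `permutation_test(case_alleles, control_alleles)` returns an estimated p-value; a correction for multiple testing, as reported in the abstract, would be applied separately.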
De Novo Unbalanced Translocations in Prader-Willi and Angelman Syndrome Might Be the Reciprocal Product of inv dup(15)s
The 15q11-q13 region is characterized by high instability, caused by the presence of several paralogous segmental duplications. Although most mechanisms dealing with cryptic deletions and amplifications have been at least partly characterized, little is known about the rare translocations involving this region. We characterized at the molecular level five unbalanced translocations, including a jumping one, having most of 15q transposed to the end of another chromosome, whereas the der(15)(pter->q11-q13) was missing. Imbalances were associated either with Prader-Willi or Angelman syndrome. Array-CGH demonstrated the absence of any copy number changes in the recipient chromosome in three cases, while one carried a cryptic terminal deletion and another a large terminal deletion, already diagnosed by classical cytogenetics. We cloned the breakpoint junctions in two cases, whereas cloning was impaired by complex regional genomic architecture and mosaicism in the others. Our results strongly indicate that some of our translocations originated through a prezygotic/postzygotic two-hit mechanism starting with the formation of an acentric 15qter->q1::q1->qter representing the reciprocal product of the inv dup(15) supernumerary marker chromosome. An embryo with such an acentric chromosome plus a normal chromosome 15 inherited from the other parent could survive only if partial trisomy 15 rescue would occur through elimination of part of the acentric chromosome, stabilization of the remaining portion with telomere capture, and formation of a derivative chromosome. All these events likely do not happen concurrently in a single cell but are rather the result of successive stabilization attempts occurring in different cells of which only the fittest will finally survive. Accordingly, jumping translocations might represent successful rescue attempts in different cells rather than transfer of the same 15q portion to different chromosomes. 
We also hypothesize that neocentromerization of the original acentric chromosome during early embryogenesis may be required to avoid its loss before cell survival is finally assured.
Functional impact and evolution of a novel human polymorphic inversion that disrupts a gene and creates a fusion transcript
Since the discovery of chromosomal inversions almost 100 years ago, how they are maintained in natural populations has been a highly debated issue. One of the hypotheses is that inversion breakpoints could affect genes and modify gene expression levels, although evidence of this came only from laboratory mutants. In humans, a few inversions have been shown to associate with expression differences, but in all cases the molecular causes have remained elusive. Here, we have carried out a complete characterization of a new human polymorphic inversion and determined that it is specific to East Asian populations. In addition, we demonstrate that it disrupts the ZNF257 gene and, through the translocation of the first exon and regulatory sequences, creates a previously nonexistent fusion transcript, which together are associated with expression changes in several other genes. Finally, we investigate the potential evolutionary and phenotypic consequences of the inversion, and suggest that it is probably deleterious. This is therefore the first example of a natural polymorphic inversion that has position effects and creates a new chimeric gene, helping to answer an old question in evolutionary biology.
Organizational Heterogeneity of Vertebrate Genomes
Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as “texts” using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis makes it possible to generate species- or region-specific characteristics of genomic sequences, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions (coding, repetitive, and the remaining part of the noncoding DNA, the genome dark matter or GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
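The idea of scoring fixed-length oligonucleotide occurrences, as described in the abstract above, can be sketched as follows. This sketch assumes exact k-mer matching over the alphabet ACGT and simple relative frequencies; the scoring used in the actual CS analysis may differ, and the function name `compositional_spectrum` is hypothetical.

```python
from collections import Counter
from itertools import product


def compositional_spectrum(seq, k=3):
    """Relative frequency of every possible k-mer over ACGT in `seq`.

    Windows containing other symbols (e.g. N) are ignored. Exact matching
    is an assumption of this sketch, not a statement about the CS method.
    """
    seq = seq.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}
```

Spectra computed this way for two regions could then be compared with any vector distance to give a rough measure of how differently the regions are organized.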
GWAS meta-analysis reveals novel loci and genetic correlates for general cognitive function : a report from the COGENT consortium
Corrigendum: Molecular Psychiatry (2017) 22, 1651–1652, http://www.nature.com/articles/mp2017197.pdf
The complex nature of human cognition has resulted in cognitive genomics lagging behind many other fields in terms of gene discovery using genome-wide association study (GWAS) methods. In an attempt to overcome these barriers, the current study utilized GWAS meta-analysis to examine the association of common genetic variation (~8M single-nucleotide polymorphisms (SNPs) with minor allele frequency ≥ 1%) with general cognitive function in a sample of 35 298 healthy individuals of European ancestry across 24 cohorts in the Cognitive Genomics Consortium (COGENT). In addition, we utilized individual SNP lookups and polygenic score analyses to identify genetic overlap with other relevant neurobehavioral phenotypes. Our primary GWAS meta-analysis identified two novel SNP loci (top SNPs: rs76114856 in the CENPO gene on chromosome 2 and rs6669072 near LOC105378853 on chromosome 1) associated with cognitive performance at the genome-wide significance level (P…
Alternate-locus aware variant calling in whole genome sequencing
BACKGROUND: The last two human genome assemblies have extended the previous linear golden-path paradigm of the human genome to a graph-like model to better represent regions with a high degree of structural variability. The new model offers opportunities to improve the technical validity of variant calling in whole-genome sequencing (WGS). METHODS: We developed an algorithm that analyzes the patterns of variant calls in the 178 structurally variable regions of the GRCh38 genome assembly, and infers whether a given sample is most likely to contain sequences from the primary assembly, an alternate locus, or their heterozygous combination at each of these 178 regions. We investigate 121 in-house WGS datasets that have been aligned to the GRCh37 and GRCh38 assemblies. RESULTS: We show that stretches of sequences that are largely but not entirely identical between the primary assembly and an alternate locus can result in multiple variant calls against regions of the primary assembly. In WGS analysis, this results in characteristic and recognizable patterns of variant calls at positions that we term alignable scaffold-discrepant positions (ASDPs). In 121 in-house genomes, on average 51.8±3.8 of the 178 regions were found to correspond best to an alternate locus rather than the primary assembly sequence, and filtering these genomes with our algorithm led to the identification of 7863 variant calls per genome that colocalized with ASDPs. Additionally, we found that 437 of 791 genome-wide association study hits located within one of the regions corresponded to ASDPs. CONCLUSIONS: Our algorithm uses the information contained in the 178 structurally variable regions of the GRCh38 genome assembly to avoid spurious variant calls in cases where samples contain an alternate locus rather than the corresponding segment of the primary assembly. 
These results suggest the great potential of fully incorporating the resources of graph-like genome assemblies into variant calling, but also underscore the importance of developing computational resources that will allow a full reconstruction of the genotype in personal genomes. Our algorithm is freely available at https://github.com/charite/asdpex. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13073-016-0383-z) contains supplementary material, which is available to authorized users.