23 research outputs found
Evidence that APP gene copy number changes reflect recombinant vector contamination [preprint]
Mutations that occur in cells of the body, called somatic mutations, cause human diseases including cancer and some neurological disorders1. In a recent study published in Nature, Lee et al.2 (hereafter “the Lee study”) reported somatic copy number gains of the APP gene, a known risk locus of Alzheimer’s disease (AD), in the neurons of AD patients and controls (69% vs 25% of neurons with at least one APP copy gain on average). The authors argue that the mechanism of these copy number gains was somatic integration of APP mRNA into the genome, creating what they called genomic cDNA (gencDNA). We reanalyzed the data from the Lee study, revealing evidence that APP gencDNA originates mainly from contamination by exogenous APP recombinant vectors, rather than from true somatic retrotransposition of endogenous APP. Our reanalysis of two recent whole exome sequencing (WES) datasets (one by the authors of the Lee study3 and the other by Park et al.4) revealed that reads claimed to support APP gencDNA in AD samples resulted from contamination by PCR products and mRNA, respectively. Lastly, we present our own single-cell whole genome sequencing (scWGS) data that show no evidence for somatic APP retrotransposition in AD neurons or in neurons from normal individuals of various ages.
Quantitative learning strategies based on word networks
Learning English requires considerable effort, but the way vocabulary is introduced in textbooks is not optimized for learning efficiency. With the growing population of English learners, optimizing the learning process could significantly improve English learning and teaching. Recent developments in big data analysis and complex network science provide additional opportunities to design and investigate English learning strategies. In this paper, quantitative English learning strategies based on a word network and word usage information are proposed. The strategies integrate word frequency with topological structural information. The influence of connected learned words on the learning weights of unlearned words, and the dynamic updating of the network, are analyzed. The results suggest that quantitative strategies significantly improve learning efficiency while maintaining effectiveness. In particular, the optimized-weight-first strategy and the segmented strategies outperform other strategies. The results provide opportunities for researchers and practitioners to reconsider how English is taught and to design vocabularies quantitatively, balancing efficiency and learning costs based on the word network.
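The abstract above does not give its weighting formula, so the following is only a minimal sketch of what a weight-first ordering on a word network might look like. The function name `learning_order`, the parameter `alpha`, and the weight rule (frequency boosted by the fraction of already-learned neighbours) are all illustrative assumptions, not the paper's method.

```python
from typing import Dict, List, Set


def learning_order(
    freq: Dict[str, float],
    edges: Dict[str, Set[str]],
    alpha: float = 1.0,
) -> List[str]:
    """Greedy weight-first ordering of words (hypothetical rule, not the paper's).

    freq  : word -> usage frequency
    edges : word -> set of neighbouring words in the word network
    alpha : assumed strength of the boost from already-learned neighbours
    """
    learned: Set[str] = set()
    order: List[str] = []
    remaining = set(freq)

    def weight(word: str) -> float:
        # Assumed weight: frequency scaled up by the share of neighbours already learned.
        nbrs = edges.get(word, set())
        learned_frac = len(nbrs & learned) / len(nbrs) if nbrs else 0.0
        return freq[word] * (1.0 + alpha * learned_frac)

    while remaining:
        # Weights are recomputed every round because the learned set changes.
        # Iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(remaining), key=weight)
        order.append(best)
        learned.add(best)
        remaining.remove(best)
    return order
```

With `alpha=0` this reduces to a plain frequency-first ordering, which gives a simple baseline for comparing against the network-aware variant.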
Schizophrenia-associated somatic copy-number variants from 12,834 cases reveal recurrent NRXN1 and ABCB11 disruptions
While germline copy-number variants (CNVs) contribute to schizophrenia (SCZ) risk, the contribution of somatic CNVs (sCNVs)—present in some but not all cells—remains unknown. We identified sCNVs using blood-derived genotype arrays from 12,834 SCZ cases and 11,648 controls, filtering sCNVs at loci recurrently mutated in clonal blood disorders. Likely early-developmental sCNVs were more common in cases (0.91%) than controls (0.51%, p = 2.68e−4), with recurrent somatic deletions of exons 1–5 of the NRXN1 gene in five SCZ cases. Hi-C maps revealed ectopic, allele-specific loops forming between a potential cryptic promoter and non-coding cis-regulatory elements upon 5′ deletions in NRXN1. We also observed recurrent intragenic deletions of ABCB11, encoding a transporter implicated in anti-psychotic response, in five treatment-resistant SCZ cases and showed that ABCB11 is specifically enriched in neurons forming mesocortical and mesolimbic dopaminergic projections. Our results indicate potential roles of sCNVs in SCZ risk
MosaicHunter: accurate detection of postzygotic single-nucleotide mosaicism through next-generation sequencing of unpaired, trio, and paired samples
Distinctive types of postzygotic single-nucleotide mosaicisms in healthy individuals revealed by genome-wide profiling of multiple organs
Postzygotic single-nucleotide mosaicisms (pSNMs) have been extensively studied in tumors and are known to play critical roles in tumorigenesis. However, the patterns and origin of pSNMs in normal organs of healthy humans remain largely unknown. Using whole-genome sequencing and ultra-deep amplicon re-sequencing, we identified and validated 164 pSNMs from 27 postmortem organ samples obtained from five healthy donors. The mutant allele fractions ranged from 1.0% to 29.7%. Inter- and intra-organ comparison revealed two distinctive types of pSNMs, with about half originating during early embryogenesis (embryonic pSNMs) and the remaining more likely to result from clonal expansion events that had occurred more recently (clonal expansion pSNMs). Compared to clonal expansion pSNMs, embryonic pSNMs had a higher proportion of C>T mutations with an elevated mutation rate at CpG sites. We observed differences in replication timing between these two types of pSNMs, with embryonic and clonal expansion pSNMs enriched in early- and late-replicating regions, respectively. An increased number of embryonic pSNMs were located in open chromatin states and topologically associating domains that are transcribed embryonically. Our findings provide new insights into the origin and spatial distribution of postzygotic mosaicism during normal human development.
Two types of pSNMs revealed by inter- and intra-organ profiles.
(A) Minor allele fractions between different categories of pSNM. The organ-shared pSNMs demonstrated significantly higher allele fractions than the organ-unique pSNMs. (B) Number of pSNMs carried in different organ samples. Red and blue bars denote the organ-shared and organ-unique pSNMs, respectively. An excess of organ-unique pSNMs was observed in the breast sample of BBL11121 and the liver sample of BBLD1005. (C-D) Heatmap of minor allele fractions for pSNMs carried in multiple organ samples of BBLD1005 (C) and BBL11121 (D). Blood samples of two unrelated individuals (ACC1 and ACC4) served as negative controls. The color intensity of each tile represents allele fractions estimated by targeted ultra-deep resequencing. Gray tiles denote sites without sufficient read depth (< 30X). Red and blue bars denote the organ-shared and organ-unique pSNMs, respectively. The majority of organ-unique pSNMs were locally restricted to one or a few physically adjacent organ samples (<1 cm). (E-F) Relative intra-organ similarity of pSNM profiles in BBLD1005 (E) and BBL11121 (F). The originally sequenced samples (liver #9 and breast #9) shared the largest similarity to their physically closest samples.
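The quantities in this caption can be illustrated with a minimal sketch. The 30X depth cutoff comes from the caption; the function names, the `min_af` presence threshold, and the rule "carried in two or more organs means organ-shared" are assumptions made for illustration and are not taken from the study's pipeline.

```python
from typing import Dict, Optional, Tuple

MIN_DEPTH = 30  # sites below this read depth are treated as uninformative (gray tiles)


def allele_fraction(ref_reads: int, alt_reads: int) -> Optional[float]:
    """Return the alternate-allele fraction, or None when depth is insufficient."""
    depth = ref_reads + alt_reads
    if depth < MIN_DEPTH:
        return None
    return alt_reads / depth


def classify_site(
    counts_by_organ: Dict[str, Tuple[int, int]],
    min_af: float = 0.01,  # assumed presence threshold, not from the source
) -> str:
    """Label a site by how many organs carry the variant (hypothetical rule)."""
    carriers = []
    for organ, (ref_reads, alt_reads) in counts_by_organ.items():
        af = allele_fraction(ref_reads, alt_reads)
        # Organs with insufficient depth are skipped rather than counted as absent.
        if af is not None and af >= min_af:
            carriers.append(organ)
    if len(carriers) >= 2:
        return "organ-shared"
    if len(carriers) == 1:
        return "organ-unique"
    return "absent"
```

For example, `classify_site({"liver": (900, 100), "breast": (1000, 0)})` labels the site organ-unique, since only the liver sample carries alternate reads at sufficient depth.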
Donor and sequencing information regarding the 27 post-mortem organ samples profiled in this study.
Identification and validation of pSNMs in 27 organ samples obtained from five individuals.
(A) Genomic landscape of the validated pSNMs. Circos plots from outer to inner represent individuals BBL1100C, BBL11121, BBLC1013, BBLD1005, and BBLD1010, respectively. The Y axis denotes the average minor allele fraction across pSNM-carrying organs of each donor. (B) Correlation of the allele fractions estimated by whole-genome sequencing and targeted ultra-deep resequencing (PASM) of the validated sites. The shape, color, and size of the dots represent the donor and organ type carrying the pSNMs and the site-specific depth of whole-genome sequencing.
MosaicBase: A Knowledgebase of Postzygotic Mosaic Variants in Noncancer Disease-related and Healthy Human Individuals
Mosaic variants resulting from postzygotic mutations are prevalent in the human genome and play important roles in human diseases. However, except for cancer-related variants, there is no collection of postzygotic mosaic variants in noncancer disease-related and healthy individuals. Here, we present MosaicBase, a comprehensive database that includes 6698 mosaic variants related to 266 noncancer diseases and 27,991 mosaic variants identified in 422 healthy individuals. Genomic and phenotypic information of each variant was manually extracted and curated from 383 publications. MosaicBase supports the query of variants with Online Mendelian Inheritance in Man (OMIM) entries, genomic coordinates, gene symbols, or Entrez IDs. We also provide an integrated genome browser for users to easily access mosaic variants and their related annotations for any genomic region. By analyzing the variants collected in MosaicBase, we find that mosaic variants that directly contribute to disease phenotype show features distinct from those of variants in individuals with mild or no phenotypes, in terms of their genomic distribution, mutation signatures, and fraction of mutant cells. MosaicBase will not only assist clinicians in genetic counseling and diagnosis but also provide a useful resource to understand the genomic baseline of postzygotic mutations in the general human population. MosaicBase is publicly available at http://mosaicbase.com/ or http://49.4.21.8:8000
Distinct genomic characteristics between embryonic and clonal expansion pSNMs.
(A-C) Mutation spectra in NpG and non-NpG sites for embryonic pSNMs (A), clonal expansion pSNMs in BBLD1005’s liver (B), and clonal expansion pSNMs in BBL11121’s breast (C). Mutation rate was normalized by the total number of sites in the human genome. For embryonic pSNMs, CpG sites showed a significantly higher rate of C>T mutations than non-CpG sites. (D) Varied DNA replication timing of the two pSNM types. The grey line denotes the genomic average. Embryonic pSNMs were enriched in early-replicating regions, whereas clonal expansion pSNMs were enriched in late-replicating regions. (E-G) Proportion of pSNMs located in open or closed chromatin regions in the HepG2 (E), HMEC (F), and K562 (G) cell lines. Significantly higher proportions of embryonic pSNMs were observed in transcribed chromatin regions of all three cell types.