169 research outputs found

    Reducing INDEL calling errors in whole genome and exome sequencing data

    BACKGROUND: INDELs, especially those disrupting protein-coding regions of the genome, have been strongly associated with human diseases. However, there are still many errors with INDEL variant calling, driven by library preparation, sequencing biases, and algorithm artifacts. METHODS: We characterized whole genome sequencing (WGS), whole exome sequencing (WES), and PCR-free sequencing data from the same samples to investigate the sources of INDEL errors. We also developed a classification scheme based on the coverage and composition to rank high and low quality INDEL calls. We performed a large-scale validation experiment on 600 loci, and found high-quality INDELs to have a substantially lower error rate than low-quality INDELs (7% vs. 51%). RESULTS: Simulation and experimental data show that assembly based callers are significantly more sensitive and robust for detecting large INDELs (>5 bp) than alignment based callers, consistent with published data. The concordance of INDEL detection between WGS and WES is low (53%), and WGS data uniquely identifies 10.8-fold more high-quality INDELs. The validation rate for WGS-specific INDELs is also much higher than that for WES-specific INDELs (84% vs. 57%), and WES misses many large INDELs. In addition, the concordance for INDEL detection between standard WGS and PCR-free sequencing is 71%, and standard WGS data uniquely identifies 6.3-fold more low-quality INDELs. Furthermore, accurate detection with Scalpel of heterozygous INDELs requires 1.2-fold higher coverage than that for homozygous INDELs. Lastly, homopolymer A/T INDELs are a major source of low-quality INDEL calls, and they are highly enriched in the WES data. CONCLUSIONS: Overall, we show that accuracy of INDEL detection with WGS is much greater than WES even in the targeted region. We calculated that 60X WGS depth of coverage from the HiSeq platform is needed to recover 95% of INDELs detected by Scalpel. While this is higher than current sequencing practice, the deeper coverage may save total project costs because of the greater accuracy and sensitivity. Finally, we investigate sources of INDEL errors (for example, capture deficiency, PCR amplification, homopolymers) with various data that will serve as a guideline to effectively reduce INDEL errors in genome sequencing

    Lifetime Prevalence, Age of Risk, and Etiology of Comorbid Psychiatric Disorders in Tourette Syndrome

    IMPORTANCE: Tourette syndrome (TS) is characterized by high rates of psychiatric comorbidity; however, few studies have fully characterized these comorbidities. Furthermore, most studies have included relatively few participants (<200), and none has examined the ages of highest risk for each TS-associated comorbidity or their etiologic relationship to TS. OBJECTIVE: To characterize the lifetime prevalence, clinical associations, ages of highest risk, and etiology of psychiatric comorbidity among individuals with TS. DESIGN, SETTING, AND PARTICIPANTS: Cross-sectional structured diagnostic interviews conducted between April 1, 1992, and December 31, 2008, of participants with TS (n = 1374) and TS-unaffected family members (n = 1142). MAIN OUTCOMES AND MEASURES: Lifetime prevalence of comorbid DSM-IV-TR disorders, their heritabilities, ages of maximal risk, and associations with symptom severity, age at onset, and parental psychiatric history. RESULTS: The lifetime prevalence of any psychiatric comorbidity among individuals with TS was 85.7%; 57.7% of the population had 2 or more psychiatric disorders. The mean (SD) number of lifetime comorbid diagnoses was 2.1 (1.6); the mean number was 0.9 (1.3) when obsessive-compulsive disorder (OCD) and attention-deficit/hyperactivity disorder (ADHD) were excluded, and 72.1% of the individuals met the criteria for OCD or ADHD. Other disorders, including mood, anxiety, and disruptive behavior, each occurred in approximately 30% of the participants. The age of greatest risk for the onset of most comorbid psychiatric disorders was between 4 and 10 years, with the exception of eating and substance use disorders, which began in adolescence (interquartile range, 15–19 years for both). Tourette syndrome was associated with increased risk of anxiety (odds ratio [OR], 1.4; 95% CI, 1.0–1.9; P = .04) and decreased risk of substance use disorders (OR, 0.6; 95% CI, 0.3–0.9; P = .02) independent from comorbid OCD and ADHD; however, high rates of mood disorders among participants with TS (29.8%) may be accounted for by comorbid OCD (OR, 3.7; 95% CI, 2.9–4.8; P < .001). Parental history of ADHD was associated with a higher burden of non-OCD, non-ADHD comorbid psychiatric disorders (OR, 1.86; 95% CI, 1.32–2.61; P < .001). Genetic correlations between TS and mood (RhoG, 0.47), anxiety (RhoG, 0.35), and disruptive behavior disorders (RhoG, 0.48) may be accounted for by ADHD and, for mood disorders, by OCD. CONCLUSIONS AND RELEVANCE: This study is, to our knowledge, the most comprehensive of its kind. It confirms the belief that psychiatric comorbidities are common among individuals with TS, demonstrates that most comorbidities begin early in life, and indicates that certain comorbidities may be mediated by the presence of comorbid OCD or ADHD. In addition, genetic analyses suggest that some comorbidities may be more biologically related to OCD and/or ADHD rather than to TS

    Tapping the nucleotide pool of the host: novel nucleotide carrier proteins of Protochlamydia amoebophila

    Protochlamydia amoebophila UWE25 is related to the Chlamydiaceae comprising major pathogens of humans, but thrives as an obligate intracellular symbiont in the protozoan host Acanthamoeba sp. The genome of P. amoebophila encodes five paralogous carrier proteins belonging to the nucleotide transporter (NTT) family. Here we report on three P. amoebophila NTT isoforms, PamNTT2, PamNTT3 and PamNTT5, which possess several conserved amino acid residues known to be critical for nucleotide transport. We demonstrated that these carrier proteins are able to transport nucleotides, although substrate specificities and mode of transport differ in an unexpected manner and are unique among known NTTs. PamNTT2 is a counter exchange transporter exhibiting submillimolar apparent affinities for all four RNA nucleotides, PamNTT3 catalyses a unidirectional proton-coupled transport confined to UTP, whereas PamNTT5 mediates a proton-energized GTP and ATP import. All NTT genes of P. amoebophila are transcribed during intracellular multiplication in acanthamoebae. The biochemical characterization of all five NTT proteins from P. amoebophila in this and previous studies uncovered that these metabolically impaired bacteria are intimately connected with their host cell’s metabolism in a surprisingly complex manner

    Rare Copy Number Variants in NRXN1 and CNTN6 Increase Risk for Tourette Syndrome

    Tourette syndrome (TS) is a model neuropsychiatric disorder thought to arise from abnormal development and/or maintenance of cortico-striato-thalamo-cortical circuits. TS is highly heritable, but its underlying genetic causes are still elusive, and no genome-wide significant loci have been discovered to date. We analyzed a European ancestry sample of 2,434 TS cases and 4,093 ancestry-matched controls for rare (< 1% frequency) copy-number variants (CNVs) using SNP microarray data. We observed an enrichment of global CNV burden that was prominent for large (> 1 Mb), singleton events (OR = 2.28, 95% CI [1.39–3.79], p = 1.2 × 10⁻³) and known, pathogenic CNVs (OR = 3.03 [1.85–5.07], p = 1.5 × 10⁻⁵). We also identified two individual, genome-wide significant loci, each conferring a substantial increase in TS risk (NRXN1 deletions, OR = 20.3, 95% CI [2.6–156.2]; CNTN6 duplications, OR = 10.1, 95% CI [2.3–45.4]). Approximately 1% of TS cases carry one of these CNVs, indicating that rare structural variation contributes significantly to the genetic architecture of TS

    Is Persistent Motor or Vocal Tic Disorder a Milder Form of Tourette Syndrome?

    BACKGROUND: Persistent motor or vocal tic disorder (PMVT) has been hypothesized to be a forme fruste of Tourette syndrome (TS). Although the primary diagnostic criterion for PMVT (presence of motor or vocal tics, but not both) is clear, less is known about its clinical presentation. OBJECTIVE: The goals of this study were to compare the prevalence and number of comorbid psychiatric disorders, tic severity, age at tic onset, and family history for TS and PMVT. METHODS: We analyzed data from two independent cohorts using generalized linear equations and confirmed our findings using meta‐analyses, incorporating data from previously published literature. RESULTS: Rates of obsessive–compulsive disorder (OCD) and attention deficit hyperactivity disorder (ADHD) were lower in PMVT than in TS in all analyses. Other psychiatric comorbidities occurred with similar frequencies in PMVT and TS in both cohorts, although meta‐analyses suggested lower rates of most psychiatric disorders in PMVT compared with TS. ADHD and OCD increased the odds of comorbid mood, anxiety, substance use, and disruptive behaviors, and accounted for observed differences between PMVT and TS. Age of tic onset was approximately 2 years later, and tic severity was lower in PMVT than in TS. First‐degree relatives had elevated rates of TS, PMVT, OCD, and ADHD compared with population prevalences, with rates of TS equal to or greater than PMVT rates. CONCLUSIONS: Our findings support the hypothesis that PMVT and TS occur along a clinical spectrum in which TS is a more severe and PMVT a less severe manifestation of a continuous neurodevelopmental tic spectrum disorder. © 2021 The Authors. Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Societ

    Synaptic processes and immune-related pathways implicated in Tourette syndrome

    Tourette syndrome (TS) is a neuropsychiatric disorder of complex genetic architecture involving multiple interacting genes. Here, we sought to elucidate the pathways that underlie the neurobiology of the disorder through genome-wide analysis. We analyzed genome-wide genotypic data of 3581 individuals with TS and 7682 ancestry-matched controls and investigated associations of TS with sets of genes that are expressed in particular cell types and operate in specific neuronal and glial functions. We employed a self-contained, set-based association method (SBA) as well as a competitive gene set method (MAGMA) using individual-level genotype data to perform a comprehensive investigation of the biological background of TS. Our SBA analysis identified three significant gene sets after Bonferroni correction, implicating ligand-gated ion channel signaling, lymphocytic, and cell adhesion and trans-synaptic signaling processes. MAGMA analysis further supported the involvement of the cell adhesion and trans-synaptic signaling gene set. The lymphocytic gene set was driven by variants in FLT3, raising an intriguing hypothesis for the involvement of a neuroinflammatory element in TS pathogenesis. The indications of involvement of ligand-gated ion channel signaling reinforce the role of GABA in TS, while the association of the cell adhesion and trans-synaptic signaling gene set provides additional support for the role of adhesion molecules in neuropsychiatric disorders. This study reinforces previous findings but also provides new insights into the neurobiology of TS

    An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge

    There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challenge was designed to spur convergence in methods for diagnosing genetic disease starting from clinical case history and genome sequencing data. DNA samples were obtained from three families with heritable genetic disorders and genomic sequence data were donated by sequencing platform vendors. The challenge was to analyze and interpret these data with the goals of identifying disease-causing variants and reporting the findings in a clinically useful format. Participating contestant groups were solicited broadly, and an independent panel of judges evaluated their performance. RESULTS: A total of 30 international groups were engaged. The entries reveal a general convergence of practices on most elements of the analysis and interpretation process. However, even given this commonality of approach, only two groups identified the consensus candidate variants in all disease cases, demonstrating a need for consistent fine-tuning of the generally accepted methods. There was greater diversity of the final clinical report content and in the patient consenting process, demonstrating that these areas require additional exploration and standardization. CONCLUSIONS: The CLARITY Challenge provides a comprehensive assessment of current practices for using genome sequencing to diagnose and report genetic diseases. There is remarkable convergence in bioinformatic techniques, but medical interpretation and reporting are areas that require further development by many groups