
    Validation and assessment of variant calling pipelines for next-generation sequencing

    Background: The processing and analysis of the large-scale data generated by next-generation sequencing (NGS) experiments is challenging and is a burgeoning area of new methods development. Several new bioinformatics tools have been developed for calling sequence variants from NGS data. Here, we validate the variant calling of these tools and compare their relative accuracy to determine which data processing pipeline is optimal. Results: We developed a unified pipeline for processing NGS data that encompasses four modules: mapping, filtering, realignment and recalibration, and variant calling. We processed 130 subjects from an ongoing whole exome sequencing study through this pipeline. To evaluate the accuracy of each module, we conducted a series of comparisons between the single nucleotide variant (SNV) calls from the NGS data and either gold-standard Sanger sequencing on a total of 700 variants or array genotyping data on a total of 9,935 single-nucleotide polymorphisms. A head-to-head comparison showed that Genome Analysis Toolkit (GATK) provided more accurate calls than SAMtools (positive predictive value of 92.55% vs. 80.35%, respectively). Realignment of mapped reads and recalibration of base quality scores before SNV calling proved to be crucial to accurate variant calling. The GATK HaplotypeCaller algorithm for variant calling outperformed the UnifiedGenotyper algorithm. We also showed a relationship between mapping quality, read depth and allele balance, and SNV call accuracy. However, if best practices are used in data processing, then additional filtering based on these metrics provides little gain and accuracies of >99% are achievable. Conclusions: Our findings will help to determine the best approach for processing NGS data to confidently call variants for downstream analyses. To enable others to implement and replicate our results, all of our code is freely available at http://metamoodics.org/wes.
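The positive predictive value quoted in the abstract above is the fraction of called variants that a reference method confirms. A minimal sketch of that calculation follows; the function name and the counts are hypothetical illustrations, not figures or code from the study.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Return TP / (TP + FP): the share of variant calls confirmed by a reference method."""
    called = true_positives + false_positives
    if called == 0:
        raise ValueError("no variant calls to evaluate")
    return true_positives / called


# Hypothetical counts for illustration only (not taken from the paper).
ppv = positive_predictive_value(true_positives=90, false_positives=10)
print(f"PPV = {ppv:.2%}")
```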

    A Hybrid Likelihood Model for Sequence-Based Disease Association Studies

    In the past few years, case-control studies of common diseases have shifted their focus from single genes to whole exomes. New sequencing technologies now routinely detect hundreds of thousands of sequence variants in a single study, many of which are rare or even novel. The limitation of classical single-marker association analysis for rare variants has been a challenge in such studies. A new generation of statistical methods for case-control association studies has been developed to meet this challenge. A common approach to association analysis of rare variants is burden-style collapsing, which combines rare variant data within individuals across or within genes. Here, we propose a new hybrid likelihood model that combines a burden test with a test of the position distribution of variants. In extensive simulations and on empirical data from the Dallas Heart Study, the new model demonstrates consistently good power, in particular when applied to a gene set (e.g., multiple candidate genes with shared biological function or pathway), when rare variants cluster in key functional regions of a gene, and when protective variants are present. When applied to data from an ongoing sequencing study of bipolar disorder (191 cases, 107 controls), the model identifies seven gene sets with nominal p-values < 0.05, of which one MAPK signaling pathway (KEGG) reaches trend-level significance after correcting for multiple testing. © 2013 Chen et al.

    Genome-wide linkage analysis of 972 bipolar pedigrees using single-nucleotide polymorphisms.

    Because of the high costs associated with ascertainment of families, most linkage studies of Bipolar I disorder (BPI) have used relatively small samples. Moreover, the genetic information content reported in most studies has been less than 0.6. Although microsatellite markers spaced every 10 cM typically extract most of the genetic information content for larger multiplex families, they can be less informative for smaller pedigrees, especially for affected sib pair kindreds. For these reasons we collaborated to pool family resources and carried out higher density genotyping. Approximately 1100 pedigrees of European ancestry were initially selected for study and were genotyped by the Center for Inherited Disease Research using the Illumina Linkage Panel 12 set of 6090 single-nucleotide polymorphisms. Of the ~1100 families, 972 were informative for further analyses, and mean information content was 0.86 after pruning for linkage disequilibrium. The 972 kindreds include 2284 cases of BPI disorder, 498 individuals with bipolar II disorder (BPII) and 702 subjects with recurrent major depression. Three affection status models (ASMs) were considered: ASM1 (BPI and schizoaffective disorder, BP cases (SABP) only), ASM2 (ASM1 cases plus BPII) and ASM3 (ASM2 cases plus recurrent major depression). Both parametric and non-parametric linkage methods were carried out. The strongest findings occurred at 6q21 (non-parametric pairs logarithm of odds (LOD) 3.4 for rs1046943 at 119 cM) and 9q21 (non-parametric pairs LOD 3.4 for rs722642 at 78 cM) using only BPI and schizoaffective (SA), BP cases. Both results met genome-wide significance criteria, although neither was significant after correction for multiple analyses. We also inspected parametric scores for the larger multiplex families to identify possible rare susceptibility loci. In this analysis, we observed 59 parametric LODs of 2 or greater, many of which are likely to be close to maximum possible scores. Although some linkage findings may be false positives, the results could help prioritize the search for rare variants using whole exome or genome sequencing.

    Are genetic risk factors for psychosis also associated with dimension-specific psychotic experiences in adolescence?

    Psychosis has been hypothesised to be a continuously distributed quantitative phenotype and disorders such as schizophrenia and bipolar disorder represent its extreme manifestations. Evidence suggests that common genetic variants play an important role in liability to both schizophrenia and bipolar disorder. Here we tested the hypothesis that these common variants would also influence psychotic experiences measured dimensionally in adolescents in the general population. Our aim was to test whether schizophrenia and bipolar disorder polygenic risk scores (PRS), as well as specific single nucleotide polymorphisms (SNPs) previously identified as risk variants for schizophrenia, were associated with adolescent dimension-specific psychotic experiences. Self-reported Paranoia, Hallucinations, Cognitive Disorganisation, Grandiosity, Anhedonia, and Parent-rated Negative Symptoms, as measured by the Specific Psychotic Experiences Questionnaire (SPEQ), were assessed in a community sample of 2,152 16-year-olds. Polygenic risk scores were calculated using estimates of the log of odds ratios from the Psychiatric Genomics Consortium GWAS stage-1 mega-analysis of schizophrenia and bipolar disorder. The polygenic risk analyses yielded no significant associations between schizophrenia and bipolar disorder PRS and the SPEQ measures. The analyses on the 28 individual SNPs previously associated with schizophrenia found that two SNPs in TCF4 returned a significant association with the SPEQ Paranoia dimension, rs17512836 (p-value = 2.57 × 10⁻⁴) and rs9960767 (p-value = 6.23 × 10⁻⁴). Replication in an independent sample of 16-year-olds (N=3,427) assessed using the Psychotic-Like Symptoms Questionnaire (PLIKS-Q), a composite measure of multiple positive psychotic experiences, failed to yield significant results. Future research with PRS derived from larger samples, as well as larger adolescent validation samples, would improve the predictive power to test these hypotheses further. The challenges of relating adult clinical diagnostic constructs such as schizophrenia to adolescent psychotic experiences at a genetic level are discussed.
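The abstract above says the polygenic risk scores were built from log odds ratios. One common way to form such a score is a weighted sum of risk-allele counts, with each weight equal to the log odds ratio of the corresponding SNP. The sketch below assumes that simple additive form; the function name and the example values are hypothetical, and the study's actual procedure (SNP selection, p-value thresholds, LD pruning) is not described in the abstract.

```python
import math
from typing import Sequence


def polygenic_risk_score(allele_counts: Sequence[int], odds_ratios: Sequence[float]) -> float:
    """Sum of risk-allele counts (0, 1 or 2 per SNP) weighted by log odds ratios."""
    if len(allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is required per SNP")
    return sum(count * math.log(odds) for count, odds in zip(allele_counts, odds_ratios))


# Hypothetical genotype and effect sizes for three SNPs (illustration only).
score = polygenic_risk_score(allele_counts=[0, 1, 2], odds_ratios=[1.10, 1.05, 0.95])
print(f"PRS = {score:.4f}")
```

An odds ratio below 1 contributes a negative weight, so protective alleles lower the score under this assumed form.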

    Mental health literacy of depression: gender differences and attitudinal antecedents in a representative British sample

    Background: Poor mental health literacy and negative attitudes toward individuals with mental health disorders may impede optimal help-seeking for symptoms of mental ill-health. The present study examined the ability to recognize cases of depression as a function of respondent and target gender, as well as individual psychological differences in attitudes toward persons with depression. Methods: In a representative British general population survey, the ability to correctly recognize vignettes of depression was assessed among 1,218 adults. Respondents also rated the vignettes along a number of attitudinal dimensions and completed measures of attitudes toward seeking psychological help, psychiatric skepticism, and anti-scientific attitudes. Results: There were significant differences in the ability to correctly identify cases of depression as a function of respondent and target gender. Respondents were more likely to indicate that a male vignette did not suffer from a mental health disorder compared to a female vignette, and women were more likely than men to indicate that the male vignette suffered from a mental health disorder. Attitudes toward persons with depression were associated with attitudes toward seeking psychological help, psychiatric skepticism, and anti-scientific attitudes. Conclusion: Initiatives that consider the impact of gender stereotypes as well as individual differences may enhance mental health literacy, which in turn is associated with improved help-seeking behaviors for symptoms of mental ill-health.

    Mapping gene associations in human mitochondria using clinical disease phenotypes

    Nuclear genes encode most mitochondrial proteins, and their mutations cause diverse and debilitating clinical disorders. To date, 1,200 of these mitochondrial genes have been recorded, while no standardized catalog exists of the associated clinical phenotypes. Such a catalog would be useful to develop methods to analyze human phenotypic data, to determine genotype-phenotype relations among many genes and diseases, and to support the clinical diagnosis of mitochondrial disorders. Here we establish a clinical phenotype catalog of 174 mitochondrial disease genes and study associations of diseases and genes. Phenotypic features such as clinical signs and symptoms were manually annotated from full-text medical articles and classified based on the hierarchical MeSH ontology. This classification of phenotypic features of each gene allowed for the comparison of diseases between different genes. In turn, we were then able to measure the phenotypic associations of disease genes for which we calculated a quantitative value that is based on their shared phenotypic features. The results showed that genes sharing more similar phenotypes have a stronger tendency for functional interactions, proving the usefulness of phenotype similarity values in disease gene network analysis. We then constructed a functional network of mitochondrial genes and discovered a higher connectivity for non-disease than for disease genes, and a tendency of disease genes to interact with each other. Utilizing these differences, we propose 168 candidate genes that resemble the characteristic interaction patterns of mitochondrial disease genes. Through their network associations, the candidates are further prioritized for the study of specific disorders such as optic neuropathies and Parkinson disease. Most mitochondrial disease phenotypes involve several clinical categories including neurologic, metabolic, and gastrointestinal disorders, which might indicate the effects of gene defects within the mitochondrial system. The accompanying knowledgebase (http://www.mitophenome.org/) supports the study of clinical diseases and associated genes.

    Genome-wide Association Study of Borderline Personality Disorder Reveals Genetic Overlap with Bipolar Disorder, Major Depression and Schizophrenia

    Borderline personality disorder (BOR) is determined by environmental and genetic factors, and characterized by affective instability and impulsivity, diagnostic symptoms also observed in manic phases of bipolar disorder (BIP). Up to 20% of BIP patients show comorbidity with BOR. This report describes the first case–control genome-wide association study (GWAS) of BOR, performed in one of the largest BOR patient samples worldwide. The focus of our analysis was (i) to detect genes and gene sets involved in BOR and (ii) to investigate the genetic overlap with BIP. As there is considerable genetic overlap between BIP, major depression (MDD) and schizophrenia (SCZ) and a high comorbidity of BOR and MDD, we also analyzed the genetic overlap of BOR with SCZ and MDD. GWAS, gene-based tests and gene-set analyses were performed in 998 BOR patients and 1545 controls. Linkage disequilibrium score regression was used to detect the genetic overlap between BOR and these disorders. Single marker analysis revealed no significant association after correction for multiple testing. Gene-based analysis yielded two significant genes: DPYD (P=4.42 × 10⁻⁷) and PKP4 (P=8.67 × 10⁻⁷); and gene-set analysis yielded a significant finding for exocytosis (GO:0006887, PFDR=0.019; FDR, false discovery rate). Prior studies have implicated DPYD, PKP4 and exocytosis in BIP and SCZ. The most notable finding of the present study was the genetic overlap of BOR with BIP (rg=0.28 [P=2.99 × 10⁻³]), SCZ (rg=0.34 [P=4.37 × 10⁻⁵]) and MDD (rg=0.57 [P=1.04 × 10⁻³]). We believe our study is the first to demonstrate that BOR overlaps with BIP, MDD and SCZ on the genetic level. Whether this is confined to transdiagnostic clinical symptoms should be examined in future studies.

    Genome-wide association for major depression through age at onset stratification

    BACKGROUND: Major depressive disorder (MDD) is a disabling mood disorder, and despite a known heritable component, a large meta-analysis of genome-wide association studies revealed no replicable genetic risk variants. Given prior evidence of heterogeneity by age at onset in MDD, we tested whether genome-wide significant risk variants for MDD could be identified in cases subdivided by age at onset. METHODS: Discovery case-control genome-wide association studies were performed where cases were stratified using increasing/decreasing age-at-onset cutoffs; significant single nucleotide polymorphisms were tested in nine independent replication samples, giving a total sample of 22,158 cases and 133,749 control subjects for subsetting. Polygenic score analysis was used to examine whether differences in shared genetic risk exist between earlier and adult-onset MDD with commonly comorbid disorders of schizophrenia, bipolar disorder, Alzheimer’s disease, and coronary artery disease. RESULTS: We identified one replicated genome-wide significant locus associated with adult-onset (>27 years) MDD (rs7647854, odds ratio: 1.16, 95% confidence interval: 1.11–1.21, p = 5.2 × 10⁻¹¹). Using polygenic score analyses, we show that earlier-onset MDD is genetically more similar to schizophrenia and bipolar disorder than adult-onset MDD. CONCLUSIONS: We demonstrate that using additional phenotype data previously collected by genetic studies to tackle phenotypic heterogeneity in MDD can successfully lead to the discovery of genetic risk factors despite reduced sample size. Furthermore, our results suggest that the genetic susceptibility to MDD differs between adult- and earlier-onset MDD, with earlier-onset cases having a greater genetic overlap with schizophrenia and bipolar disorder.

    Vocal Learning and Auditory-Vocal Feedback

    Vocal learning is usually studied in songbirds and humans, species that can form auditory templates by listening to acoustic models and then learn to vocalize to match the template. Most other species are thought to develop vocalizations without auditory feedback. However, auditory input influences the acoustic structure of vocalizations in a broad distribution of birds and mammals. Vocalizations are defined here as sounds generated by forcing air past vibrating membranes. A vocal motor program may generate vocalizations such as crying or laughter, but auditory feedback may be required for matching precise acoustic features of vocalizations. This chapter discriminates limited vocal learning, which uses auditory input to fine-tune acoustic features of an inherited auditory template, from complex vocal learning, in which novel sounds are learned by matching a learned auditory template. Two or three songbird taxa and four or five mammalian taxa are known for complex vocal learning. A broader range of mammals converge in the acoustic structure of vocalizations when in socially interacting groups, which qualifies as limited vocal learning. All birds and mammals tested use auditory-vocal feedback to adjust their vocalizations to compensate for the effects of noise, and many species modulate their signals as the costs and benefits of communicating vary. This chapter asks whether some auditory-vocal feedback may have provided neural substrates for the evolution of vocal learning. Progress will require more precise definitions of different forms of vocal learning, broad comparative review of their presence and absence, and behavioral and neurobiological investigations into the mechanisms underlying the skills.