54 research outputs found

    A robust clustering algorithm for identifying problematic samples in genome-wide association studies

    Summary: High-throughput genotyping arrays provide an efficient way to survey single nucleotide polymorphisms (SNPs) across the genome in large numbers of individuals. Downstream analysis of the data, for example in genome-wide association studies (GWAS), often involves statistical models of genotype frequencies across individuals. The complexities of the sample collection process and the potential for errors in the experimental assay can lead to biases and artefacts in an individual's inferred genotypes. Rather than attempting to model these complications, it has become a standard practice to remove individuals whose genome-wide data differ from the sample at large. Here we describe a simple, but robust, statistical algorithm to identify samples with atypical summaries of genome-wide variation. Its use as a semi-automated quality control tool is demonstrated using several summary statistics, selected to identify different potential problems, and it is applied to two different genotyping platforms and sample collections

    Author Correction: Cross-ancestry genome-wide association analysis of corneal thickness strengthens link between complex and Mendelian eye diseases.

    Emmanuelle Souzeau, who contributed to analysis of data, was inadvertently omitted from the author list in the originally published version of this Article. This has now been corrected in both the PDF and HTML versions of the Article

    Polymorphism in a lincRNA Associates with a Doubled Risk of Pneumococcal Bacteremia in Kenyan Children.

    Bacteremia (bacterial bloodstream infection) is a major cause of illness and death in sub-Saharan Africa but little is known about the role of human genetics in susceptibility. We conducted a genome-wide association study of bacteremia susceptibility in more than 5,000 Kenyan children as part of the Wellcome Trust Case Control Consortium 2 (WTCCC2). Both the blood-culture-proven bacteremia case subjects and healthy infants as controls were recruited from Kilifi, on the east coast of Kenya. Streptococcus pneumoniae is the most common cause of bacteremia in Kilifi and was thus the focus of this study. We identified an association between polymorphisms in a long intergenic non-coding RNA (lincRNA) gene (AC011288.2) and pneumococcal bacteremia and replicated the results in the same population (combined p = 1.69 × 10^-9; OR = 2.47, 95% CI = 1.84-3.31). The susceptibility allele is African specific, derived rather than ancestral, and occurs at low frequency (2.7% in control subjects and 6.4% in case subjects). Our further studies showed AC011288.2 expression only in neutrophils, a cell type that is known to play a major role in pneumococcal clearance. Identification of this novel association will further focus research on the role of lincRNAs in human infectious disease. Wellcome Trust (Grant ID: 084716/Z/08/Z). This is the final version of the article. It first appeared from Cell Press/Elsevier via http://dx.doi.org/10.1016/j.ajhg.2016.03.02
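
As a rough sanity check on the figures quoted in this abstract, a crude allelic odds ratio can be computed directly from the two allele frequencies (6.4% in case subjects, 2.7% in control subjects). This is only an illustrative sketch: the function name `allelic_odds_ratio` is hypothetical, and the published OR of 2.47 presumably comes from the study's own association model, which the abstract does not describe.

```python
def allelic_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Crude allelic odds ratio from allele frequencies in cases and controls.

    Assumption: frequencies are proportions strictly between 0 and 1.
    This ignores covariates, so it only approximates a model-based OR.
    """
    for f in (freq_cases, freq_controls):
        if not 0.0 < f < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Frequencies quoted in the abstract: 6.4% in cases, 2.7% in controls.
crude_or = allelic_odds_ratio(0.064, 0.027)
print(f"crude allelic OR = {crude_or:.2f}")
```

The crude value lands close to the reported estimate and inside its 95% confidence interval, which is what one would expect if the quoted frequencies and odds ratio describe the same allele.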

    An inherited duplication at the gene p21 protein-activated Kinase 7 (PAK7) is a risk factor for psychosis

    FUNDING: Funding for this study was provided by the Wellcome Trust Case Control Consortium 2 project (085475/B/08/Z and 085475/Z/08/Z), the Wellcome Trust (072894/Z/03/Z, 090532/Z/09/Z and 075491/Z/04/B), NIMH grants (MH 41953 and MH083094) and Science Foundation Ireland (08/IN.1/B1916). We acknowledge use of the Trinity Biobank sample from the Irish Blood Transfusion Service; the Trinity Centre for High Performance Computing; British 1958 Birth Cohort DNA collection funded by the Medical Research Council (G0000934) and the Wellcome Trust (068545/Z/02) and of the UK National Blood Service controls funded by the Wellcome Trust. Chris Spencer is supported by a Wellcome Trust Career Development Fellowship (097364/Z/11/Z). Funding to pay the Open Access publication charges for this article was provided by the Wellcome Trust. ACKNOWLEDGEMENTS: The authors sincerely thank all patients who contributed to this study and all staff who facilitated their involvement. We thank W. Bodmer and B. Winney for use of the People of the British Isles DNA collection, which was funded by the Wellcome Trust. We thank Akira Sawa and Koko Ishzuki for advice on the PAK7–DISC1 interaction experiment and Jan Korbel for discussions on mechanism of structural variation. Peer reviewed. Publisher PD

    Genome-wide association studies in oesophageal adenocarcinoma and Barrett's oesophagus: a large-scale meta-analysis.

    BACKGROUND: Oesophageal adenocarcinoma represents one of the fastest rising cancers in high-income countries. Barrett's oesophagus is the premalignant precursor of oesophageal adenocarcinoma. However, only a few patients with Barrett's oesophagus develop adenocarcinoma, which complicates clinical management in the absence of valid predictors. Within an international consortium investigating the genetics of Barrett's oesophagus and oesophageal adenocarcinoma, we aimed to identify novel genetic risk variants for the development of Barrett's oesophagus and oesophageal adenocarcinoma. METHODS: We did a meta-analysis of all genome-wide association studies of Barrett's oesophagus and oesophageal adenocarcinoma available in PubMed up to Feb 29, 2016; all patients were of European ancestry and disease was confirmed histopathologically. All participants were from four separate studies within Europe, North America, and Australia and were genotyped on high-density single nucleotide polymorphism (SNP) arrays. Meta-analysis was done with a fixed-effects inverse variance-weighting approach and with a standard genome-wide significance threshold (p<5 × 10^-8). We also did an association analysis after reweighting of loci with an approach that investigates annotation enrichment among genome-wide significant loci. Furthermore, the entire dataset was analysed with bioinformatics approaches (including functional annotation databases and gene-based and pathway-based methods) to identify pathophysiologically relevant cellular mechanisms. FINDINGS: Our sample comprised 6167 patients with Barrett's oesophagus and 4112 individuals with oesophageal adenocarcinoma, in addition to 17 159 representative controls from four genome-wide association studies in Europe, North America, and Australia.
We identified eight new risk loci associated with either Barrett's oesophagus or oesophageal adenocarcinoma, within or near the genes CFTR (rs17451754; p=4·8 × 10^-10), MSRA (rs17749155; p=5·2 × 10^-10), LINC00208 and BLK (rs10108511; p=2·1 × 10^-9), KHDRBS2 (rs62423175; p=3·0 × 10^-9), TPPP and CEP72 (rs9918259; p=3·2 × 10^-9), TMOD1 (rs7852462; p=1·5 × 10^-8), SATB2 (rs139606545; p=2·0 × 10^-8), and HTR3C and ABCC5 (rs9823696; p=1·6 × 10^-8). The locus identified near HTR3C and ABCC5 (rs9823696) was associated specifically with oesophageal adenocarcinoma (p=1·6 × 10^-8) and was independent of Barrett's oesophagus development (p=0·45). A ninth novel risk locus was identified within the gene LPA (rs12207195; posterior probability 0·925) after reweighting with significantly enriched annotations. The strongest disease pathways identified (p<10^-6) belonged to muscle cell differentiation and to mesenchyme development and differentiation. INTERPRETATION: Our meta-analysis of genome-wide association studies doubled the number of known risk loci for Barrett's oesophagus and oesophageal adenocarcinoma and revealed new insights into causes of these diseases. Furthermore, the specific association between oesophageal adenocarcinoma and the locus near HTR3C and ABCC5 might constitute a novel genetic marker for prediction of the transition from Barrett's oesophagus to oesophageal adenocarcinoma. Fine-mapping and functional studies of new risk loci could lead to identification of key molecules in the development of Barrett's oesophagus and oesophageal adenocarcinoma, which might encourage development of advanced prevention and intervention strategies.
FUNDING: US National Cancer Institute, US National Institutes of Health, National Health and Medical Research Council of Australia, Swedish Cancer Society, Medical Research Council UK, Cambridge NIHR Biomedical Research Centre, Cambridge Experimental Cancer Medicine Centre, Else Kröner Fresenius Stiftung, Wellcome Trust, Cancer Research UK, AstraZeneca UK, University Hospitals of Leicester, University of Oxford, Australian Research Council
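
The fixed-effects inverse variance-weighting approach named in the METHODS of this abstract can be sketched as follows. This is a generic textbook version, not the consortium's pipeline: the function name `fixed_effects_meta`, the per-study inputs, and the example numbers are all hypothetical. Only the genome-wide significance threshold of 5 × 10^-8 is taken from the abstract.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # threshold quoted in the abstract


def fixed_effects_meta(betas, standard_errors):
    """Combine per-study effect sizes for one SNP with inverse-variance weights.

    betas: per-study effect estimates (e.g. log odds ratios).
    standard_errors: matching per-study standard errors (all > 0).
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(standard_errors):
        raise ValueError("need one standard error per effect estimate")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical example: two studies with equal precision.
beta, se, z, p = fixed_effects_meta([0.10, 0.20], [0.1, 0.1])
print(f"pooled beta = {beta:.3f}, se = {se:.4f}, p = {p:.3g}")
print("genome-wide significant:", p < GENOME_WIDE_THRESHOLD)
```

With equal standard errors the pooled estimate is the simple mean of the study estimates. In general, more precise studies pull the pooled estimate toward their own value.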

    A Two-Stage Meta-Analysis Identifies Several New Loci for Parkinson's Disease

    A previous genome-wide association (GWA) meta-analysis of 12,386 PD cases and 21,026 controls conducted by the International Parkinson's Disease Genomics Consortium (IPDGC) discovered or confirmed 11 Parkinson's disease (PD) loci. This first analysis of the two-stage IPDGC study focused on the set of loci that passed genome-wide significance in the first stage GWA scan. However, the second stage genotyping array, the ImmunoChip, included a larger set of 1,920 SNPs selected on the basis of the GWA analysis. Here, we analyzed this set of 1,920 SNPs, and we identified five additional PD risk loci (combined p<5 × 10^-10; PARK16/1q32, STX1B/16p11, FGF20/8p22, STBD1/4q21, and GPNMB/7p15). Two of these five loci have been suggested by previous association studies (PARK16/1q32, FGF20/8p22), and this study provides further support for these findings. Using a dataset of post-mortem brain samples assayed for gene expression (n = 399) and methylation (n = 292), we identified methylation and expression changes associated with PD risk variants in PARK16/1q32, GPNMB/7p15, and STX1B/16p11 loci, hence suggesting potential molecular mechanisms and candidate genes at these risk loci

    The Irish DNA Atlas: Revealing Fine-Scale Population Structure and History within Ireland

    The extent of population structure within Ireland is largely unknown, as is the impact of historical migrations. Here we illustrate fine-scale genetic structure across Ireland that follows geographic boundaries and present evidence of admixture events into Ireland. Utilising the ‘Irish DNA Atlas’, a cohort (n = 194) of Irish individuals with four generations of ancestry linked to specific regions in Ireland, in combination with 2,039 individuals from the Peoples of the British Isles dataset, we show that the Irish population can be divided into 10 distinct geographically stratified genetic clusters: seven of ‘Gaelic’ Irish ancestry, and three of shared Irish-British ancestry. In addition we observe a major genetic barrier to the north of Ireland in Ulster. Using a reference of 6,760 European individuals and two ancient Irish genomes, we demonstrate high levels of North-West French-like and West Norwegian-like ancestry within Ireland. We show that our ‘Gaelic’ Irish clusters present homogeneous levels of ancient Irish ancestries. We additionally detect admixture events that provide evidence of Norse-Viking gene flow into Ireland, and reflect the Ulster Plantations. Our work informs both Irish history and the study of Mendelian and complex disease genetics involving populations of Irish ancestry

    Resolving the polymorphism-in-probe problem is critical for correct interpretation of expression QTL studies.

    Polymorphisms in the target mRNA sequence can greatly affect the binding affinity of microarray probe sequences, leading to false-positive and false-negative expression quantitative trait locus (QTL) signals with any other polymorphisms in linkage disequilibrium. We provide the most complete solution to this problem, by using the latest genome and exome sequence reference data to identify almost all common polymorphisms (frequency >1% in Europeans) in probe sequences for two commonly used microarray panels (the gene-based Illumina Human HT12 array, which uses 50-mer probes, and exon-based Affymetrix Human Exon 1.0 ST array, which uses 25-mer probes). We demonstrate the impact of this problem using cerebellum and frontal cortex tissues from 438 neuropathologically normal individuals. We find that although only a small proportion of the probes contain polymorphisms, they account for a large proportion of apparent expression QTL signals, and therefore result in many false signals being declared as real. We find that the polymorphism-in-probe problem is insufficiently controlled by previous protocols, and illustrate this using some notable false-positive and false-negative examples in MAPT and PRICKLE1 that can be found in many eQTL databases. We recommend that both new and existing eQTL data sets should be carefully checked in order to adequately address this issue