21 research outputs found

    popSTR2 enables clinical and population-scale genotyping of microsatellites

    Summary: popSTR2 is an update and augmentation of our previous work ‘popSTR: a population-based microsatellite genotyper’. To make genotyping sensitive to inter-sample differences, we supply a kernel to estimate sample-specific slippage rates. For clinical sequencing purposes, a panel of known pathogenic repeat expansions is provided along with a script that scans and flags for manual inspection markers indicative of a pathogenic expansion. Like its predecessor, popSTR2 allows for joint genotyping of samples at a population scale. We now provide a binning method that makes the microsatellite genotypes more amenable to analysis within standard association pipelines and can increase association power. Availability and implementation: https://github.com/DecodeGenetics/popSTR. Supplementary information: Supplementary data are available at Bioinformatics online.

    PopDel identifies medium-size deletions simultaneously in tens of thousands of genomes

    Thousands of genomic structural variants (SVs) segregate in the human population and can impact phenotypic traits and diseases. Their identification in whole-genome sequence data of large cohorts is a major computational challenge. Most current approaches identify SVs in single genomes and afterwards merge the identified variants into a joint call set across many genomes. We describe the approach PopDel, which directly identifies deletions of about 500 to at least 10,000 bp in length in data of many genomes jointly, eliminating the need for subsequent variant merging. PopDel scales to tens of thousands of genomes as we demonstrate in evaluations on up to 49,962 genomes. We show that PopDel reliably reports common, rare and de novo deletions. On genomes with available high-confidence reference call sets PopDel shows excellent recall and precision. Genotype inheritance patterns in up to 6794 trios indicate that genotypes predicted by PopDel are more reliable than those of previous SV callers. Furthermore, PopDel’s running time is competitive with the fastest tested previous tools. The demonstrated scalability and accuracy of PopDel enable routine scans for deletions in large-scale sequencing studies.
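    The trio-based evaluation mentioned above rests on Mendelian consistency: a child's genotype should be explainable by one allele from each parent. The following is a minimal sketch of such a check, not PopDel's own evaluation code. The genotype encoding (0 = reference, 1 = deletion) and the example trios are assumptions made for illustration.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """Return True if the child genotype can be formed by taking
    one allele from the father and one from the mother.

    Genotypes are pairs of alleles (assumed encoding: 0 = reference,
    1 = deletion); allele order within a genotype is ignored.
    """
    child_sorted = tuple(sorted(child))
    return any(
        tuple(sorted((f, m))) == child_sorted
        for f, m in product(father, mother)
    )


def consistency_rate(trios):
    """Fraction of (father, mother, child) trios that are consistent."""
    checked = [is_mendelian_consistent(f, m, c) for f, m, c in trios]
    return sum(checked) / len(checked)


# Hypothetical genotypes at a single deletion site.
trios = [
    ((0, 0), (0, 1), (0, 1)),  # consistent: deletion inherited from mother
    ((0, 0), (0, 0), (0, 1)),  # inconsistent: genotyping error or de novo event
    ((1, 1), (0, 1), (1, 1)),  # consistent
    ((1, 1), (1, 1), (0, 1)),  # inconsistent: child cannot carry a reference allele
]

rate = consistency_rate(trios)
print(f"Mendelian consistency rate: {rate:.2f}")
```

    An inconsistent trio is not necessarily an error, since a true de novo deletion produces the same pattern; a rate computed this way is therefore only a proxy for genotype reliability.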

    The genetic architecture of age-related hearing impairment revealed by genome-wide association analysis.

    Age-related hearing impairment (ARHI) is the most common sensory disorder in older adults. We conducted a genome-wide association meta-analysis of 121,934 ARHI cases and 591,699 controls from Iceland and the UK. We identified 21 novel sequence variants, of which 13 are rare, under either additive or recessive models. Of special interest are a missense variant in LOXHD1 (MAF = 1.96%) and a tandem duplication in FBF1 covering 4 exons (MAF = 0.22%) associating with ARHI (OR = 3.7 for homozygotes, P = 1.7 × 10⁻²² and OR = 4.2 for heterozygotes, P = 5.7 × 10⁻²⁷, respectively). We constructed an ARHI genetic risk score (GRS) using common variants and showed that a common variant GRS can identify individuals at risk comparable to carriers of rare high penetrance variants. Furthermore, we found that ARHI and tinnitus share genetic causes. This study sheds new light on the genetic architecture of ARHI, through several rare variants in both Mendelian deafness genes and genes not previously linked to hearing.
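    A genetic risk score of the kind described above is commonly computed as a weighted sum of risk-allele counts, with the log odds ratio of each variant as its weight. The sketch below assumes that standard additive form; the abstract does not state how the ARHI score was built, and the variant names, odds ratios and dosages are hypothetical.

```python
import math


def genetic_risk_score(dosages, odds_ratios):
    """Additive GRS: sum over variants of (risk-allele count * ln(odds ratio)).

    dosages:     variant id -> number of risk alleles carried (0, 1 or 2)
    odds_ratios: variant id -> per-allele odds ratio from an association study
    """
    return sum(dosages[v] * math.log(odds_ratios[v]) for v in odds_ratios)


# Hypothetical common variants and per-allele odds ratios (illustration only).
odds_ratios = {"var_a": 1.10, "var_b": 1.05, "var_c": 1.20}

# Hypothetical individual: two risk alleles at var_a, none at var_b, one at var_c.
dosages = {"var_a": 2, "var_b": 0, "var_c": 1}

score = genetic_risk_score(dosages, odds_ratios)

# Exponentiating gives the combined odds ratio relative to carrying no risk alleles.
print(f"GRS = {score:.4f}, combined OR = {math.exp(score):.3f}")
```

    Because the score is on the log-odds scale, individuals in the upper tail of the distribution can be compared directly with carriers of a single rare variant with a large odds ratio.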

    The sequences of 150,119 genomes in the UK Biobank

    Detailed knowledge of how diversity in the sequence of the human genome affects phenotypic diversity depends on a comprehensive and reliable characterization of both sequences and phenotypic variation. Over the past decade, insights into this relationship have been obtained from whole-exome sequencing or whole-genome sequencing of large cohorts with rich phenotypic data(1,2). Here we describe the analysis of whole-genome sequencing of 150,119 individuals from the UK Biobank(3). This constitutes a set of high-quality variants, including 585,040,410 single-nucleotide polymorphisms, representing 7.0% of all possible human single-nucleotide polymorphisms, and 58,707,036 indels. This large set of variants allows us to characterize selection based on sequence variation within a population through a depletion rank score of windows along the genome. Depletion rank analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UK Biobank: a large British Irish cohort, a smaller African cohort and a South Asian cohort. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large-scale whole-genome sequencing studies. Using this formidable new resource, we provide several examples of trait associations for rare variants with large effects not found previously through studies based on whole-exome sequencing and/or imputation.

    Genome-wide association identifies seven loci for pelvic organ prolapse in Iceland and the UK Biobank.

    Pelvic organ prolapse (POP) is a downward descent of one or more of the pelvic organs, resulting in a protrusion of the vaginal wall and/or uterus. We performed a genome-wide association study of POP using data from Iceland and the UK Biobank, a total of 15,010 cases with hospital-based diagnosis code and 340,734 female controls, and found eight sequence variants at seven loci associating with POP (P < 5 × 10⁻⁸); seven common (minor allele frequency > 5%) and one with minor allele frequency of 4.87%. Some of the variants associating with POP also associated with traits of similar pathophysiology. Of these, rs3820282, which may alter the estrogen-based regulation of WNT4, also associates with leiomyoma of uterus, gestational duration and endometriosis. Rs3791675 at EFEMP1, a gene involved in connective tissue homeostasis, also associates with hernias and carpal tunnel syndrome. Our results highlight the role of connective tissue metabolism and estrogen exposure in the etiology of POP.

    A genome-wide meta-analysis yields 46 new loci associating with biomarkers of iron homeostasis

    Bell et al. report 46 new loci associated with biomarkers of iron homeostasis, including ferritin levels, iron binding capacity, and iron saturation, in the Icelandic, Danish and UK populations. The associated loci point to new iron-regulating proteins and important genetic differences between men and women.

    HLA alleles, disease severity, and age associate with T-cell responses following infection with SARS-CoV-2

    Memory T-cell responses following SARS-CoV-2 infection have been extensively investigated but many studies have been small with a limited range of disease severity. Here we analyze SARS-CoV-2 reactive T-cell responses in 768 convalescent SARS-CoV-2-infected (cases) and 500 uninfected (controls) Icelanders. The T-cell responses are stable three to eight months after SARS-CoV-2 infection, irrespective of disease severity and even those with the mildest symptoms induce broad and persistent T-cell responses. Robust CD4+ T-cell responses are detected against all measured proteins (M, N, S and S1) while the N protein induces the strongest CD8+ T-cell responses. CD4+ T-cell responses correlate with disease severity, humoral responses and age, whereas CD8+ T-cell responses correlate with age and functional antibodies. Further, CD8+ T-cell responses associate with several class I HLA alleles. Our results provide new insight into HLA restriction of CD8+ T-cell immunity and other factors contributing to heterogeneity of T-cell responses following SARS-CoV-2 infection. Publisher Copyright: © 2022, The Author(s).

    Identification of Lynch syndrome risk variants in the Romanian population.

    Two familial forms of colorectal cancer (CRC), Lynch syndrome (LS) and familial adenomatous polyposis (FAP), are caused by rare mutations in DNA mismatch repair genes (MLH1, MSH2, MSH6, PMS2) and the genes APC and MUTYH, respectively. No information is available on the presence of high-risk CRC mutations in the Romanian population. We performed whole-genome sequencing of 61 Romanian CRC cases with a family history of cancer and/or early onset of disease, focusing the analysis on candidate variants in the LS and FAP genes. The frequencies of all candidate variants were assessed in a cohort of 688 CRC cases and 4567 controls. Immunohistochemical (IHC) staining for MLH1, MSH2, MSH6, and PMS2 was performed on tumour tissue. We identified 11 candidate variants in 11 cases; six variants in MLH1, one in MSH6, one in PMS2, and three in APC. Combining information on the predicted impact of the variants on the proteins, IHC results and previous reports, we found three novel pathogenic variants (MLH1:p.Lys84ThrfsTer4, MLH1:p.Ala586CysfsTer7, PMS2:p.Arg211ThrfsTer38), and two novel variants that are unlikely to be pathogenic. Also, we confirmed three previously published pathogenic LS variants and suggest reclassifying a previously reported variant of uncertain significance as pathogenic (MLH1:c.1559-1G>C).

    A genome-wide meta-analysis yields 46 new loci associating with biomarkers of iron homeostasis

    Abstract: Iron is essential for many biological functions and iron deficiency and overload have major health implications. We performed a meta-analysis of three genome-wide association studies from Iceland, the UK and Denmark of blood levels of ferritin (N = 246,139), total iron binding capacity (N = 135,430), iron (N = 163,511) and transferrin saturation (N = 131,471). We found 62 independent sequence variants associating with iron homeostasis parameters at 56 loci, including 46 novel loci. Variants at DUOX2, F5, SLC11A2 and TMPRSS6 associate with iron deficiency anemia, while variants at TF, HFE, TFR2 and TMPRSS6 associate with iron overload. A HBS1L-MYB intergenic region variant associates both with increased risk of iron overload and reduced risk of iron deficiency anemia. The DUOX2 missense variant is present in 14% of the population, associates with all iron homeostasis biomarkers, and increases the risk of iron deficiency anemia by 29%. The associations implicate proteins contributing to the main physiological processes involved in iron homeostasis: iron sensing and storage, inflammation, absorption of iron from the gut, iron recycling, erythropoiesis and bleeding/menstruation.

    Graphtyper enables population-scale genotyping using pangenome graphs.

    A fundamental requirement for genetic studies is an accurate determination of sequence variation. While human genome sequence diversity is increasingly well characterized, there is a need for efficient ways to use this knowledge in sequence analysis. Here we present Graphtyper, a publicly available novel algorithm and software for discovering and genotyping sequence variants. Graphtyper realigns short-read sequence data to a pangenome, a variation-aware graph structure that encodes sequence variation within a population by representing possible haplotypes as graph paths. Our results show that Graphtyper is fast, highly scalable, and provides sensitive and accurate genotype calls. Graphtyper genotyped 89.4 million sequence variants in the whole genomes of 28,075 Icelanders using less than 100,000 CPU days, including detailed genotyping of six human leukocyte antigen (HLA) genes. We show that Graphtyper is a valuable tool in characterizing sequence variation in both small and population-scale sequencing studies.
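    The idea of representing haplotypes as graph paths can be illustrated with a toy variation graph. This is a sketch under assumed conventions (integer node ids, one sequence per node, a single source and sink) and does not reflect Graphtyper's actual data structures or API.

```python
def enumerate_haplotypes(nodes, edges, source, sink):
    """Return the sequence spelled by every path from source to sink.

    nodes: node id -> DNA sequence carried by that node
    edges: node id -> list of successor node ids (the graph must be acyclic)
    """
    if source == sink:
        return [nodes[source]]
    haplotypes = []
    for successor in edges[source]:
        for suffix in enumerate_haplotypes(nodes, edges, successor, sink):
            haplotypes.append(nodes[source] + suffix)
    return haplotypes


# Toy graph: a shared prefix, a SNP with alleles T and C, then a shared suffix.
nodes = {0: "ACG", 1: "T", 2: "C", 3: "GGA"}
edges = {0: [1, 2], 1: [3], 2: [3], 3: []}

haplotypes = enumerate_haplotypes(nodes, edges, source=0, sink=3)
print(haplotypes)
```

    Each variant site adds a branch, so the number of paths grows multiplicatively with the number of sites. Exhaustive enumeration like this is only practical for very small regions.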