62 research outputs found

    NovoGraph: Human genome graph construction from multiple long-read de novo assemblies [version 2; referees: 2 approved]

    Genome graphs are emerging as an important novel approach to the analysis of high-throughput human sequencing data. By explicitly representing genetic variants and alternative haplotypes in a mappable data structure, they can enable the improved analysis of structurally variable and hyperpolymorphic regions of the genome. In most existing approaches, graphs are constructed from variant call sets derived from short-read sequencing. As long-read sequencing becomes more cost-effective and enables de novo assembly for increasing numbers of whole genomes, a method for the direct construction of a genome graph from sets of assembled human genomes would be desirable. Such assembly-based genome graphs would encompass the wide spectrum of genetic variation accessible to long-read-based de novo assembly, including large structural variants and divergent haplotypes. Here we present NovoGraph, a method for the construction of a human genome graph directly from a set of de novo assemblies. NovoGraph constructs a genome-wide multiple sequence alignment of all input contigs and creates a graph by merging the input sequences at positions that are both homologous and sequence-identical. NovoGraph outputs resulting graphs in VCF format that can be loaded into third-party genome graph toolkits. To demonstrate NovoGraph, we construct a genome graph with 23,478,835 variant sites and 30,582,795 variant alleles from de novo assemblies of seven ethnically diverse human genomes (AK1, CHM1, CHM13, HG003, HG004, HX1, NA19240). Initial evaluations show that mapping against the constructed graph reduces the average mismatch rate of reads from sample NA12878 by approximately 0.2%, albeit at a slightly increased rate of reads that remain unmapped
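    The merging step described in this abstract (collapsing input sequences at positions that are both homologous and sequence-identical) can be illustrated with a toy sketch. This is not NovoGraph's implementation: it assumes the multiple sequence alignment is already available as equal-length strings, uses `-` as the gap character, and the names `msa_to_graph` and `variant_columns` are hypothetical.

```python
from collections import defaultdict


def msa_to_graph(rows, gap="-"):
    """Toy graph construction from a multiple sequence alignment.

    Each node is a (column, base) pair, so sequences that carry the same
    base in the same alignment column share one node. Edges connect the
    consecutive non-gap nodes visited by each sequence.
    """
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all alignment rows must have the same length")

    nodes = defaultdict(set)  # (column, base) -> indices of rows using it
    edges = set()             # directed edges between nodes
    for index, row in enumerate(rows):
        previous = None
        for column, base in enumerate(row):
            if base == gap:
                continue  # gaps create no node; the path skips ahead
            node = (column, base)
            nodes[node].add(index)
            if previous is not None:
                edges.add((previous, node))
            previous = node
    return dict(nodes), edges


def variant_columns(nodes):
    """Columns holding more than one distinct base (indels are not counted)."""
    bases_by_column = defaultdict(set)
    for column, base in nodes:
        bases_by_column[column].add(base)
    return sorted(c for c, bases in bases_by_column.items() if len(bases) > 1)


if __name__ == "__main__":
    alignment = ["ACGT", "ATGT", "A-GT"]  # made-up toy alignment
    graph_nodes, graph_edges = msa_to_graph(alignment)
    print(len(graph_nodes), len(graph_edges), variant_columns(graph_nodes))
```

    In this toy alignment the three sequences share nodes at columns 0, 2 and 3 and diverge only at column 1, which is the sense in which identical, homologous positions are merged.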

    Distinct genetic architectures and environmental factors associate with host response to the γ2-herpesvirus infections

    Kaposi’s sarcoma-associated herpesvirus (KSHV) and Epstein-Barr Virus (EBV) establish life-long infections and are associated with malignancies. Striking geographic variation in incidence, and the fact that virus alone is insufficient to cause disease, suggest that other co-factors are involved. Here we present an epidemiological analysis and a genome-wide association study (GWAS) in 4365 individuals from an African population cohort to assess the influence of host genetic and non-genetic factors on virus antibody responses. EBV/KSHV co-infection (OR = 5.71 (1.58–7.12)), HIV positivity (OR = 2.22 (1.32–3.73)) and living in a more rural area (OR = 1.38 (1.01–1.89)) are strongly associated with immunogenicity. GWAS reveals associations with KSHV antibody response in the HLA-B/C region (p = 6.64 × 10^−9). For EBV, associations are identified for VCA (rs71542439, p = 1.15 × 10^−12). Human leucocyte antigen (HLA) and trans-ancestry fine-mapping substantiate that distinct variants in HLA-DQA1 (p = 5.24 × 10^−44) are driving associations for EBNA-1 in Africa. This study highlights complex interactions between KSHV and EBV, in addition to distinct genetic architectures resulting in important differences in pathogenesis and transmission

    Consumer behaviour and the life-course: shopper reactions to self service grocery shops and supermarkets in England c.1947-1975

    This is the author accepted manuscript. The final version is available from SAGE Publications via the DOI in this record. The paper examines the development of self-service grocery shopping from a consumer perspective. Using qualitative data constructed through a nationwide biographical survey and oral histories, it is possible to go beyond contemporary market surveys, which give insufficient attention to shopping as a socially and culturally embedded practice. The paper uses the conceptual framework of the life-course to demonstrate how grocery shopping is a complex activity, in which the retail encounter is shaped by the specific interconnection of different retail formats with consumer characteristics and situational influences. Consumer reactions to retail modernization must be understood in relation to the development of consumer practices at points of transition and stability within the life-course. These practices are accessed by examining retrospective consumer narratives about food shopping

    Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis.

    Multiple sclerosis is a common disease of the central nervous system in which the interplay between inflammatory and neurodegenerative processes typically results in intermittent neurological disturbance followed by progressive accumulation of disability. Epidemiological studies have shown that genetic factors are primarily responsible for the substantially increased frequency of the disease seen in the relatives of affected individuals, and systematic attempts to identify linkage in multiplex families have confirmed that variation within the major histocompatibility complex (MHC) exerts the greatest individual effect on risk. Modestly powered genome-wide association studies (GWAS) have enabled more than 20 additional risk loci to be identified and have shown that multiple variants exerting modest individual effects have a key role in disease susceptibility. Most of the genetic architecture underlying susceptibility to the disease remains to be defined and is anticipated to require the analysis of sample sizes that are beyond the numbers currently available to individual research groups. In a collaborative GWAS involving 9,772 cases of European descent collected by 23 research groups working in 15 different countries, we have replicated almost all of the previously suggested associations and identified at least a further 29 novel susceptibility loci. Within the MHC we have refined the identity of the HLA-DRB1 risk alleles and confirmed that variation in the HLA-A gene underlies the independent protective effect attributable to the class I region. Immunologically relevant genes are significantly overrepresented among those mapping close to the identified loci and particularly implicate T-helper-cell differentiation in the pathogenesis of multiple sclerosis

    Psoriasis Patients Are Enriched for Genetic Variants That Protect against HIV-1 Disease

    An important paradigm in evolutionary genetics is that of a delicate balance between genetic variants that favorably boost host control of infection but which may unfavorably increase susceptibility to autoimmune disease. Here, we investigated whether patients with psoriasis, a common immune-mediated disease of the skin, are enriched for genetic variants that limit the ability of HIV-1 virus to replicate after infection. We analyzed the HLA class I and class II alleles of 1,727 Caucasian psoriasis cases and 3,581 controls and found that psoriasis patients are significantly more likely than controls to have gene variants that are protective against HIV-1 disease. This includes several HLA class I alleles associated with HIV-1 control; amino acid residues at HLA-B positions 67, 70, and 97 that mediate HIV-1 peptide binding; and the deletion polymorphism rs67384697 associated with high surface expression of HLA-C. We also found that the compound genotype KIR3DS1 plus HLA-B Bw4-80I, which respectively encode a natural killer cell activating receptor and its putative ligand, significantly increased psoriasis susceptibility. This compound genotype has also been associated with delay of progression to AIDS. Together, our results suggest that genetic variants that contribute to anti-viral immunity may predispose to the development of psoriasis

    Genetic association analyses implicate aberrant regulation of innate and adaptive immunity genes in the pathogenesis of systemic lupus erythematosus.

    Systemic lupus erythematosus (SLE) is a genetically complex autoimmune disease characterized by loss of immune tolerance to nuclear and cell surface antigens. Previous genome-wide association studies (GWAS) had modest sample sizes, reducing their scope and reliability. Our study comprised 7,219 cases and 15,991 controls of European ancestry, constituting a new GWAS, a meta-analysis with a published GWAS and a replication study. We have mapped 43 susceptibility loci, including ten new associations. Assisted by dense genome coverage, imputation provided evidence for missense variants underpinning associations in eight genes. Other likely causal genes were established by examining associated alleles for cis-acting eQTL effects in a range of ex vivo immune cells. We found an over-representation (n = 16) of transcription factors among SLE susceptibility genes. This finding supports the view that aberrantly regulated gene expression networks in multiple cell types in both the innate and adaptive immune response contribute to the risk of developing SLE

    Nanopore sequencing and assembly of a human genome with ultra-long reads

    We report the sequencing and assembly of a reference genome for the human GM12878 Utah/Ceph cell line using the MinION (Oxford Nanopore Technologies) nanopore sequencer. 91.2 Gb of sequence data, representing ~30× theoretical coverage, were produced. Reference-based alignment enabled detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contiguous assembly (NG50 ~3 Mb). Next, we developed a protocol to generate ultra-long reads (N50 > 100 kb, up to 882 kb). Incorporating an additional 5× coverage of these data more than doubled the assembly contiguity (NG50 ~6.4 Mb). The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled assembly and phasing of the 4 Mb major histocompatibility complex (MHC) locus in its entirety, measurement of telomere repeat length and closure of gaps in the reference human genome assembly GRCh38
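    The N50 and NG50 figures quoted in this abstract follow the standard definitions: N50 is the length of the shortest contig (or read) in the smallest set of longest contigs that together cover at least half of the total sequence, and NG50 uses half of an assumed genome size instead. A minimal sketch of that calculation follows; the function name `n50` and the example lengths are illustrative and are not taken from the paper.

```python
def n50(lengths, genome_size=None):
    """Return N50 of `lengths`, or NG50 when `genome_size` is given.

    Lengths are accumulated from longest to shortest; the result is the
    length at which the running total first reaches half of the target
    (sum of lengths for N50, the supplied genome size for NG50).
    Returns None if the sequences cover less than half of `genome_size`.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = genome_size if genome_size is not None else sum(lengths)
    half = total / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return None


if __name__ == "__main__":
    contigs = [80, 70, 50, 40, 30, 20]  # made-up contig lengths
    print(n50(contigs))                   # N50 over the assembly itself
    print(n50(contigs, genome_size=400))  # NG50 against an assumed genome size
```

    Because NG50 is measured against a fixed genome size rather than the assembly's own total length, it allows assemblies of different total sizes to be compared on the same scale.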

    Geographical and temporal distribution of SARS-CoV-2 clades in the WHO European Region, January to June 2020

    We show the distribution of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) genetic clades over time and between countries and outline potential genomic surveillance objectives. We applied three genomic nomenclature systems to all sequence data from the World Health Organization European Region available until 10 July 2020. We highlight the importance of real-time sequencing and data dissemination in a pandemic situation, compare the nomenclatures and lay a foundation for future European genomic surveillance of SARS-CoV-2

    State-of-the-art genome inference in the human MHC

    The Major Histocompatibility Complex (MHC) on the short arm of chromosome 6 is associated with more diseases than any other region of the genome; it encodes the antigen-presenting Human Leukocyte Antigen (HLA) proteins and is one of the key immunogenetic regions of the genome. Accurate genome inference and interpretation of MHC association signals have traditionally been hampered by the region's uniquely complex features, such as high levels of polymorphism; inter-gene sequence homologies; structural variation; and long-range haplotype structures. Recent algorithmic and technological advances have, however, significantly increased the accessibility of genetic variation in the MHC; these developments include (i) accurate SNP-based HLA type imputation; (ii) genome graph approaches for variation-aware genome inference from next-generation sequencing data; (iii) long-read-based diploid de novo assembly of the MHC; (iv) cost-effective targeted MHC sequencing methods. Applied to hundreds of thousands of samples over the last years, these technologies have already enabled significant biological discoveries, for example in the field of autoimmune disease genetics. Remaining challenges concern the development of integrated methods that leverage haplotype-resolved de novo assembly of the MHC for the development of improved MHC genotyping methods for short reads and the construction of improved reference panels for SNP-based imputation. Improved genome inference in the MHC can crucially contribute to an improved genetic and functional understanding of many immune-related phenotypes and diseases