149 research outputs found

    A framework for the detection of de novo mutations in family-based sequencing data

    Germline mutation detection from human DNA sequence data is challenging due to the rarity of such events relative to the intrinsic error rates of sequencing technologies and the uneven coverage across the genome. We developed PhaseByTransmission (PBT) to identify de novo single nucleotide variants and short insertions and deletions (indels) from sequence data collected in parent-offspring trios. We compute the joint probability of the data given the genotype likelihoods in the individual family members, the known familial relationships and a prior probability for the mutation rate. Candidate de novo mutations (DNMs) are reported along with their posterior probability, providing a systematic way to prioritize them for validation. Our tool is integrated in the Genome Analysis Toolkit and can be used together with the ReadBackedPhasing module to infer the parental origin of DNMs based on phase-informative reads. Using simulated data, we show that PBT outperforms existing tools, especially in low coverage data and on the X chromosome. We further show that PBT displays high validation rates on empirical parent-offspring sequencing data for whole-exome data from 104 trios and X-chromosome data from 249 parent-offspring families. Finally, we demonstrate an association between father's age at conception and the number of DNMs in female offspring's X chromosome, consistent with previous literature reports
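The joint-probability idea in the abstract above can be illustrated with a toy calculation. The sketch below is not the PhaseByTransmission implementation. It assumes a single biallelic site, genotypes coded as alternate-allele counts (0, 1, 2), a hypothetical fixed prior over parental genotypes, and a simplified mutation model in which a de novo event produces any Mendelian-impossible child genotype with equal probability. The names `mendelian_prob`, `child_given_parents` and `de_novo_posterior` are illustrative only.

```python
from itertools import product


def mendelian_prob(m, f, c):
    """P(child genotype c | parental genotypes m, f) under Mendelian inheritance."""
    pm, pf = m / 2.0, f / 2.0  # chance each parent transmits the alt allele
    return {0: (1 - pm) * (1 - pf),
            1: pm * (1 - pf) + (1 - pm) * pf,
            2: pm * pf}[c]


def child_given_parents(m, f, c, mu):
    """Toy transmission model: returns (probability, is_de_novo)."""
    impossible = [g for g in range(3) if mendelian_prob(m, f, g) == 0.0]
    if not impossible:
        # Every child genotype is Mendelian-consistent; a de novo event
        # cannot be distinguished at this site in this toy model.
        return mendelian_prob(m, f, c), False
    if c in impossible:
        return mu / len(impossible), True
    return (1 - mu) * mendelian_prob(m, f, c), False


def de_novo_posterior(lik_m, lik_f, lik_c, mu=1e-8,
                      parent_prior=(0.98, 0.019, 0.001)):
    """Posterior probability that the child carries a de novo genotype.

    lik_* are genotype likelihoods P(reads | genotype) for genotypes 0, 1, 2.
    mu and parent_prior are assumed values chosen for illustration.
    """
    de_novo = total = 0.0
    for m, f, c in product(range(3), repeat=3):
        p_child, is_dn = child_given_parents(m, f, c, mu)
        weight = (parent_prior[m] * parent_prior[f]
                  * lik_m[m] * lik_f[f] * lik_c[c] * p_child)
        total += weight
        if is_dn:
            de_novo += weight
    return de_novo / total


# Confident hom-ref parents and a confident heterozygous child.
confident_ref = (1.0, 1e-12, 1e-12)
confident_het = (1e-12, 1.0, 1e-12)
high_cov = de_novo_posterior(confident_ref, confident_ref, confident_het)

# Same child, but the mother's data only weakly exclude heterozygosity,
# as can happen at low coverage.
uncertain_ref = (1.0, 0.1, 1e-3)
low_cov = de_novo_posterior(uncertain_ref, confident_ref, confident_het)
```

In this toy setting the posterior is high only when both parents' likelihoods strongly exclude carrying the allele, which is one way to see why low-coverage data make candidate de novo calls harder to trust.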

    Improved imputation quality of low-frequency and rare variants in European samples using the 'Genome of the Netherlands'

    Although genome-wide association studies (GWAS) have identified many common variants associated with complex traits, low-frequency and rare variants have not been interrogated in a comprehensive manner. Imputation from dense reference panels, such as the 1000 Genomes Project (1000G), enables testing of ungenotyped variants for association. Here we present the results of imputation using a large, new population-specific panel: the Genome of The Netherlands (GoNL). We benchmarked the performance of the 1000G and GoNL reference sets by comparing imputation genotypes with 'true' genotypes typed on ImmunoChip in three European populations (Dutch, British, and Italian). GoNL showed significant improvement in the imputation quality for rare variants (MAF 0.05-0.5%) compared with 1000G. In Dutch samples, the mean observed Pearson correlation, r², increased from 0.61 to 0.71. W
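The imputation-quality metric mentioned in the abstract above, the squared Pearson correlation between imputed and directly typed genotypes, can be sketched as follows. This is a minimal illustration and not the authors' benchmarking pipeline. It assumes imputed genotypes are expressed as alternate-allele dosages in [0, 2], and the function name `imputation_r2` and the example values are hypothetical.

```python
import math


def imputation_r2(imputed, truth):
    """Squared Pearson correlation between imputed dosages and true genotypes.

    Returns NaN when either vector has zero variance (e.g. a monomorphic site),
    because the correlation is undefined in that case.
    """
    n = len(imputed)
    if n != len(truth) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    mean_x = sum(imputed) / n
    mean_y = sum(truth) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(imputed, truth))
    sxx = sum((x - mean_x) ** 2 for x in imputed)
    syy = sum((y - mean_y) ** 2 for y in truth)
    if sxx == 0.0 or syy == 0.0:
        return math.nan
    return (sxy * sxy) / (sxx * syy)


# Made-up example: a rare variant carried by two of eight samples.
true_genotypes = [0, 0, 0, 1, 0, 0, 1, 0]
good_dosages = [0.0, 0.1, 0.0, 0.9, 0.0, 0.0, 0.8, 0.1]
poor_dosages = [0.1, 0.3, 0.0, 0.4, 0.2, 0.0, 0.1, 0.3]

r2_good = imputation_r2(good_dosages, true_genotypes)
r2_poor = imputation_r2(poor_dosages, true_genotypes)
```

Averaging this per-variant value over all variants in a frequency bin gives a mean r² of the kind the abstract reports.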

    The Genome of the Netherlands: Design, and project goals

    Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent-offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910-1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14-15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project

    A high-quality human reference panel reveals the complexity and distribution of genomic structural variants

    Structural variation (SV) represents a major source of differences between individual human genomes and has been linked to disease phenotypes. However, the majority of studies neither provide a global view of the full spectrum of these variants nor integrate them into reference panels of genetic variation. Here, we analyse whole genome sequencing data of 769 individuals from 250 Dutch families, and provide a haplotype-resolved map of 1.9 million genome variants across 9 different variant classes, including novel forms of complex indels, and retrotransposition-mediated insertions of mobile elements and processed RNAs. A large proportion are previously underreported variants sized between 21 and 100 bp. We detect 4 megabases of novel sequence, encoding 11 new transcripts. Finally, we show 191 known, trait-associated SNPs to be in strong linkage disequilibrium with SVs and demonstrate that our panel facilitates accurate imputation of SVs in unrelated individuals

    WGS-based telomere length analysis in Dutch family trios implicates stronger maternal inheritance and a role for RRM1 gene

    Telomere length (TL) regulation is an important factor in ageing, reproduction and cancer development. Genetic, hereditary and environmental factors regulating TL are currently widely investigated; however, their relative contribution to TL variability is still understudied. We have used whole genome sequencing data of 250 family trios from the Genome of the Netherlands project to perform computational measurement of TL and a series of regression and genome-wide association analyses to reveal TL inheritance patterns and associated genetic factors. Our results confirm that TL is a largely heritable trait, with mother’s and, to a lesser extent, father’s TL having the strongest influence on the offspring. In this cohort, mother’s, but not father’s, age at conception was positively linked to offspring TL. Age-related TL attrition of 40 bp/year had relatively small influence on TL variability. Finally, we have identified TL-associated variations in ribonucleotide reductase catalytic subunit M1 (RRM1 gene), which is known to regulate telomere maintenance in yeast. We also highlight the importance of a multivariate approach and the limitations of existing tools for the analysis of TL as a polygenic heritable quantitative trait

    Velocity-space sensitivity of the time-of-flight neutron spectrometer at JET

    The velocity-space sensitivities of fast-ion diagnostics are often described by so-called weight functions. Recently, we formulated weight functions showing the velocity-space sensitivity of the often dominant beam-target part of neutron energy spectra. These weight functions for neutron emission spectrometry (NES) are independent of the particular NES diagnostic. Here we apply these NES weight functions to the time-of-flight spectrometer TOFOR at JET. By taking the instrumental response function of TOFOR into account, we calculate time-of-flight NES weight functions that enable us to directly determine the velocity-space sensitivity of a given part of a measured time-of-flight spectrum from TOFOR

    Characteristics of de novo structural changes in the human genome

    Small insertions and deletions (indels) and large structural variations (SVs) are major contributors to human genetic diversity and disease. However, mutation rates and characteristics of de novo indels and SVs in the general population have remained largely unexplored. We report 332 validated de novo structural changes identified in whole genomes of 250 families, including complex indels, retrotransposon insertions, and interchromosomal events. These data indicate a mutation rate of 2.94 indels (1-20 bp) and 0.16 SVs (>20 bp) per generation. De novo structural changes affect on average 4.1 kbp of genomic sequence and 29 coding bases per generation, which is 91 and 52 times more nucleotides than de novo substitutions, respectively. This contrasts with the equal genomic footprint of inherited SVs and substitutions. An excess of structural changes originated on paternal haplotypes. Additionally, we observed a nonuniform distribution of de novo SVs across offspring. These results reveal the importance of different mutational mechanisms to changes in human genome structure across generations

    The formation and fate of internal waves in the South China Sea

    Internal gravity waves, the subsurface analogue of the familiar surface gravity waves that break on beaches, are ubiquitous in the ocean. Because of their strong vertical and horizontal currents, and the turbulent mixing caused by their breaking, they affect a panoply of ocean processes, such as the supply of nutrients for photosynthesis [1], sediment and pollutant transport [2] and acoustic transmission [3]; they also pose hazards for man-made structures in the ocean [4]. Generated primarily by the wind and the tides, internal waves can travel thousands of kilometres from their sources before breaking [5], making it challenging to observe them and to include them in numerical climate models, which are sensitive to their effects [6,7]. For over a decade, studies [8-11] have targeted the South China Sea, where the oceans' most powerful known internal waves are generated in the Luzon Strait and steepen dramatically as they propagate west. Confusion has persisted regarding their mechanism of generation, variability and energy budget, however, owing to the lack of in situ data from the Luzon Strait, where extreme flow conditions make measurements difficult. Here we use new observations and numerical models to (1) show that the waves begin as sinusoidal disturbances rather than arising from sharp hydraulic phenomena, (2) reveal the existence of >200-metre-high breaking internal waves in the region of generation that give rise to turbulence levels >10,000 times that in the open ocean, (3) determine that the Kuroshio western boundary current noticeably refracts the internal wave field emanating from the Luzon Strait, and (4) demonstrate a factor-of-two agreement between modelled and observed energy fluxes, which allows us to produce an observationally supported energy budget of the region. Together, these findings give a cradle-to-grave picture of internal waves on a basin scale, which will support further improvements of their representation in numerical climate predictions

    Relationship of edge localized mode burst times with divertor flux loop signal phase in JET

    A phase relationship is identified between the occurrence times of sequential edge localized modes (ELMs) in a set of H-mode tokamak plasmas and the voltage measured in full flux azimuthal loops in the divertor region. We focus on plasmas in the Joint European Torus where a steady H-mode is sustained over several seconds, during which ELMs are observed in the Be II emission at the divertor. The ELMs analysed arise from intrinsic ELMing, in that there is no deliberate intent to control the ELMing process by external means. We use ELM timings derived from the Be II signal to perform direct time domain analysis of the full flux loop VLD2 and VLD3 signals, which provide a high cadence global measurement proportional to the voltage induced by changes in poloidal magnetic flux. Specifically, we examine how the time interval between pairs of successive ELMs is linked to the time-evolving phase of the full flux loop signals. Each ELM produces a clear early pulse in the full flux loop signals, whose peak time is used to condition our analysis. The arrival time of the following ELM, relative to this pulse, is found to fall into one of two categories: (i) prompt ELMs, which are directly paced by the initial response seen in the flux loop signals; and (ii) all other ELMs, which occur after the initial response of the full flux loop signals has decayed in amplitude. The times at which ELMs in category (ii) occur, relative to the first ELM of the pair, are clustered at times when the instantaneous phase of the full flux loop signal is close to its value at the time of the first ELM