
    Systematic evaluation of NIPT aneuploidy detection software tools with clinically validated NIPT samples

    Non-invasive prenatal testing (NIPT) is a powerful screening method for fetal aneuploidy detection, relying on laboratory and computational analysis of cell-free DNA. Although several published computational NIPT analysis tools are available, no prior comprehensive, head-to-head accuracy comparison of the various tools has been published. Here, we compared the outcome accuracies obtained for clinically validated samples with five commonly used computational NIPT aneuploidy analysis tools (WisecondorX, NIPTeR, NIPTmer, RAPIDR, and GIPseq) across various sequencing depths (coverage) and fetal DNA fractions. The sample set included cases of fetal trisomy 21 (Down syndrome), trisomy 18 (Edwards syndrome), and trisomy 13 (Patau syndrome). We determined that all of the compared tools were considerably affected by lower sequencing depths, such that increasing proportions of undetected trisomy cases (false negatives) were observed as the sequencing depth decreased. We summarised our benchmarking results and highlighted the advantages and disadvantages of each computational NIPT software tool. To conclude, trisomy detection for lower-coverage NIPT samples (e.g. 2.5M reads per sample) is technically possible but can, with some NIPT tools, produce troubling rates of inaccurate trisomy detection, especially in samples with a low fetal fraction (FF).

    Author summary: Non-invasive prenatal testing analysis relies on computational algorithms to infer chromosomal aneuploidies, such as chromosome 21 trisomy in the case of Down syndrome. However, the performance of these algorithms has not previously been compared on the same clinically validated data. Here we conducted a head-to-head comparison of WGS-based NIPT aneuploidy detection tools. Our findings indicate that at and below 2.5M reads per sample, the least accurate algorithm would miss almost a third of trisomy cases. Furthermore, we describe and quantify a previously undocumented aneuploidy risk uncertainty that is mainly relevant at very low sequencing coverage (at and below 1.25M reads per sample) and could, in the worst-case scenario, lead to a false negative rate of 245 undetected trisomies per 1,000 trisomy cases. Our findings underscore the importance of informed selection of NIPT software tools in combination with sequencing coverage, which directly impacts NIPT sequencing cost and accuracy.
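    WGS-based NIPT tools such as those compared here generally quantify the over- or under-representation of reads mapping to a chromosome against a euploid reference set. A minimal sketch of that chromosome-level z-score idea follows; the read counts and reference statistics are hypothetical, and the actual tools add GC correction and bin-level normalisation on top of this.

```python
# Minimal sketch of chromosome-level z-score NIPT calling. Counts and
# reference statistics are hypothetical; real tools (e.g. WisecondorX,
# NIPTeR) add GC correction and bin-level normalisation on top of this.

def chromosome_fraction(reads_on_chr: int, total_reads: int) -> float:
    """Fraction of all mapped reads that fall on the chromosome of interest."""
    return reads_on_chr / total_reads

def z_score(sample_fraction: float, ref_mean: float, ref_sd: float) -> float:
    """Standard score of the sample against euploid reference pregnancies."""
    return (sample_fraction - ref_mean) / ref_sd

# Hypothetical example: chr21 reads in a 2.5M-read sample.
frac = chromosome_fraction(reads_on_chr=34_200, total_reads=2_500_000)
z = z_score(frac, ref_mean=0.0130, ref_sd=0.0002)

# A common simplified decision rule: z > 3 flags increased trisomy risk.
print(f"chr21 fraction = {frac:.5f}, z = {z:.2f}, flagged = {z > 3}")
```

    Lower coverage inflates the sampling noise in the chromosome fraction, which is one reason false negative rates rise as reads per sample drop.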

    Recommendations for whole genome sequencing in diagnostics for rare diseases

    In 2016, guidelines for diagnostic Next Generation Sequencing (NGS) were published by EuroGentest to assist laboratories in the implementation and accreditation of NGS in a diagnostic setting. These guidelines mainly focused on Whole Exome Sequencing (WES) and targeted (gene panel) sequencing for detecting small germline variants (Single Nucleotide Variants (SNVs) and insertions/deletions (indels)). Since then, Whole Genome Sequencing (WGS) has been increasingly introduced in the diagnosis of rare diseases, as WGS allows the simultaneous detection of SNVs, Structural Variants (SVs) and other types of variants such as repeat expansions. The use of WGS in diagnostics warrants the re-evaluation and update of the previously published guidelines. This work was jointly initiated by EuroGentest and the Horizon 2020 project Solve-RD. Statements from the 2016 guidelines have been reviewed in the context of WGS and updated where necessary. The aim of these recommendations is primarily to list the points to consider for clinical (laboratory) geneticists, bioinformaticians, and (non-)geneticists; to provide technical advice; and to aid clinical decision-making and the reporting of results.

    A Development Framework for Data Analytics in Genomics

    The project aim is the deployment of a scalable high-performance data analytics infrastructure for applications in human genetics research and clinical genetic diagnosis. NGS data generation has reached an explosive phase, with data throughput currently doubling every six months. High-performance data analytics has become essential to tackle this massive computing challenge, as NGS will shortly rival the most data- and computing-intensive areas of science. Performing computation on the massive volumes of information generated by NGS requires new technologies and methodologies, and choosing the right tools and methods is the first challenge on the path to this goal. The collection, storage and retrieval of large amounts of data from multiple experiments require knowledge of cluster or cloud computing infrastructure deployment; parallelizing the computation, in turn, leads us to Hadoop/MapReduce strategies (sketched below). From an interface and application point of view, we need a Rich Client Platform because of the architecture and flexibility such platforms offer to continually growing applications. Rich clients are often applications that are extendable via plugins and modules, so they are able to solve more than one problem: they can also tackle related problems, as well as problems completely foreign to their original purpose. The characteristics of a rich client include a flexible and modular application architecture, platform independence, adaptability to the end user, the ability to work online as well as offline, simplified distribution to the end user, and simplified updating of the client. The final goal of this thesis is to develop an application built on NGS and related medical information that is distributed across several sites; we must define rules and roles based on institute type, respecting each institute's own privacy requirements. We believe that with this system researchers can share their experience and knowledge to reach more reliable results. Since each institute has its own facilities and structures, the main challenge will be to define a common language and protocol that connects these structures with the lowest cost and the fewest changes.
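    As a minimal illustration of the Hadoop/MapReduce strategy mentioned above, the sketch below counts mapped reads per chromosome with explicit map and reduce phases, run in-process in plain Python; the simplified SAM-like records and shard layout are assumptions for illustration, not part of the project's actual pipeline.

```python
# In-process map/reduce sketch for one NGS task: counting mapped reads
# per chromosome. In a real deployment the same two phases would run as
# Hadoop jobs over many cluster nodes.
from collections import defaultdict
from itertools import chain

def map_phase(sam_lines):
    """Map: emit (chromosome, 1) for every mapped alignment record."""
    for line in sam_lines:
        if line.startswith("@"):                   # skip SAM header lines
            continue
        chrom = line.rstrip("\n").split("\t")[2]   # RNAME column
        if chrom != "*":                           # '*' marks an unmapped read
            yield chrom, 1

def reduce_phase(pairs):
    """Reduce: sum the counts emitted for each chromosome key."""
    totals = defaultdict(int)
    for chrom, count in pairs:
        totals[chrom] += count
    return dict(totals)

# Two hypothetical input shards with simplified SAM-like records
# (only the first three columns matter to this sketch).
shard1 = ["@HD\tVN:1.6", "r1\t0\tchr21", "r2\t4\t*"]
shard2 = ["r3\t0\tchr21", "r4\t0\tchr1"]
counts = reduce_phase(chain.from_iterable(map_phase(s) for s in (shard1, shard2)))
print(counts)   # {'chr21': 2, 'chr1': 1}
```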

    Galahad: a web server for drug effect analysis from gene expression

    Galahad (https://galahad.esat.kuleuven.be) is a web-based application for the analysis of drug effects. It provides an intuitive interface for anybody interested in leveraging microarray data to gain insight into the pharmacological effects of a drug, mainly the identification of candidate targets, elucidation of the mode of action, and understanding of off-target effects. The core of Galahad is a network-based analysis method for gene expression. As input, Galahad takes raw Affymetrix human microarray data from treatment-versus-control experiments and provides quality control and data exploration tools, as well as computation of differential expression. Alternatively, differential expression values can be uploaded directly. Using these differential expression values, drug target prioritization and both pathway and disease enrichment can be calculated and visualized. Drug target prioritization is based on the integration of the gene expression data with a functional protein association network. The website is free and open to all, and there is no login requirement.
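    The differential-expression step at the core of such a treatment-versus-control analysis can be sketched as a per-gene log fold change with a two-sample t-test; the sketch below is a generic illustration on simulated data, not Galahad's exact statistical model.

```python
# Generic sketch of the differential-expression step in a treatment-vs-
# control microarray analysis: per-gene log fold change plus a two-sample
# t-test. This is an illustration, not Galahad's exact statistical model.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_genes = 1000
# Hypothetical log2 expression matrices: rows = genes, columns = arrays.
control = rng.normal(8.0, 1.0, size=(n_genes, 4))
treated = rng.normal(8.0, 1.0, size=(n_genes, 4))
treated[:50] += 2.0                       # spike in 50 'responding' genes

log_fc = treated.mean(axis=1) - control.mean(axis=1)
t_stat, p_val = stats.ttest_ind(treated, control, axis=1)

# Rank genes by evidence of differential expression.
for g in np.argsort(p_val)[:5]:
    print(f"gene {g}: log2FC = {log_fc[g]:+.2f}, p = {p_val[g]:.2e}")
```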

    NGS-Logistics: Federated analysis of NGS sequence variants across multiple locations

    As many personal genomes are being sequenced, collaborative analysis of those genomes has become essential. However, analysis of personal genomic data raises important privacy and confidentiality issues. We propose a methodology for the federated analysis of sequence variants from personal genomes. Specific base-pair positions and/or regions can be queried both for samples to which the user has access and for the whole population. The resulting statistics do not breach data confidentiality but allow further exploration of the data; researchers can negotiate access to relevant samples through pseudonymous identifiers. This approach minimizes the impact on data confidentiality while enabling powerful data analysis by providing access to important rare samples. Our methodology is implemented in an open source tool called NGS-Logistics, freely available at https://ngsl.esat.kuleuven.be.
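    The federated-query idea (only aggregate counts plus pseudonymous identifiers leave each site) can be sketched as follows; the site API, data structures, and variant shown are hypothetical, not NGS-Logistics' actual interface.

```python
# Sketch of a federated variant query: each site returns only aggregate
# counts plus pseudonymous sample identifiers, so raw genotypes never
# leave the site. The site API and data are hypothetical, not the actual
# NGS-Logistics interface.
from dataclasses import dataclass

@dataclass
class SiteResult:
    carriers: int            # samples at this site carrying the variant
    total: int               # samples at this site that were queried
    pseudonyms: list[str]    # pseudonymous IDs for negotiating access

def query_site(site_db: dict, chrom: str, pos: int, alt: str) -> SiteResult:
    """Run the query locally at one site; only aggregates leave the site."""
    hits = [pid for pid, variants in site_db.items()
            if (chrom, pos, alt) in variants]
    return SiteResult(carriers=len(hits), total=len(site_db), pseudonyms=hits)

# Hypothetical per-site databases keyed by pseudonymous sample ID.
site_a = {"PSN-001": {("chr7", 117559590, "G")}, "PSN-002": set()}
site_b = {"PSN-101": set(), "PSN-102": {("chr7", 117559590, "G")}}

results = [query_site(db, "chr7", 117559590, "G") for db in (site_a, site_b)]
carriers = sum(r.carriers for r in results)
total = sum(r.total for r in results)
print(f"population frequency: {carriers}/{total}")
```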

    Beegle: From literature mining to disease-gene discovery

    Disease-gene identification is a challenging process with multiple applications in functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query; it then integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes, generating novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which the discovery engine improves by 12.6% in the top 5% of prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/.
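    The recall figure quoted above can be made concrete with a small evaluation sketch: given a ranked gene list and a held-out set of known disease genes, recall@k is the fraction of known genes recovered in the top k results. The gene names below are placeholders, not data from the Beegle evaluation.

```python
# Sketch of the recall@k metric used to evaluate a search engine such as
# Beegle: the fraction of held-out known disease genes recovered in the
# top k of the ranked list. Gene names below are placeholders.

def recall_at_k(ranked_genes: list[str], known_genes: set[str], k: int) -> float:
    """Fraction of known genes that appear within the top k results."""
    hits = sum(1 for gene in ranked_genes[:k] if gene in known_genes)
    return hits / len(known_genes)

ranked = [f"GENE{i}" for i in range(1, 501)]       # hypothetical ranking
known = {"GENE3", "GENE42", "GENE150", "GENE400"}  # hypothetical truth set
print(recall_at_k(ranked, known, k=100))           # 2 of 4 recovered -> 0.5
```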