Genome Annotation using Nanopublications: An Approach to Interoperability of Genetic Data
With the widespread use of Next Generation Sequencing (NGS) technologies, the primary bottleneck of genetic research has shifted from data production to data analysis. However, annotated datasets produced by different research groups are often in different formats, making genetic comparisons and integration with other datasets challenging and time-consuming tasks. Here, we propose a new data interoperability approach that provides an unambiguous (machine-readable) description of genomic annotations based on a novel method of data publishing called nanopublication. A nanopublication is a schema built on top of existing semantic web technologies that consists of three components: an individual assertion (i.e., the genomic annotation); provenance (containing links to the experimental information and data processing steps); and publication info (information about data ownership and rights, allowing each genomic annotation to be citable and its scientific impact tracked). We use nanopublications to demonstrate automatic interoperability between individual genomic annotations from the FANTOM5 consortium (transcription start sites) and the Leiden Open Variation Database (genetic variants). The nanopublications can also be integrated with data from other semantic web frameworks such as COEUS. Exposing legacy information and new NGS data as nanopublications promises tremendous scaling advantages when integrating very large and heterogeneous genetic datasets.
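As a rough illustration of the three-part structure described in this abstract, the sketch below models a nanopublication as three groups of subject-predicate-object triples. It is a minimal sketch only: the class name `Nanopublication`, the `is_complete` helper, and every identifier under `ex:` are hypothetical placeholders chosen for illustration, not the schema or vocabulary used by the authors.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A triple is (subject, predicate, object); plain strings stand in for URIs.
Triple = Tuple[str, str, str]


@dataclass
class Nanopublication:
    """Hypothetical container for the three components named in the abstract."""
    uri: str
    assertion: List[Triple] = field(default_factory=list)         # the genomic annotation
    provenance: List[Triple] = field(default_factory=list)        # experiment and processing links
    publication_info: List[Triple] = field(default_factory=list)  # ownership and rights

    def graphs(self) -> Dict[str, List[Triple]]:
        """Return the three components keyed by name."""
        return {
            "assertion": self.assertion,
            "provenance": self.provenance,
            "publication_info": self.publication_info,
        }

    def is_complete(self) -> bool:
        """True when all three components hold at least one triple."""
        return all(len(triples) > 0 for triples in self.graphs().values())


# All identifiers below are invented placeholders.
example = Nanopublication(
    uri="ex:nanopub/1",
    assertion=[("ex:variant/1", "ex:locatedIn", "ex:region/1")],
    provenance=[("ex:nanopub/1#assertion", "ex:derivedFrom", "ex:experiment/1")],
    publication_info=[("ex:nanopub/1", "ex:rightsHolder", "ex:lab/1")],
)
```

Keeping the assertion separate from provenance and publication info is what lets each annotation be cited and traced on its own; a real implementation would store the three parts as RDF named graphs rather than Python lists.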
Dutch genome diagnostic laboratories accelerated and improved variant interpretation and increased accuracy by sharing data
Each year, diagnostic laboratories in the Netherlands profile thousands of individuals for heritable disease using next-generation sequencing (NGS). This requires pathogenicity classification of millions of DNA variants on the standard 5-tier scale. To reduce time spent on data interpretation and increase data quality and reliability, the nine Dutch labs decided to publicly share their classifications. Variant classifications of nearly 100,000 unique variants were catalogued and compared in a centralized MOLGENIS database. Variants classified by more than one center were labeled as “consensus” when classifications agreed, and shared internationally with LOVD and ClinVar. When classifications were opposed (LB/B vs. LP/P), they were labeled “conflicting”, while other non-consensus observations were labeled “no consensus”. We assessed our classifications using the InterVar software to compare to ACMG 2015 guidelines, showing 99.7% overall consistency with only 0.3% discrepancies. Differences in classifications between Dutch labs or between Dutch labs and ACMG were mainly present in genes with low penetrance or for late-onset disorders and highlight limitations of the current 5-tier classification system. The data sharing boosted the quality of DNA diagnostics in Dutch labs, an initiative we hope will be followed internationally. Recently, a positive match with a case from outside our consortium resulted in a more definite disease diagnosis.
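The labeling rule described in this abstract can be sketched as a small function. This is an illustrative sketch, not the consortium's implementation: the function name `label_variant` is hypothetical, the tier codes `B`, `LB`, `VUS`, `LP`, `P` are the usual abbreviations for the 5-tier scale, and the code assumes that "agreed" means every lab reported the identical tier, which the abstract does not spell out.

```python
from typing import Iterable, Optional

BENIGN = {"B", "LB"}       # benign, likely benign
PATHOGENIC = {"LP", "P"}   # likely pathogenic, pathogenic
TIERS = BENIGN | PATHOGENIC | {"VUS"}  # VUS: variant of uncertain significance


def label_variant(classifications: Iterable[str]) -> Optional[str]:
    """Label one variant from the tiers reported by each lab.

    Returns None when fewer than two labs classified the variant,
    since the labels only apply to variants seen by more than one center.
    """
    tiers = list(classifications)
    unknown = set(tiers) - TIERS
    if unknown:
        raise ValueError(f"unknown tier(s): {sorted(unknown)}")
    if len(tiers) < 2:
        return None

    distinct = set(tiers)
    # Assumption: agreement means all labs reported the same tier.
    if len(distinct) == 1:
        return "consensus"
    # Opposed: at least one LB/B call and at least one LP/P call.
    if distinct & BENIGN and distinct & PATHOGENIC:
        return "conflicting"
    # Any other disagreement, for example VUS against LP.
    return "no consensus"
```

For example, `label_variant(["LP", "LP"])` yields `"consensus"`, `label_variant(["LB", "P"])` yields `"conflicting"`, and `label_variant(["VUS", "LP"])` yields `"no consensus"`. If agreement were instead defined on the grouped categories (LB/B, VUS, LP/P), only the consensus check would need to change.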