3 research outputs found

    Bioinformatic tools to alleviate the annotation bottleneck within precision oncology

    In the era of advanced ability to perform complex genomic sequencing, precision oncology has been adopted as the ideal paradigm for optimization of outcomes for patients with cancer. However, despite technological advances in all aspects of the massively parallel sequencing pipeline, the application of precision oncology to every clinical workflow has been unattainable. Suboptimal adoption of custom medicine within oncology is attributable to the annotation bottleneck, which currently demands inordinate manual and computational effort to complete. Alleviation of the annotation bottleneck requires co-development of bioinformatic strategies and analysis knowledgebanks to automate variant identification and variant annotation for clinical utility. The body of work presented here provides validated methods to alleviate the annotation bottleneck within the precision oncology pipeline. The introduction describes the specific aspects of the massively parallel sequencing pipeline that require development. Subsequently, we present three tools (DeepSVR, a Manual Review Standard Operating Procedure, and OpenCAP) that were developed to improve upon existing methods for variant identification and annotation. DeepSVR provides a machine learning approach to improve automated somatic variant calling by reducing the false positives associated with sequencing pipelines that are observable by manual reviewers. The Manual Review Standard Operating Procedure provides a systematic and standardized approach for manual review of aligned sequencing reads for sequencing data with paired tumor and normal samples. Finally, the Open-sourced CIViC Annotation Pipeline (OpenCAP) serves as software to create rationally designed clinical capture panels that are linked to clinical relevance summaries to improve library preparation and clinical annotation. The combined utility of these three tools for alleviation of the analysis bottleneck is demonstrated using a clinical example. 
Specifically, we developed a targeted clinical capture panel (MyeloSeq) to evaluate recurrent mutations observed in myelodysplastic syndrome (MDS) and acute myeloid leukemia (AML). The MyeloSeq sequencing pipeline incorporated many of the tools described above for variant identification and annotation and provides a succinct output report for physician consumption. When surveying physicians who utilize the MyeloSeq panel, we observed that over 44% of physicians changed their treatment protocol based on the MyeloSeq results. This included 39 new therapeutics prescribed, 4 definitive diagnoses, and 13 changes in treatment plan (stem-cell transplant versus chemotherapy) based on prognostic indicators. This example demonstrates that the developed tools help alleviate the analysis bottleneck within precision oncology and will improve physicians' ability to integrate precision medicine into clinical workflow.

    Variant information systems for precision oncology

    Background: The decreasing cost of obtaining high-quality calls of genomic variants and the increasing availability of clinically relevant data on such variants are important drivers for personalized oncology. To allow rational genome-based decisions in diagnosis and treatment, clinicians need intuitive access to up-to-date and comprehensive variant information, encompassing, for instance, prevalence in populations and diseases, functional impact at the molecular level, associations to druggable targets, or results from clinical trials. In practice, collecting such comprehensive information on genomic variants is difficult since the underlying data is dispersed over a multitude of distributed, heterogeneous, sometimes conflicting, and quickly evolving data sources. To work efficiently, clinicians require powerful Variant Information Systems (VIS) which automatically collect and aggregate available evidence from such data sources without suppressing existing uncertainty. Methods: We address the most important cornerstones of modeling a VIS: we draw on emerging community standards regarding the necessary breadth of variant information and procedures for their clinical assessment, long-standing experience in implementing biomedical databases and information systems, our own clinical record of diagnosis and treatment of cancer patients based on molecular profiles, and an extensive literature review to derive a set of design principles along which we develop a relational data model for variant-level data. In addition, we characterize a number of public variant data sources and describe a data integration pipeline to integrate their data into a VIS. Results: We provide a number of contributions that are fundamental to the design and implementation of a comprehensive, operational VIS. 
In particular, we (a) present a relational data model to accurately reflect data extracted from public databases relevant for clinical variant interpretation, (b) introduce a fault-tolerant and performant integration pipeline for public variant data sources, and (c) offer recommendations regarding a number of intricate challenges encountered when integrating variant data for clinical interpretation. Conclusion: The analysis of requirements for representation of variant-level data in an operational data model, together with the implementation-ready relational data model presented here and the instructional description of methods to acquire comprehensive information to fill it, is an important step towards variant information systems for genomic medicine.