    Bioinformatics Workflows for Genomic Variant Discovery, Interpretation and Prioritization

    Next-generation sequencing (NGS) techniques allow high-throughput detection of a vast amount of variations in a cost-efficient manner. However, there are still inconsistencies and debates about how to process and analyse this ‘big data’. To accurately extract clinically relevant information from genomics data, choosing appropriate tools, knowing how to best utilize them and interpreting the results correctly are crucial. This chapter reviews state-of-the-art bioinformatics approaches in clinically relevant genomic variant detection. Best practices of reads-to-variant discovery workflows for germline and somatic short genomic variants are presented along with the most commonly utilized tools for each step. Additionally, methods for detecting structural variations are overviewed. Finally, approaches and current guidelines for clinical interpretation of genomic variants are discussed. As emphasized in this chapter, data processing and variant discovery steps are relatively well-understood. The differences in prioritization algorithms, on the other hand, can be perplexing, thus creating a bottleneck during interpretation. This review aims to shed light on the pros and cons of these differences to help experts make more informed decisions.

    Resources and tools for rare disease variant interpretation

    Collectively, rare genetic disorders affect a substantial portion of the world’s population. In most cases, those affected face difficulties in receiving a clinical diagnosis and genetic characterization. The understanding of the molecular mechanisms of these diseases and the development of therapeutic treatments for patients are also challenging. However, the application of recent advancements in genome sequencing/analysis technologies and computer-aided tools for predicting phenotype-genotype associations can bring significant benefits to this field. In this review, we highlight the most relevant online resources and computational tools for genome interpretation that can enhance the diagnosis, clinical management, and development of treatments for rare disorders. Our focus is on resources for interpreting single nucleotide variants. Additionally, we present use cases for interpreting genetic variants in clinical settings and review the limitations of these results and prediction tools. Finally, we have compiled a curated set of core resources and tools for analyzing rare disease genomes. Such resources and tools can be utilized to develop standardized protocols that will enhance the accuracy and effectiveness of rare disease diagnosis.

    Bioinformatic Challenges Detecting Genetic Variation in Precision Medicine Programs

    Precision medicine programs to identify clinically relevant genetic variation have been revolutionized by access to increasingly affordable high-throughput sequencing technologies. A decade of continual drops in per-base sequencing costs means it is now feasible to sequence an individual patient genome and interrogate all classes of genetic variation for less than $1,000 USD. However, while advances in these technologies have greatly simplified the ability to obtain patient sequence information, the timely analysis and interpretation of variant information remains a challenge for the rollout of large-scale precision medicine programs. This review will examine the challenges and potential solutions that exist in identifying predictive genetic biomarkers and pharmacogenetic variants in a patient and discuss the larger bioinformatic challenges likely to emerge in the future. It will examine how both software and hardware development are aiming to overcome issues in short read mapping, variant detection and variant interpretation. It will discuss the current state of the art for genetic disease and the remaining challenges to overcome for complex disease. Success across all types of disease will require novel statistical models and software in order to ensure precision medicine programs realize their full potential now and into the future.

    Best practices for bioinformatic characterization of neoantigens for clinical utility

    Neoantigens are newly formed peptides created from somatic mutations that are capable of inducing tumor-specific T cell recognition. Recently, researchers and clinicians have leveraged next generation sequencing technologies to identify neoantigens and to create personalized immunotherapies for cancer treatment. To create a personalized cancer vaccine, neoantigens must be computationally predicted from matched tumor-normal sequencing data, and then ranked according to their predicted capability in stimulating a T cell response. This candidate neoantigen prediction process involves multiple steps, including somatic mutation identification, HLA typing, peptide processing, and peptide-MHC binding prediction. The general workflow has been utilized for many preclinical and clinical trials, but there is no current consensus approach and few established best practices. In this article, we review recent discoveries, summarize the available computational tools, and provide analysis considerations for each step, including neoantigen prediction, prioritization, delivery, and validation methods. In addition to reviewing the current state of neoantigen analysis, we provide practical guidance, specific recommendations, and extensive discussion of critical concepts and points of confusion in the practice of neoantigen characterization for clinical use. Finally, we outline necessary areas of development, including the need to improve HLA class II typing accuracy, to expand software support for diverse neoantigen sources, and to incorporate clinical response data to improve neoantigen prediction algorithms. The ultimate goal of neoantigen characterization workflows is to create personalized vaccines that improve patient outcomes in diverse cancer types.

    Commonalities across computational workflows for uncovering explanatory variants in undiagnosed cases

    PURPOSE: Genomic sequencing has become an increasingly powerful and relevant tool to be leveraged for the discovery of genetic aberrations underlying rare, Mendelian conditions. Although the computational tools incorporated into diagnostic workflows for this task are continually evolving and improving, we nevertheless sought to investigate commonalities across sequencing processing workflows to reveal consensus and standard practice tools and highlight exploratory analyses where technical and theoretical method improvements would be most impactful. METHODS: We collected details regarding the computational approaches used by a genetic testing laboratory and 11 clinical research sites in the United States participating in the Undiagnosed Diseases Network via meetings with bioinformaticians, online survey forms, and analyses of internal protocols. RESULTS: We found that tools for processing genomic sequencing data can be grouped into four distinct categories. Whereas well-established practices exist for initial variant calling and quality control steps, there is substantial divergence across sites in later stages for variant prioritization and multimodal data integration, demonstrating a diversity of approaches for solving the most mysterious undiagnosed cases. CONCLUSION: The largest differences across diagnostic workflows suggest that advances in structural variant detection, noncoding variant interpretation, and integration of additional biomedical data may be especially promising for solving chronically undiagnosed cases.

    Next-generation sequencing: exploring clinical applications of Targeted Gene Panel and Whole Exome Sequencing data

    Next-generation sequencing (NGS) technologies and their applications are increasingly used in medicine to elucidate the molecular basis of Mendelian diseases. Although NGS is a powerful research tool, the transition in data analysis from traditional sequencing techniques to NGS is still under way. The first part of this work addresses analytical aspects of this switch-over, focusing on the Ion Torrent Personal Genome Machine platform. This platform is widely used for sequencing gene panels, as this application demands lower data throughput. We present indicators suitable for evaluating the quality of sequencing runs, as well as a strategy based on depth-of-coverage values for evaluating amplicon performance in different scenarios. NGS has also enabled large-scale population studies that are changing our understanding of human genetic variation. One example is the so-called silent mutations, which are being implicated as causes of human disease. The second part of this work investigates the pathogenicity of synonymous single nucleotide polymorphisms (sSNP) based on public data obtained from the Exome Aggregation Consortium (ExAC) (exac.broadinstitute.org/), using the software Silent Variant Analysis (SilVA) (compbio.cs.toronto.edu/silva/) and other sources to gather additional information about affected protein domains, mRNA folding and functional consequences, aiming to provide a landscape of the harmfulness of sSNP in more than 60,000 human exomes. We show that, of 1,691,045 synonymous variants, a total of 26,034 were classified as pathogenic by SilVA, with allele frequency lower than 0.05. In silico functional analysis revealed that the pathogenic synonymous variants found are involved in important biological processes, such as cellular regulation, metabolism and transport. By exposing a scenario of pathogenic synonymous variants in human exomes, we conclude that filtering out sSNP in prioritization workflows is reasonable, although in some specific cases sSNP should be considered. Future research in this field may provide a clear picture of the role of such variations in genetic diseases.

    Knowledge Discovery in Biological Databases for Revealing Candidate Genes Linked to Complex Phenotypes

    Genetics and “omics” studies designed to uncover genotype to phenotype relationships often identify large numbers of potential candidate genes, among which the causal genes are hidden. Scientists generally lack the time and technical expertise to review all relevant information available from the literature, from key model species and from a potentially wide range of related biological databases in a variety of data formats with variable quality and coverage. Computational tools are needed for the integration and evaluation of heterogeneous information in order to prioritise candidate genes and components of interaction networks that, if perturbed through potential interventions, have a positive impact on the biological outcome in the whole organism without producing negative side effects. Here we review several bioinformatics tools and databases that play an important role in biological knowledge discovery and candidate gene prioritization. We conclude with several key challenges that need to be addressed in order to facilitate biological knowledge discovery in the future.