    MutationDistiller: user-driven identification of pathogenic DNA variants

    MutationDistiller is a freely available online tool for user-driven analysis of Whole Exome Sequencing data. It offers a user-friendly interface aimed at clinicians and researchers who are not necessarily bioinformaticians. MutationDistiller combines MutationTaster’s pathogenicity predictions with a phenotype-based approach. Phenotypic information is not limited to symptoms included in the Human Phenotype Ontology (HPO), but may also comprise clinical diagnoses and the suspected mode of inheritance. The search can be restricted to lists of candidate genes (e.g. virtual gene panels) and by tissue-specific gene expression. The inclusion of Gene Ontology (GO) and metabolic pathways facilitates the discovery of hitherto unknown disease genes. In a novel approach, we trained MutationDistiller’s HPO-based prioritization on authentic genotype–phenotype sets obtained from ClinVar and found it to match or outcompete current prioritization tools in terms of accuracy. In the output, the program provides a list of potential disease mutations ordered by the likelihood that the affected gene causes the phenotype. MutationDistiller provides links to gene-related information from various resources. It has been extensively tested by clinicians and their suggestions have been valued in many iterative cycles of revisions. The tool, comprehensive documentation, and examples are freely available at https://www.mutationdistiller.org

    Improving dbNSFP

    IMPROVING dbNSFP Mingyao Lu, B.S. Advisory Professor: Xiaoming Liu, Ph.D. The analysis and interpretation of DNA variation are very important for whole exome sequencing (WES) studies. Genome research has focused on single nucleotide variants (SNVs). Because indels are as important as SNVs, and indels in coding regions in particular are often candidate disease-causing variants, it is necessary to expand the focus to include indel mutations. The goal of my project is to provide an automatic annotation pipeline for WES-based disease studies by extending dbNSFP with a tool for automated indel annotation and deleteriousness prediction. Current sequencing results typically include both SNVs and indels. Although many tools are available to integrate functional predictions and annotations for SNV effects, there are no such tools for indels to my knowledge. Therefore, the aim of this thesis was to add deleteriousness prediction scores, including CADD, SIFT, and PROVEAN, to gene-model-based indel annotation. All of these scores can be calculated on the fly after installing resources locally. A Docker image implementing the indel annotation and deleteriousness prediction has been developed and is ready to be deployed from the cloud

    Bioinformatics advances in saliva diagnostics

    There is a need recognized by the National Institute of Dental & Craniofacial Research and the National Cancer Institute to advance basic, translational and clinical saliva research. The goal of the Salivaomics Knowledge Base (SKB) is to create a data management system and web resource constructed to support human salivaomics research. To maximize the utility of the SKB for retrieval, integration and analysis of data, we have developed the Saliva Ontology and SDxMart. This article reviews the informatics advances in saliva diagnostics made possible by the Saliva Ontology and SDxMart

    PseudoFuN: Deriving functional potentials of pseudogenes from integrative relationships with genes and microRNAs across 32 cancers

    BACKGROUND: Long thought to be "relics" of evolution, pseudogenes have only recently become of medical interest for their regulatory roles in cancer. Often, these regulatory roles are a direct by-product of their close sequence homology to protein-coding genes. Novel pseudogene-gene (PGG) functional associations can be identified through the integration of biomedical data, such as sequence homology, functional pathways, gene expression, pseudogene expression, and microRNA expression. However, not all of this information has been integrated, and almost all previous pseudogene studies relied on 1:1 pseudogene-parent gene relationships without leveraging other homologous genes/pseudogenes. RESULTS: We produce PGG families that expand beyond the current 1:1 paradigm. First, we construct expansive PGG databases by (i) CUDAlign graphics processing unit (GPU) accelerated local alignment of all pseudogenes to gene families (totaling 1.6 billion individual local alignments and >40,000 GPU hours) and (ii) BLAST-based assignment of pseudogenes to gene families. Second, we create an open-source web application (PseudoFuN [Pseudogene Functional Networks]) to search for integrative functional relationships of sequence homology, microRNA expression, gene expression, pseudogene expression, and gene ontology. We produce four "flavors" of CUDAlign-based databases (>462,000,000 PGG pairwise alignments and 133,770 PGG families) that can be queried and downloaded using PseudoFuN. These databases are consistent with previous 1:1 PGG annotation and are also much more powerful, including millions of de novo PGG associations. For example, we find multiple known (e.g., miR-20a-PTEN-PTENP1) and novel (e.g., miR-375-SOX15-PPP4R1L) microRNA-gene-pseudogene associations in prostate cancer. PseudoFuN provides a "one stop shop" for identifying and visualizing thousands of potential regulatory relationships related to pseudogenes in The Cancer Genome Atlas cancers. CONCLUSIONS: Thousands of new PGG associations can be explored in the context of microRNA-gene-pseudogene co-expression and differential expression with a simple-to-use online tool by bioinformaticians and oncologists alike