
    Automated extraction of potential migraine biomarkers using a semantic graph

    Problem: Biomedical literature and databases contain important clues for the identification of potential disease biomarkers. However, searching these enormous knowledge reservoirs and integrating findings across heterogeneous sources is costly and difficult. Here we demonstrate how semantically integrated knowledge, extracted from biomedical literature and structured databases, can be used to automatically identify potential migraine biomarkers.
    Method: We used a knowledge graph containing more than 3.5 million biomedical concepts and 68.4 million relationships. Biochemical compound concepts were filtered and ranked by their potential as biomarkers based on their connections to a subgraph of migraine-related concepts. The ranked results were evaluated against the results of a systematic literature review performed manually by migraine researchers. Weight points were assigned to these reference compounds to indicate their relative importance.
    Results: Ranked results automatically generated by the knowledge graph were highly consistent with the results of the manual literature review. Of 222 reference compounds, 163 (73%) ranked in the top 2000, accounting for 547 of the 644 (85%) weight points assigned to the reference compounds. An extensive error analysis was performed for reference compounds that did not rank near the top of the list. When evaluating the overall performance, we obtained a ROC-AUC of 0.974.
    Discussion: Semantic knowledge graphs composed of information integrated from multiple and varying sources can assist researchers in identifying potential disease biomarkers.
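    A minimal sketch of the ranking idea described above: candidate compound nodes are scored by how strongly they connect into a disease-related subgraph, and the ranking is evaluated with ROC-AUC against a manually curated reference set. The toy graph, node names, and the simple edge-count score are illustrative assumptions, not the paper's actual scoring function.

```python
# Rank candidate compound nodes by the number of edges they have into a
# disease-related subgraph, then score the ranking with ROC-AUC.
import networkx as nx
from sklearn.metrics import roc_auc_score

g = nx.Graph()
g.add_edges_from([
    ("CGRP", "migraine"), ("CGRP", "vasodilation"), ("vasodilation", "migraine"),
    ("serotonin", "migraine"), ("glucose", "metabolism"),
])

disease_subgraph = {"migraine", "vasodilation"}   # migraine-related concepts
candidates = ["CGRP", "serotonin", "glucose"]     # biochemical compound concepts

# Score each candidate by how many of its neighbours lie inside the disease subgraph.
scores = {c: sum(1 for n in g.neighbors(c) if n in disease_subgraph) for c in candidates}
ranking = sorted(candidates, key=scores.get, reverse=True)
print("ranking:", ranking)

# Evaluate against a hypothetical reference set of known biomarkers.
reference = {"CGRP", "serotonin"}
y_true = [1 if c in reference else 0 for c in candidates]
y_score = [scores[c] for c in candidates]
print("ROC-AUC:", roc_auc_score(y_true, y_score))
```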

    The value of semantics in biomedical knowledge graphs

    Knowledge graphs use a graph-based data model to represent knowledge of the real world. They consist of nodes, which represent entities of interest such as diseases or proteins, and edges, which represent potentially different relations between these entities. Semantic properties can be attached to these nodes and edges, indicating the classes of entities they represent (e.g. gene, disease), the predicates that indicate the types of relationships between the nodes (e.g. stimulates, treats), and provenance that provides references to the sources of these relationships. Modelling knowledge as a graph emphasizes the interrelationships between the entities, making knowledge graphs a useful tool for performing computational analyses in domains in which complex interactions and sequences of events exist, such as biomedicine. Semantic properties provide additional information and are assumed to benefit such computational analyses, but the added value of these properties has not yet been extensively investigated. This thesis therefore develops and compares computational methods that use these properties, and applies them to biomedical tasks: biomarker identification, drug repurposing, drug efficacy screening, identifying disease trajectories, and identifying genes targeted by disease-associated SNPs located in the non-coding part of the genome. In general, we find that methods which use concept classes, predicates, or provenance achieve superior performance over methods that do not use them. We thereby demonstrate the added value of these semantic properties for computational analyses performed on biomedical knowledge graphs.
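    A small sketch of the semantic properties described above: each edge carries entity classes, a predicate, and provenance, which enables semantics-aware queries. The concept names and the example query are illustrative assumptions, not data from the thesis.

```python
# Represent a semantically annotated knowledge-graph edge and filter by predicate.
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    subject: str          # e.g. a drug or protein
    subject_class: str    # semantic class of the subject node
    predicate: str        # relation type, e.g. "treats", "associated_with"
    obj: str
    obj_class: str
    provenance: str       # reference to the source of the assertion

graph = [
    Edge("sumatriptan", "drug", "treats", "migraine", "disease", "PMID:0000001"),
    Edge("CGRP", "protein", "associated_with", "migraine", "disease", "PMID:0000002"),
]

# A semantics-aware query: keep only edges whose predicate is "treats".
for e in (e for e in graph if e.predicate == "treats"):
    print(f"{e.subject} --{e.predicate}--> {e.obj} (source: {e.provenance})")
```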

    Dual-3DM3-AD: Mixed Transformer based Semantic Segmentation and Triplet Pre-processing for Early Multi-Class Alzheimer’s Diagnosis

    Alzheimer’s Disease (AD) is a widespread, chronic, irreversible, and degenerative condition, and its early detection during the prodromal stage is of utmost importance. Typically, AD studies rely on single data modalities, such as MRI or PET, for making predictions. Nevertheless, combining metabolic and structural data can offer a comprehensive perspective on AD staging analysis. To address this goal, this paper introduces an innovative multi-modal fusion-based approach named Dual-3DM3-AD. This model is proposed for accurate and early Alzheimer’s diagnosis by considering both MRI and PET image scans. Initially, we pre-process both images in terms of noise reduction, skull stripping, and 3D image conversion using the Quaternion Non-local Means Denoising Algorithm (QNLM), a morphology function, and the Block Divider Model (BDM), respectively, which enhances image quality. Furthermore, we adapt a Mixed Transformer with Furthered U-Net for performing semantic segmentation and minimizing complexity. The Dual-3DM3-AD model consists of a multi-scale feature extraction module for extracting appropriate features from both segmented images. The extracted features are then aggregated using the Densely Connected Feature Aggregator Module (DCFAM) to utilize both features. Finally, a multi-head attention mechanism is adapted for feature dimensionality reduction, and a softmax layer is applied for multi-class Alzheimer’s diagnosis. The proposed Dual-3DM3-AD model is compared with several baseline approaches using several performance metrics. The final results show that the proposed work achieves 98% accuracy, 97.8% sensitivity, 97.5% specificity, 98.2% F-measure, and better ROC curves, outperforming existing models in multi-class Alzheimer’s diagnosis. © 2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
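    A schematic sketch of a multi-modal fusion head in the spirit of the pipeline above: features from MRI and PET branches are treated as two tokens, fused with multi-head attention, and classified with a softmax layer. Dimensions, layer choices, and the class count are assumptions for illustration, not the published Dual-3DM3-AD architecture.

```python
# Fuse per-modality feature vectors with multi-head attention and classify.
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    def __init__(self, feat_dim: int = 256, num_classes: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, mri_feat: torch.Tensor, pet_feat: torch.Tensor) -> torch.Tensor:
        tokens = torch.stack([mri_feat, pet_feat], dim=1)  # (batch, 2, feat_dim)
        fused, _ = self.attn(tokens, tokens, tokens)       # self-attention across modalities
        pooled = fused.mean(dim=1)                         # aggregate the two modality tokens
        return torch.softmax(self.classifier(pooled), dim=-1)

head = FusionHead()
mri = torch.randn(8, 256)    # placeholder features from the MRI branch
pet = torch.randn(8, 256)    # placeholder features from the PET branch
print(head(mri, pet).shape)  # (8, 4): per-class probabilities
```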

    Generation and Applications of Knowledge Graphs in Systems and Networks Biology

    The acceleration in the generation of data in the biomedical domain has necessitated the use of computational approaches to assist in its interpretation. However, these approaches rely on the availability of high-quality, structured, formalized biomedical knowledge. This thesis has two goals: to improve methods for curation and semantic data integration to generate high-granularity biological knowledge graphs, and to develop novel methods for using prior biological knowledge to propose new biological hypotheses. The first two publications describe an ecosystem for handling biological knowledge graphs encoded in the Biological Expression Language throughout the stages of curation, visualization, and analysis. The next two publications describe the reproducible acquisition and integration of high-granularity knowledge with low contextual specificity from structured biological data sources on a massive scale, and support the semi-automated curation of new content at high speed and precision. After building the ecosystem and acquiring content, the last three publications in this thesis demonstrate three different applications of biological knowledge graphs in modeling and simulation. The first demonstrates the use of agent-based modeling for simulation of neurodegenerative disease biomarker trajectories using biological knowledge graphs as priors. The second applies network representation learning to prioritize nodes in biological knowledge graphs based on corresponding experimental measurements in order to identify novel targets. Finally, the third uses biological knowledge graphs and develops algorithms to deconvolute the mechanism of action of drugs, which could also serve to identify drug repositioning candidates. Ultimately, this thesis lays the groundwork for production-level applications of drug repositioning algorithms and other knowledge-driven approaches to analyzing biomedical experiments.
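    A toy illustration of the node-prioritization idea mentioned above: combine a network-derived score (here PageRank over the knowledge graph) with an experimental measurement per node. The graph, measurements, and the simple product used to combine them are assumptions, not the thesis's method.

```python
# Prioritize knowledge-graph nodes by combining structure with experimental data.
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([
    ("GENE_A", "GENE_B"), ("GENE_B", "DISEASE_X"),
    ("GENE_C", "DISEASE_X"), ("GENE_A", "GENE_C"),
])

network_score = nx.pagerank(g)                                # structural importance
measurement = {"GENE_A": 2.5, "GENE_B": 0.3, "GENE_C": 1.8,   # e.g. |log fold change|
               "DISEASE_X": 0.0}

combined = {n: network_score[n] * measurement.get(n, 0.0) for n in g.nodes}
for node, score in sorted(combined.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{node}\t{score:.3f}")
```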

    Mining Novellas from PubMed Abstracts using a Storytelling Algorithm

    Motivation: There are now a multitude of articles published in a diversity of journals providing information about genes, proteins, pathways, and entire processes. Each article investigates particular subsets of a biological process, but to gain insight into the functioning of a system as a whole, we must computationally integrate information across multiple publications. This is especially important in problems such as modeling cross-talk in signaling networks, designing drug therapies for combinatorial selectivity, and unraveling the role of gene interactions in deleterious phenotypes, where the cost of performing combinatorial screens is exorbitant. Results: We present an automated approach to biological knowledge discovery from PubMed abstracts, suitable for unraveling combinatorial relationships. It involves the systematic application of a `storytelling' algorithm followed by compression of the stories into `novellas.' Given a start and end publication, typically with little or no overlap in content, storytelling identifies a chain of intermediate publications from one to the other, such that neighboring publications have significant content similarity. Stories discovered thus provide an argued approach to relate distant concepts through compositions of related concepts. The chains of links employed by stories are then mined to find frequently reused sub-stories, which can be compressed to yield novellas, or compact templates of connections. We demonstrate a successful application of storytelling and novella finding to modeling combinatorial relationships between introduction of extracellular factors and downstream cellular events. Availability: A story visualizer, suitable for interactive exploration of stories and novellas described in this paper, is available for demo/download at https://bioinformatics.cs.vt.edu/storytelling
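    A minimal sketch of the chaining idea: connect abstracts whose TF-IDF cosine similarity exceeds a threshold, then find a path (a "story") between a start and an end publication. The toy abstracts, the threshold, and the use of a shortest path are illustrative assumptions, not the paper's storytelling algorithm.

```python
# Build a similarity graph over abstracts and extract a chain ("story")
# linking a start publication to an end publication.
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

abstracts = {
    "P1": "extracellular ligand binds membrane receptor",
    "P2": "receptor activation triggers kinase signaling cascade",
    "P3": "kinase cascade phosphorylates transcription factor",
    "P4": "transcription factor drives downstream gene expression",
}
ids = list(abstracts)
tfidf = TfidfVectorizer().fit_transform(abstracts[i] for i in ids)
sim = cosine_similarity(tfidf)

g = nx.Graph()
threshold = 0.05
for i in range(len(ids)):
    for j in range(i + 1, len(ids)):
        if sim[i, j] > threshold:              # neighbours must share enough content
            g.add_edge(ids[i], ids[j], weight=1.0 - sim[i, j])

story = nx.shortest_path(g, "P1", "P4", weight="weight")
print(" -> ".join(story))
```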

    Ontologies in medicinal chemistry: current status and future challenges

    Recent years have seen a dramatic increase in the amount and availability of data in the diverse areas of medicinal chemistry, making it possible to achieve significant advances in fields such as the design, synthesis, and biological evaluation of compounds. However, with this data explosion, the storage, management, and analysis of available data to extract relevant information has become an even more complex task that offers challenging research issues to Artificial Intelligence (AI) scientists. Ontologies have emerged in AI as a key tool to formally represent and semantically organize aspects of the real world. Beyond glossaries or thesauri, ontologies facilitate communication between experts and allow the application of computational techniques to extract useful information from available data. In medicinal chemistry, multiple ontologies have been developed in recent years which contain knowledge about chemical compounds and processes of synthesis of pharmaceutical products. This article reviews the principal standards and ontologies in medicinal chemistry, analyzes their main applications, and suggests future directions.
    Funding: Instituto de Salud Carlos III (FIS-PI10/02180); Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo (209RT0366); Galicia, Consellería de Cultura, Educación e Ordenación Universitaria (CN2012/217, CN2011/034, CN2012/21)

    In Search of a Common Thread: Enhancing the LBD Workflow with a view to its Widespread Applicability

    Literature-Based Discovery (LBD) research focuses on discovering implicit knowledge linkages in existing scientific literature to provide impetus to innovation and research productivity. Despite significant advancements in LBD research, previous studies contain several open problems and shortcomings that are hindering its progress. The overarching goal of this thesis is to address these issues, not only to enhance the discovery component of LBD, but also to shed light on new directions that can further strengthen the existing understanding of the LBD workflow. In accordance with this goal, the thesis aims to enhance the LBD workflow with a view to ensuring its widespread applicability. The goal of widespread applicability is twofold. Firstly, it relates to the adaptability of the proposed solutions to a diverse range of problem settings. These problem settings are not necessarily application areas that are closely related to the LBD context, but could include a wide range of problems beyond the typical scope of LBD, which has traditionally been applied to scientific literature. Adapting the LBD workflow to problems outside the typical scope of LBD is a worthwhile goal, since the intrinsic objective of LBD research, which is discovering novel linkages in text corpora, is valid across a vast range of problem settings. Secondly, the idea of widespread applicability also denotes the capability of the proposed solutions to be executed in new environments. These `new environments' are various academic disciplines (i.e., cross-domain knowledge discovery) and publication languages (i.e., cross-lingual knowledge discovery). The application of LBD models to new environments is timely, since the massive growth of the scientific literature has engendered huge challenges for academics, irrespective of their domain. This thesis is divided into five main research objectives that address the following topics: literature synthesis, the input component, the discovery component, reusability, and portability. The objective of the literature synthesis is to address the gaps in existing LBD reviews by conducting the first systematic literature review. The input component section aims to provide generalised insights on the suitability of various input types in the LBD workflow, focusing on their role and potential impact on the information retrieval cycle of LBD. The discovery component section aims to intermingle two research directions that have been under-investigated in the LBD literature, `modern word embedding techniques' and `temporal dimension', by proposing diachronic semantic inferences. Their potential positive influence on knowledge discovery is verified through both direct and indirect uses. The reusability section aims to present a new, distinct viewpoint on these LBD models by verifying their reusability in a timely application area using a methodical reuse plan. The last section, portability, proposes an interdisciplinary LBD framework that can be applied to new environments. While highly cost-efficient and easily pluggable, this framework also gives rise to a new perspective on knowledge discovery through its generalisable capabilities. Succinctly, this thesis presents novel and distinct viewpoints to accomplish five main research objectives, enhancing the existing understanding of the LBD workflow. The thesis offers new insights which future LBD research could further explore and expand to create more efficient, widely applicable LBD models to enable broader community benefits.
    Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 202
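    A toy sketch of the diachronic inference idea mentioned above: train a separate word embedding per time slice of a corpus and track how the similarity of a concept pair drifts over time. The micro-corpora, the concept pair, and the use of gensim Word2Vec are illustrative assumptions, not the thesis's models or data.

```python
# Track how the embedding similarity of a concept pair changes across time slices.
from gensim.models import Word2Vec

time_slices = {
    "1990s": [["fish", "oil", "reduces", "blood", "viscosity"],
              ["raynaud", "syndrome", "involves", "blood", "viscosity"]],
    "2000s": [["fish", "oil", "studied", "for", "raynaud", "syndrome"],
              ["fish", "oil", "affects", "platelet", "aggregation"]],
}

for period, sentences in time_slices.items():
    model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, seed=1)
    sim = model.wv.similarity("fish", "raynaud")
    print(f"{period}: cosine(fish, raynaud) = {sim:.3f}")
```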