
    Automated extraction of potential migraine biomarkers using a semantic graph

    Problem: Biomedical literature and databases contain important clues for the identification of potential disease biomarkers. However, searching these enormous knowledge reservoirs and integrating findings across heterogeneous sources is costly and difficult. Here we demonstrate how semantically integrated knowledge, extracted from biomedical literature and structured databases, can be used to automatically identify potential migraine biomarkers. Method: We used a knowledge graph containing more than 3.5 million biomedical concepts and 68.4 million relationships. Biochemical compound concepts were filtered and ranked by their potential as biomarkers, based on their connections to a subgraph of migraine-related concepts. The ranked results were evaluated against the results of a systematic literature review performed manually by migraine researchers. Weight points were assigned to these reference compounds to indicate their relative importance. Results: The rankings generated automatically from the knowledge graph were highly consistent with the results of the manual literature review. Of 222 reference compounds, 163 (73%) ranked in the top 2000, accounting for 547 of the 644 (85%) weight points assigned to the reference compounds. For reference compounds that did not rank near the top of the list, we performed an extensive error analysis. Overall performance corresponded to a ROC-AUC of 0.974. Discussion: Semantic knowledge graphs composed of information integrated from multiple and varying sources can assist researchers in identifying potential disease biomarkers.
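
    The ranking step lends itself to a compact illustration. The sketch below is not the paper's implementation: the toy graph, the concept names, and the plain edge-count score are all invented, and the actual system ranks compounds over a graph of millions of concepts.

        # Hypothetical sketch of the ranking idea described above: score each
        # biochemical compound by its connectivity to a migraine-related
        # subgraph. Graph, names, and scoring are illustrative only.
        from collections import defaultdict

        # Toy knowledge graph: concept -> set of directly related concepts.
        GRAPH = defaultdict(set)
        for a, b in [
            ("serotonin", "migraine"), ("serotonin", "aura"),
            ("CGRP", "migraine"), ("CGRP", "trigeminal activation"),
            ("glucose", "metabolism"),
        ]:
            GRAPH[a].add(b)
            GRAPH[b].add(a)

        MIGRAINE_CONCEPTS = {"migraine", "aura", "trigeminal activation"}
        COMPOUNDS = ["serotonin", "CGRP", "glucose"]

        def subgraph_score(compound):
            """Count edges from a compound into the migraine subgraph."""
            return len(GRAPH[compound] & MIGRAINE_CONCEPTS)

        ranked = sorted(COMPOUNDS, key=subgraph_score, reverse=True)
        print(ranked)  # compounds with more migraine-related links rank first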

    Enhancing interoperability: ontology-mapping in an electronic institution

    The automation of B2B processes requires a high level of interoperability between potentially disparate systems. We model such systems using software agents (representing enterprises) that interact through specific protocols. In open environments, interoperability problems are even more challenging. Approaching business automation as a tight integration of processes across businesses may not be desirable, because business relationships can be temporary and dynamic. Furthermore, openness implies heterogeneity of technologies, processes, and even domain ontologies. After discussing these issues, this paper presents, in the context of an Electronic Institution, an ontology-mapping service that enables the automation of negotiation protocols when agents use different ontologies to represent their domain knowledge. The ontology-mapping service employs two approaches to lexical and semantic similarity, namely N-grams and WordNet, and poses few requirements on the ontologies' representation format. Examples illustrate the integration of ontology mapping with automated negotiation. © 2009 Springer Berlin Heidelberg
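
    As a rough illustration of the lexical half of such a service, the sketch below computes character-N-gram (Dice) similarity between concept labels from two ontologies. The WordNet-based semantic half is omitted, and the term lists are invented; this is not the service's actual code.

        # Minimal sketch: map each concept label of one agent's ontology to
        # its most lexically similar label in the other's, using character
        # trigram overlap (Dice coefficient). All terms are illustrative.
        def ngrams(text, n=3):
            text = text.lower()
            return {text[i:i + n] for i in range(len(text) - n + 1)}

        def dice_similarity(a, b, n=3):
            ga, gb = ngrams(a, n), ngrams(b, n)
            if not ga or not gb:
                return 0.0
            return 2 * len(ga & gb) / (len(ga) + len(gb))

        ours = ["deliveryDate", "unitPrice"]
        theirs = ["date_of_delivery", "price_per_unit", "quantity"]
        for term in ours:
            best = max(theirs, key=lambda t: dice_similarity(term, t))
            print(term, "->", best, round(dice_similarity(term, best), 2))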

    An analysis-ready and quality controlled resource for pediatric brain white-matter research

    We created a set of resources to enable research based on openly available diffusion MRI (dMRI) data from the Healthy Brain Network (HBN) study. First, we curated the HBN dMRI data (N = 2747) into the Brain Imaging Data Structure and preprocessed it according to best practices, including denoising and correcting for motion effects, susceptibility-related distortions, and eddy currents. The preprocessed, analysis-ready data were made openly available. Data quality plays a key role in the analysis of dMRI. To optimize quality control (QC) and scale it to this large dataset, we trained a neural network on a combination of a small data subset scored by experts and a larger set scored by community scientists. The network performs QC highly concordant with that of experts on a held-out set (ROC-AUC = 0.947). Further analysis demonstrates that the network relies on image features relevant to QC. Altogether, this work both delivers resources to advance transdiagnostic research in brain connectivity and pediatric mental health, and establishes a novel paradigm for automated QC of large datasets.
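
    For readers unfamiliar with the metric, the sketch below shows how a held-out ROC-AUC like the one reported can be computed from QC labels and classifier scores via the Mann-Whitney formulation. The labels and scores are made up; this is not the study's evaluation code.

        # Illustrative ROC-AUC: the probability that a randomly chosen
        # positive example scores higher than a randomly chosen negative
        # one (ties count half).
        def roc_auc(labels, scores):
            pos = [s for y, s in zip(labels, scores) if y == 1]
            neg = [s for y, s in zip(labels, scores) if y == 0]
            wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
            return wins / (len(pos) * len(neg))

        expert_labels = [1, 1, 0, 1, 0, 0]        # 1 = scan passes QC
        network_scores = [0.9, 0.8, 0.3, 0.6, 0.4, 0.1]
        print(roc_auc(expert_labels, network_scores))  # 1.0 on this toy data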

    Facilitating the development of controlled vocabularies for metabolomics technologies with text mining

    BACKGROUND: Many bioinformatics applications rely on controlled vocabularies or ontologies to consistently interpret and seamlessly integrate information scattered across public resources. Experimental data sets from metabolomics studies need to be integrated with one another, but also with data produced by other types of omics studies in the spirit of systems biology, hence the pressing need for vocabularies and ontologies in metabolomics. However, constructing these resources manually is time-consuming and nontrivial. RESULTS: We describe a methodology for the rapid development of controlled vocabularies, a study originally motivated by the need for vocabularies describing metabolomics technologies. We present case studies involving two controlled vocabularies (for nuclear magnetic resonance spectroscopy and gas chromatography) whose development is currently underway as part of the Metabolomics Standards Initiative. The initial vocabularies were compiled manually and comprised 243 and 152 terms, respectively; 5,699 and 2,612 new terms were then acquired automatically from the literature. Analysis of the results showed that full-text articles (especially the Materials and Methods sections), rather than abstracts, are the major source of technology-specific terms. CONCLUSIONS: We suggest a text mining method for efficient corpus-based term acquisition as a way of rapidly expanding a set of controlled vocabularies with the terms used in the scientific literature. We adopted an integrative approach, combining relatively generic software and data resources for time- and cost-effective development of a text mining tool that expands controlled vocabularies across various domains, as a practical alternative to both manual term collection and tailor-made named entity recognition methods.
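
    One simple way to realize corpus-based term acquisition of this kind is to rank candidate terms by how much more frequent they are in a domain corpus (e.g., Materials and Methods sections) than in a background corpus. The sketch below illustrates that idea with invented two-sentence corpora and add-one smoothing; it is not the tool described in the paper.

        # Hedged sketch: rank candidate terms by the ratio of their smoothed
        # relative frequency in a domain corpus vs. a background corpus.
        from collections import Counter
        import re

        def tokenize(text):
            return re.findall(r"[a-z][a-z-]+", text.lower())

        domain = tokenize(
            "NMR spectra were acquired with a standard pulse sequence; "
            "chemical shift referencing used the pulse sequence described")
        background = tokenize(
            "the results were discussed and the study shows the results agree")

        dom, bg = Counter(domain), Counter(background)
        n_dom, n_bg = sum(dom.values()), sum(bg.values())

        def domain_relevance(term):
            """Ratio of add-one-smoothed relative frequencies."""
            return ((dom[term] + 1) / (n_dom + 1)) / ((bg[term] + 1) / (n_bg + 1))

        candidates = sorted(set(domain), key=domain_relevance, reverse=True)
        print(candidates[:5])  # technology-specific terms should rank first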

    ABSTRACT

    Agent communication languages such as ACL and KQML provide a standard for agent communication. For the protocol and the language used in the communication, several standards are available. This is not the case for the ontology used in the communication. The ontology depends on the subject of the communication. Since the number of subjects is almost infinite and since the concepts used for a subject can be described by different ontologies, the development of generally accepted standards will take a long time. This lack of standardization, which hampers communication and collaboration between agents, is known as the interoperability problem. To overcome the interoperability problem, an approach that enables agents to learn a mapping between their ontologies will be proposed
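
    One plausible shape for such a learning approach is sketched below. The evidence-counting scheme and the success/failure feedback signal are invented for illustration and are not the proposed method: an agent reinforces candidate concept mappings that lead to successful exchanges and penalizes those that do not.

        # Speculative sketch: learn a concept mapping from interaction feedback.
        from collections import defaultdict

        class MappingLearner:
            def __init__(self):
                # (own_concept, other_concept) -> accumulated evidence
                self.evidence = defaultdict(float)

            def observe(self, own, other, success):
                """Reinforce or penalize a candidate mapping after an exchange."""
                self.evidence[(own, other)] += 1.0 if success else -1.0

            def translate(self, own):
                """Return the best-supported counterpart concept, if any."""
                candidates = {o: e for (s, o), e in self.evidence.items()
                              if s == own and e > 0}
                return max(candidates, key=candidates.get) if candidates else None

        learner = MappingLearner()
        learner.observe("price", "cost", success=True)
        learner.observe("price", "quantity", success=False)
        print(learner.translate("price"))  # -> "cost"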