
    EXACT2: the semantics of biomedical protocols

    © 2014 Soldatova et al.; licensee BioMed Central. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. This article has been made available through the Brunel Open Access Publishing Fund.
    Background: The reliability and reproducibility of experimental procedures is a cornerstone of scientific practice. There is a pressing technological need for the better representation of biomedical protocols to enable other agents (human or machine) to better reproduce results. A framework that ensures that all information required for the replication of experimental protocols is captured is essential to achieve reproducibility.
    Methods: We have developed the ontology EXACT2 (EXperimental ACTions), which is designed to capture the full semantics of biomedical protocols required for their reproducibility. To construct EXACT2 we manually inspected hundreds of published and commercial biomedical protocols from several areas of biomedicine. After establishing a clear pattern for extracting the required information, we utilized text-mining tools to translate the protocols into a machine-amenable format. We have verified the utility of EXACT2 through the successful processing of previously ‘unseen’ (not used for the construction of EXACT2) protocols.
    Results: The paper reports on a fundamentally new version, EXACT2, that supports the semantically-defined representation of biomedical protocols. The ability of EXACT2 to capture the semantics of biomedical procedures was verified through a text-mining use case. In this use case, EXACT2 is used as a reference model for text-mining tools to identify terms pertinent to experimental actions, and their properties, in biomedical protocols expressed in natural language. An EXACT2-based framework for the translation of biomedical protocols to a machine-amenable format is proposed.
    Conclusions: The EXACT2 ontology is sufficient to record, in a machine-processable form, the essential information about biomedical protocols. EXACT2 defines explicit semantics of experimental actions, and can be used by various computer applications. It can serve as a reference model for the translation of biomedical protocols in natural language into a semantically-defined format.
    This work has been partially funded by the Brunel University BRIEF award and a grant from Occams Resources.
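    The text-mining use case in this abstract (identifying experimental actions and their properties in natural-language protocols) can be illustrated with a minimal Python sketch. It is not the authors' tooling: the vocabulary `ACTION_TERMS`, the unit list in `QUANTITY_PATTERN`, and the function `annotate_step` are hypothetical stand-ins, and none of the terms are taken from the EXACT2 ontology itself.

```python
import re

# Hypothetical mini-vocabulary of action terms; NOT drawn from EXACT2.
ACTION_TERMS = {"centrifuge", "incubate", "add", "mix", "wash"}

# Assumed pattern for simple "<number> <unit>" properties of an action.
QUANTITY_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(rpm|min|ml)\b", re.IGNORECASE)


def annotate_step(sentence):
    """Tag action terms and numeric properties in one protocol sentence."""
    # Lower-case alphabetic tokens are matched against the vocabulary.
    tokens = re.findall(r"[A-Za-z]+", sentence.lower())
    actions = [token for token in tokens if token in ACTION_TERMS]
    # Each match yields a (value, unit) pair, e.g. (3000.0, "rpm").
    properties = [
        (float(value), unit)
        for value, unit in QUANTITY_PATTERN.findall(sentence)
    ]
    return {"actions": actions, "properties": properties}


if __name__ == "__main__":
    print(annotate_step("Centrifuge the sample at 3000 rpm for 10 min."))
```

    A dictionary lookup like this only shows the shape of the output (actions plus properties). In the approach the abstract describes, an ontology serves as the reference model in place of a hand-written term list.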

    Community-driven development for computational biology at Sprints, Hackathons and Codefests

    Background: Computational biology comprises a wide range of technologies and approaches. Multiple technologies can be combined to create more powerful workflows if the individuals contributing the data or providing tools for its interpretation can find mutual understanding and consensus. Much conversation and joint investigation are required in order to identify and implement the best approaches. Traditionally, scientific conferences feature talks presenting novel technologies or insights, followed up by informal discussions during coffee breaks. In multi-institution collaborations, in order to reach agreement on implementation details or to transfer deeper insights into a technology and practical skills, a representative of one group typically visits the other. However, this does not scale well when the number of technologies or research groups is large. Conferences have responded to this issue by introducing Birds-of-a-Feather (BoF) sessions, which offer an opportunity for individuals with common interests to intensify their interaction. However, parallel BoF sessions often make it hard for participants to join multiple BoFs and find common ground between the different technologies, and BoFs are generally too short to allow time for participants to program together.
    Results: This report summarises our experience with computational biology Codefests, Hackathons and Sprints, which are interactive developer meetings. They are structured to reduce the limitations of traditional scientific meetings described above by strengthening the interaction among peers and letting the participants determine the schedule and topics. These meetings are commonly run as loosely scheduled "unconferences" (self-organized identification of participants and topics for meetings) over at least two days, with early introductory talks to welcome and organize contributors, followed by intensive collaborative coding sessions. We summarise some prominent achievements of those meetings and describe differences in how these are organised, how their audience is addressed, and their outreach to their respective communities.
    Conclusions: Hackathons, Codefests and Sprints share a stimulating atmosphere that encourages participants to jointly brainstorm and tackle problems of shared interest in a self-driven proactive environment, as well as providing an opportunity for new participants to get involved in collaborative projects.

    A Personal Research Agent for Semantic Knowledge Management of Scientific Literature

    The unprecedented rate of scientific publications is a major threat to the productivity of knowledge workers, who rely on scrutinizing the latest scientific discoveries for their daily tasks. Online digital libraries, academic publishing databases and open access repositories grant access to a plethora of information that can overwhelm a researcher who is looking to obtain fine-grained knowledge relevant for her task at hand. This overload of information has encouraged researchers from various disciplines to look for new approaches in extracting, organizing, and managing knowledge from the immense amount of available literature in ever-growing repositories. In this dissertation, we introduce a Personal Research Agent that can help scientists in discovering, reading and learning from scientific documents, primarily in the computer science domain. We demonstrate how a confluence of techniques from the Natural Language Processing and Semantic Web domains can construct a semantically-rich knowledge base, based on an inter-connected graph of scholarly artifacts, effectively transforming scientific literature from written content in isolation into a queryable web of knowledge, suitable for machine interpretation. The challenges of creating an intelligent research agent are manifold: The agent's knowledge base, analogous to its 'brain', must contain accurate information about the knowledge 'stored' in documents. It also needs to know about its end-users' tasks and background knowledge. In our work, we present a methodology to extract the rhetorical structure (e.g., claims and contributions) of scholarly documents. We enhance our approach with entity linking techniques that allow us to connect the documents with the Linked Open Data (LOD) cloud, in order to enrich them with additional information from the web of open data. Furthermore, we devise a novel approach for automatic profiling of scholarly users, thereby enabling the agent to personalize its services based on a user's background knowledge and interests. We demonstrate how we can automatically create a semantic vector-based representation of the documents and user profiles and utilize them to efficiently detect similar entities in the knowledge base. Finally, as part of our contributions, we present a complete architecture providing an end-to-end workflow for the agent to exploit the opportunities of linking a formal model of scholarly users and scientific publications.
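    The idea of comparing vector representations of documents and user profiles to find similar entities can be sketched in a few lines of Python. The abstract does not say which similarity measure or vector construction the dissertation uses. Cosine similarity, the toy three-dimensional vectors, and the names `cosine_similarity` and `rank_documents` are assumptions made purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(profile, documents):
    """Sort (doc_id, score) pairs by similarity to the profile, best first."""
    scored = [
        (doc_id, cosine_similarity(profile, vector))
        for doc_id, vector in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical vectors; each dimension stands for one topic entity.
    user_profile = [1.0, 0.0, 1.0]
    docs = {"doc_a": [1.0, 0.0, 1.0], "doc_b": [0.0, 1.0, 0.0]}
    for doc_id, score in rank_documents(user_profile, docs):
        print(doc_id, round(score, 3))
```

    Ranking by a single score keeps the sketch small. A real knowledge base would need an index to make such lookups efficient at scale.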