    An Incremental Learning Method to Support the Annotation of Workflows with Data-to-Data Relations

    Workflow formalisations are often focused on the representation of a process with the primary objective of supporting execution. However, there are scenarios where what needs to be represented is the effect of the process on the data artefacts involved, for example when reasoning over the corresponding data policies. This can be achieved by annotating the workflow with the semantic relations that occur between these data artefacts. However, manually producing such annotations is difficult and time consuming. In this paper we introduce a method based on recommendations to support users in this task. Our approach is centred on an incremental association rule mining technique that compensates for the cold-start problem caused by the lack of a training set of annotated workflows. We discuss the implementation of a tool relying on this approach, and how its application to an existing repository of workflows effectively enables the generation of such annotations.
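
    As a rough illustration of the recommendation idea (not the paper's actual implementation; all names below are hypothetical), the sketch maintains association-rule counts incrementally: each user-confirmed annotation updates the rule base, so recommendations improve without a pre-existing training set.

```python
from collections import defaultdict

class IncrementalRuleRecommender:
    """Toy incremental association-rule miner: counts how often a
    data-to-data relation co-occurs with a workflow-step signature
    and updates the counts as users confirm annotations."""

    def __init__(self, min_confidence=0.5):
        self.min_confidence = min_confidence
        self.antecedent_counts = defaultdict(int)   # signature -> times seen
        self.rule_counts = defaultdict(int)         # (signature, relation) -> times seen

    def confirm(self, step_signature, relation):
        """Fold one user-confirmed annotation into the rule base."""
        self.antecedent_counts[step_signature] += 1
        self.rule_counts[(step_signature, relation)] += 1

    def recommend(self, step_signature):
        """Rank candidate relations for a step by rule confidence."""
        total = self.antecedent_counts.get(step_signature, 0)
        if total == 0:
            return []                               # cold start: nothing known yet
        scored = [(relation, count / total)
                  for (sig, relation), count in self.rule_counts.items()
                  if sig == step_signature and count / total >= self.min_confidence]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)

recommender = IncrementalRuleRecommender()
recommender.confirm(("copy", "file"), "sameContentAs")
recommender.confirm(("copy", "file"), "sameContentAs")
recommender.confirm(("copy", "file"), "derivedFrom")
print(recommender.recommend(("copy", "file")))      # [('sameContentAs', 0.666...)]
```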

    Two-phased knowledge formalisation for hydrometallurgical gold ore process recommendation and validation

    This paper describes an approach to externalising and formalising the expert knowledge involved in the design and evaluation of hydrometallurgical process chains for gold ore treatment. The objective was to create a case-based reasoning application for recommending and validating a treatment process for gold ores. We describe a twofold approach. First, formalising human expert knowledge about gold mining situations enables the retrieval of similar mining contexts and their respective process chains, based on prospection data gathered from a potential gold mining site. Secondly, empirical knowledge on hydrometallurgical treatments is formalised. This enabled us to evaluate and, where needed, redesign the process chain recommended by the first part of our approach. The main problems with the formalisation of knowledge in the domain of gold ore refinement are the diversity and number of parameters used in the literature and by experts to describe a mining context. We demonstrate how similarity knowledge was used to formalise literature knowledge. The evaluation of data gathered from experiments with an initial prototype workflow recommender, Auric Adviser, provides promising results.
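
    A minimal sketch of the similarity-based retrieval that case-based reasoning relies on, assuming weighted numeric parameters; the parameters, weights and cases below are invented for illustration and are not taken from Auric Adviser.

```python
def local_similarity(a, b, value_range):
    """Similarity of two numeric parameter values, scaled to [0, 1]."""
    return 1.0 - abs(a - b) / value_range

def case_similarity(query, case, weights, ranges):
    """Weighted average of per-parameter similarities, a common
    amalgamation function in case-based reasoning."""
    total = sum(weights.values())
    return sum(weights[p] * local_similarity(query[p], case[p], ranges[p])
               for p in weights) / total

# Hypothetical prospection parameters and past cases.
ranges  = {"gold_grade_g_per_t": 50.0, "sulphide_content_pct": 100.0}
weights = {"gold_grade_g_per_t": 0.7,  "sulphide_content_pct": 0.3}
cases = [
    {"gold_grade_g_per_t": 4.2,  "sulphide_content_pct": 12.0, "process": "cyanidation"},
    {"gold_grade_g_per_t": 35.0, "sulphide_content_pct": 60.0, "process": "pressure oxidation"},
]
query = {"gold_grade_g_per_t": 5.0, "sulphide_content_pct": 15.0}

best = max(cases, key=lambda c: case_similarity(query, c, weights, ranges))
print(best["process"])   # process chain of the most similar known context
```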

    Ontology of core data mining entities

    In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising a specification, an implementation and an application layer. It provides a representational framework for the description of mining structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications and use cases, such as the semantic annotation of data mining algorithms, datasets and results; the annotation of QSAR studies in the context of drug discovery investigations; and the disambiguation of terms in text mining. The ontology has been thoroughly assessed following the practices in ontology engineering, is fully interoperable with many domain resources and is easy to extend.
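
    As a toy rendering of the three-layer structure in plain Python (the class and field names are illustrative, not OntoDM-core's actual identifiers):

```python
from dataclasses import dataclass

@dataclass
class AlgorithmSpecification:      # specification layer: what the algorithm is
    name: str
    task: str                      # e.g. a predictive modelling task
    data_type: str                 # e.g. feature vectors

@dataclass
class AlgorithmImplementation:     # implementation layer: a concrete executable
    spec: AlgorithmSpecification
    software: str
    version: str

@dataclass
class AlgorithmApplication:        # application layer: a run on concrete data
    implementation: AlgorithmImplementation
    dataset: str
    output: str                    # the resulting generalization, e.g. a model

spec = AlgorithmSpecification("C4.5", "predictive modelling", "feature vectors")
impl = AlgorithmImplementation(spec, "Weka J48", "3.8")
run  = AlgorithmApplication(impl, "iris.arff", "decision tree model")
print(run.implementation.spec.name)   # the specification a given run traces back to
```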

    Integration of BPM systems

    New technologies have emerged to support the global economy, where, for instance, suppliers, manufacturers and retailers work together to minimise cost and maximise efficiency. One of the technologies that has become a buzzword for many businesses is business process management, or BPM. A business process comprises activities and tasks, the resources required to perform each task, and the business rules linking these activities and tasks. The tasks may be performed by human and/or machine actors. Workflow provides a way of describing the order of execution and the dependency relationships between the constituent activities of short- or long-running processes. Workflow allows businesses to capture not only the information but also the processes that transform the information - the process asset (Koulopoulos, T. M., 1995). Applications which involve automated, human-centric and collaborative processes across organisations are inherently different from one organisation to another. Even within the same organisation, applications are adapted over time, as ongoing change to business processes is the norm in today's dynamic business environment. The major difference lies in the specifics of the business processes, which change rapidly to match the way in which businesses operate. In this chapter we introduce and discuss Business Process Management (BPM) with a focus on the integration of heterogeneous BPM systems across multiple organisations. We identify the problems and the main challenges, not only with regard to technologies but also in the social and cultural context. We also discuss the issues that have arisen in our efforts to find solutions.
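
    As a minimal illustration of this view of a business process (the tasks, actors and dependencies below are invented), a workflow can be modelled as a dependency graph from which a valid execution order is derived:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Toy order-handling process: each task names its actor type and the
# tasks it depends on (the "business rules" linking activities).
tasks = {
    "receive_order": {"actor": "machine", "depends_on": []},
    "check_credit":  {"actor": "machine", "depends_on": ["receive_order"]},
    "approve_order": {"actor": "human",   "depends_on": ["check_credit"]},
    "ship_order":    {"actor": "machine", "depends_on": ["approve_order"]},
}

order = TopologicalSorter({t: set(v["depends_on"]) for t, v in tasks.items()})
for task in order.static_order():            # a valid execution order
    print(f"{task} ({tasks[task]['actor']})")
```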

    EXACT2: the semantics of biomedical protocols

    Background: The reliability and reproducibility of experimental procedures is a cornerstone of scientific practice. There is a pressing technological need for the better representation of biomedical protocols to enable other agents (human or machine) to better reproduce results. A framework that ensures that all information required for the replication of experimental protocols is captured is essential to achieving reproducibility. Methods: We have developed the ontology EXACT2 (EXperimental ACTions), designed to capture the full semantics of biomedical protocols required for their reproducibility. To construct EXACT2 we manually inspected hundreds of published and commercial biomedical protocols from several areas of biomedicine. After establishing a clear pattern for extracting the required information, we utilized text-mining tools to translate the protocols into a machine-amenable format. We have verified the utility of EXACT2 through the successful processing of previously ‘unseen’ protocols (ones not used for the construction of EXACT2). Results: The paper reports on a fundamentally new version of EXACT2 that supports the semantically-defined representation of biomedical protocols. The ability of EXACT2 to capture the semantics of biomedical procedures was verified through a text mining use case, in which EXACT2 is used as a reference model for text mining tools to identify terms pertinent to experimental actions, and their properties, in biomedical protocols expressed in natural language. An EXACT2-based framework for the translation of biomedical protocols to a machine-amenable format is proposed. Conclusions: The EXACT2 ontology is sufficient to record, in a machine-processable form, the essential information about biomedical protocols. EXACT2 defines explicit semantics of experimental actions and can be used by various computer applications. It can serve as a reference model for the translation of biomedical protocols in natural language into a semantically-defined format. This work has been partially funded by the Brunel University BRIEF award and a grant from Occams Resources.
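
    The sketch below shows the kind of machine-amenable step record such a framework could target; the field names are illustrative, not EXACT2's actual class labels.

```python
# One protocol step with its experimental action and explicit properties.
protocol_step = {
    "action": "centrifuge",                # the experimental action term
    "object": "cell suspension",           # what the action is applied to
    "properties": {                        # properties a replicator needs
        "speed":       {"value": 13000, "unit": "rpm"},
        "duration":    {"value": 5,     "unit": "min"},
        "temperature": {"value": 4,     "unit": "degC"},
    },
}

def missing_properties(step, required=("speed", "duration", "temperature")):
    """Flag required properties that the protocol text leaves implicit."""
    return [p for p in required if p not in step["properties"]]

print(missing_properties(protocol_step))   # [] -> the step is fully specified
```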

    PRESISTANT: Learning based assistant for data pre-processing

    Data pre-processing is one of the most time-consuming and consequential steps in a data analysis process (e.g., a classification task). A given pre-processing operator (e.g., a transformation) can have a positive, negative or zero impact on the final result of the analysis. Expert users have the knowledge required to find the right pre-processing operators. Non-experts, however, are overwhelmed by the number of available pre-processing operators, and it is challenging for them to find operators that would positively impact their analysis (e.g., increase the predictive accuracy of a classifier). Existing solutions either assume that users have expert knowledge, or they recommend pre-processing operators that are only "syntactically" applicable to a dataset, without taking into account their impact on the final analysis. In this work, we aim to assist non-expert users by recommending data pre-processing operators ranked according to their impact on the final analysis. We developed a tool, PRESISTANT, that uses Random Forests to learn the impact of pre-processing operators on the performance (e.g., predictive accuracy) of five classification algorithms: J48, Naive Bayes, PART, Logistic Regression, and Nearest Neighbor. Extensive evaluations of the recommendations provided by our tool show that PRESISTANT can effectively help non-experts achieve improved results in their analytical tasks.
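
    A hedged sketch of the underlying meta-learning idea, using scikit-learn's Random Forest on synthetic meta-data (the meta-features, operators and data below are invented and not PRESISTANT's actual code): train a regressor to predict the accuracy change each operator yields on a dataset described by meta-features, then rank operators by that prediction.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
operators = ["standardize", "discretize", "log_transform"]

# Training meta-data: [n_rows, n_features, class_entropy, operator_id],
# with the observed accuracy delta each operator produced as the target.
X_train = rng.random((300, 4))
y_train = rng.normal(0.0, 0.05, 300)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Rank the operators for one new dataset's meta-features.
meta = [0.4, 0.7, 0.5]
candidates = np.array([meta + [i / len(operators)] for i in range(len(operators))])
ranked = sorted(zip(operators, model.predict(candidates)),
                key=lambda pair: pair[1], reverse=True)
print(ranked)   # operators ordered by predicted impact on accuracy
```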