160,683 research outputs found

    CASP-DM: Context Aware Standard Process for Data Mining

    Get PDF
    We propose an extension of the Cross Industry Standard Process for Data Mining (CRISPDM) which addresses specific challenges of machine learning and data mining for context and model reuse handling. This new general context-aware process model is mapped with CRISP-DM reference model proposing some new or enhanced outputs

    An agile business process and practice meta-model

    Get PDF
    Business Process Management (BPM) encompasses the discovery, modelling, monitoring, analysis and improvement of business processes. Limitations of traditional BPM approaches in addressing changes in business requirements have resulted in a number of agile BPM approaches that seek to accelerate the redesign of business process models. Meta-models are a key BPM feature that reduce the ambiguity of business process models. This paper describes a meta-model supporting the agile version of the Business Process and Practice Alignment Methodology (BPPAM) for business process improvement, which captures process information from actual work practices. The ability of the meta-model to achieve business process agility is discussed and compared with other agile meta-models, based on definitions of business process flexibility and agility found in the literature. (C) 2017 The Authors. Published by Elsevier B.V

    Automated schema matching techniques: an exploratory study

    Get PDF
    Manual schema matching is a problem for many database applications that use multiple data sources including data warehousing and e-commerce applications. Current research attempts to address this problem by developing algorithms to automate aspects of the schema-matching task. In this paper, an approach using an external dictionary facilitates automated discovery of the semantic meaning of database schema terms. An experimental study was conducted to evaluate the performance and accuracy of five schema-matching techniques with the proposed approach, called SemMA. The proposed approach and results are compared with two existing semi-automated schema-matching approaches and suggestions for future research are made

    Integrating and Ranking Uncertain Scientific Data

    Get PDF
    Mediator-based data integration systems resolve exploratory queries by joining data elements across sources. In the presence of uncertainties, such multiple expansions can quickly lead to spurious connections and incorrect results. The BioRank project investigates formalisms for modeling uncertainty during scientific data integration and for ranking uncertain query results. Our motivating application is protein function prediction. In this paper we show that: (i) explicit modeling of uncertainties as probabilities increases our ability to predict less-known or previously unknown functions (though it does not improve predicting the well-known). This suggests that probabilistic uncertainty models offer utility for scientific knowledge discovery; (ii) small perturbations in the input probabilities tend to produce only minor changes in the quality of our result rankings. This suggests that our methods are robust against slight variations in the way uncertainties are transformed into probabilities; and (iii) several techniques allow us to evaluate our probabilistic rankings efficiently. This suggests that probabilistic query evaluation is not as hard for real-world problems as theory indicates

    From Sensor to Observation Web with Environmental Enablers in the Future Internet

    Get PDF
    This paper outlines the grand challenges in global sustainability research and the objectives of the FP7 Future Internet PPP program within the Digital Agenda for Europe. Large user communities are generating significant amounts of valuable environmental observations at local and regional scales using the devices and services of the Future Internet. These communities’ environmental observations represent a wealth of information which is currently hardly used or used only in isolation and therefore in need of integration with other information sources. Indeed, this very integration will lead to a paradigm shift from a mere Sensor Web to an Observation Web with semantically enriched content emanating from sensors, environmental simulations and citizens. The paper also describes the research challenges to realize the Observation Web and the associated environmental enablers for the Future Internet. Such an environmental enabler could for instance be an electronic sensing device, a web-service application, or even a social networking group affording or facilitating the capability of the Future Internet applications to consume, produce, and use environmental observations in cross-domain applications. The term ?envirofied? Future Internet is coined to describe this overall target that forms a cornerstone of work in the Environmental Usage Area within the Future Internet PPP program. Relevant trends described in the paper are the usage of ubiquitous sensors (anywhere), the provision and generation of information by citizens, and the convergence of real and virtual realities to convey understanding of environmental observations. The paper addresses the technical challenges in the Environmental Usage Area and the need for designing multi-style service oriented architecture. Key topics are the mapping of requirements to capabilities, providing scalability and robustness with implementing context aware information retrieval. Another essential research topic is handling data fusion and model based computation, and the related propagation of information uncertainty. Approaches to security, standardization and harmonization, all essential for sustainable solutions, are summarized from the perspective of the Environmental Usage Area. The paper concludes with an overview of emerging, high impact applications in the environmental areas concerning land ecosystems (biodiversity), air quality (atmospheric conditions) and water ecosystems (marine asset management)
    • 

    corecore