Search CORE

5,678 research outputs found

Mining semantic rules for optimizing transport assignments in hospitals

Author: Bonte Pieter
De Turck Filip
Hoogstoel Eveline
Ongenae Femke
Publication venue
Publication date: 01/01/2016
Field of study

Efficient chain structure for high-utility sequential pattern mining

Author: Djenouri Youcef
Fournier-Viger Philippe
Li Yuanfa
Lin Jerry Chun-Wei
Zhang Ji
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

High-utility sequential pattern mining (HUSPM) is an emerging topic in data mining, which considers both utility and sequence factors to derive the set of high-utility sequential patterns (HUSPs) from the quantitative databases. Several works have been presented to reduce the computational cost by variants of pruning strategies. In this paper, we present an efficient sequence-utility (SU)-chain structure, which can be used to store more relevant information to improve mining performance. Based on the SU-Chain structure, the existing pruning strategies can also be utilized here to early prune the unpromising candidates and obtain the satisfied HUSPs. Experiments are then compared with the state-of-the-art HUSPM algorithms and the results showed that the SU-Chain-based model can efficiently improve the efficiency performance than the existing HUSPM algorithms in terms of runtime and number of the determined candidates

SINTEF Open

NORA - Norwegian Open Research Archives

University of Southern Queensland ePrints

I-prune: Item selection for associative classification

Author: Baralis
Coenen
Coenen
Guyon
Hall
Li
Quinlan
Rak
Tan
Wang
Wang
Zaïane
Publication venue: John Wiley & Sons, Inc.
Publication date: 01/01/2012
Field of study

Associative classification is characterized by accurate models and high model generation time. Most time is spent in extracting and postprocessing a large set of irrelevant rules, which are eventually pruned.We propose I-prune, an item-pruning approach that selects uninteresting items by means of an interestingness measure and prunes them as soon as they are detected. Thus, the number of extracted rules is reduced and model generation time decreases correspondingly. A wide set of experiments on real and synthetic data sets has been performed to evaluate I-prune and select the appropriate interestingness measure. The experimental results show that I-prune allows a significant reduction in model generation time, while increasing (or at worst preserving) model accuracy. Experimental evaluation also points to the chi-square measure as the most effective interestingness measure for item pruning

Crossref

Archivio istituzionale della ricerca - Politecnico di Milano

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Generating High Precision Classification Rules for Screening of Irrelevant Studies in Systematic Review Literature Searches

Author: Petersen Henry
Publication venue: Faculty of Engineering and Information Technologies, School of Information Technologies
Publication date: 01/01/2016
Field of study

Systematic reviews aim to produce repeatable, unbiased, and comprehensive answers to clinical questions. Systematic reviews are an essential component of modern evidence based medicine, however due to the risks of omitting relevant research they are highly time consuming to create and are largely conducted manually. This thesis presents a novel framework for partial automation of systematic review literature searches. We exploit the ubiquitous multi-stage screening process by training the classifier using annotations made by reviewers in previous screening stages. Our approach has the benefit of integrating seamlessly with the existing screening process, minimising disruption to users. Ideally, classification models for systematic reviews should be easily interpretable by users. We propose a novel, rule based algorithm for use with our framework. A new approach for identifying redundant associations when generating rules is also presented. The proposed approach to redundancy seeks to both exclude redundant specialisations of existing rules (those with additional terms in their antecedent), as well as redundant generalisations (those with fewer terms in their antecedent). We demonstrate the ability of the proposed approach to improve the usability of the generated rules. The proposed rule based algorithm is evaluated by simulated application to several existing systematic reviews. Workload savings of up to 10% are demonstrated. There is an increasing demand for systematic reviews related to a variety of clinical disciplines, such as diagnosis. We examine reviews of diagnosis and contrast them against more traditional systematic reviews of treatment. We demonstrate existing challenges such as target class heterogeneity and high data imbalance are even more pronounced for this class of reviews. The described algorithm accounts for this by seeking to label subsets of non-relevant studies with high precision, avoiding the need to generate a high recall model of the minority class

Sydney eScholarship