59,709 research outputs found

Robust and cost-effective approach for discovering action rules

Authors: Kalanat N; Saraee MH; Shamsinejad P
Publication venue: 'IACSIT Press'
Publication date: 01/01/2011

The main goal of Knowledge Discovery in Databases is to find interesting and usable patterns that are meaningful in their domain. Actionable Knowledge Discovery came into existence as a direct response to the need for more usable patterns, called actionable patterns. Traditional data mining algorithms are often confined to delivering frequent patterns and fall short of suggesting how to make these patterns actionable. In this scenario the users are expected to act, but they are not advised about what to do with the delivered patterns in order to make them usable. In this paper, we present an automated approach that focuses not only on creating rules but also on making the discovered rules actionable. Up to now, few works have been reported in this field, and they lack comprehensibility to the user, overlook cost, and do not provide rule generality. Here we attempt to present a method for resolving these issues. In this paper, the CEARDM method is proposed to discover cost-effective action rules from data. These rules offer cost-effective changes for transferring low-profit instances to higher-profit ones. We also propose an idea for improving the CEARDM method.

University of Salford Institutional Repository

Crossref

A probabilistic ontology-based platform for self-learning context-aware healthcare applications

Authors: Claeys Maxim; De Turck Filip; Dhaene Tom; Dupont Thomas; Kerckhove Wannes; Ongenae Femke; Verhoeve P
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013

Ghent University Academic Bibliography

Rough sets theory and uncertainty into information system

Author: Jirava Pavel
Publication venue: 'Univerzita Pardubice'
Publication date: 01/01/2006

This article focuses on the rough sets approach to expressing uncertainty in information systems. We assume that the data are presented in a decision table and that some attribute values are lost. First the theoretical background is described, and then computations on real-life data are presented. In the computations we work with uncertainty arising from missing attribute values.

Digital Library of the University of Pardubice

Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline

Authors: A Arcuri; AL Rector; AM Wood; AS Glas; B Kulis; C Cortes; C Sammut; CC Diamond; CD Kidd; CR MacIntyre; DP Lewis; E Koumoundouros; E Rahm; EM Knorr; ES Fisher; GE Box; GM Weber; H Carter; H He; H Meyer; H Quan; HH Hoos; I Yoo; J Andreu-Perez; J Fan; J Zhao; JD Lafferty; JM Bland; JW Graham; K Lange; KP Murphy; LA King; LM Collins; M Azarm-Daigle; M Kantardzic; M Sokolova; MA Stoto; N Oreskes; PB Jensen; PK Lindenauer; PM Visscher; RJ Little; V López; V Sessions; VN Vapnik; W Raghupathi; Y Luo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/01/2018

From medical charts to national census, healthcare has traditionally operated under a paper-based paradigm. However, the past decade has marked a long and arduous transformation bringing healthcare into the digital age. Ranging from electronic health records, to digitized imaging and laboratory reports, to public health datasets, healthcare today generates an incredible amount of digital information. Such a wealth of data presents an exciting opportunity for integrated machine learning solutions to address problems across multiple facets of healthcare practice and administration. Unfortunately, the ability to derive accurate and informative insights requires more than the ability to execute machine learning models. Rather, a deeper understanding of the data on which the models are run is imperative for their success. While a significant effort has been undertaken to develop models able to process the volume of data obtained during the analysis of millions of digitalized patient records, it is important to remember that volume represents only one aspect of the data. In fact, because it draws on an increasingly diverse set of sources, healthcare data presents an incredibly complex set of attributes that must be accounted for throughout the machine learning pipeline. This chapter focuses on highlighting such challenges, and is broken down into three distinct components, each representing a phase of the pipeline. We begin with attributes of the data accounted for during preprocessing, then move to considerations during model building, and end with challenges to the interpretation of model output. For each component, we present a discussion around data as it relates to the healthcare domain and offer insight into the challenges each may impose on the efficiency of machine learning techniques. Comment: Healthcare Informatics, Machine Learning, Knowledge Discovery: 20 Pages, 1 Figure

arXiv.org e-Print Archive

Crossref

Constraint-based Sequential Pattern Mining with Decision Diagrams

Authors: Cire Andre A.; Hosseininasab Amin; van Hoeve Willem-Jan
Publication date: 14/11/2018

Constrained sequential pattern mining aims at identifying frequent patterns on a sequential database of items while observing constraints defined over the item attributes. We introduce novel techniques for constraint-based sequential pattern mining that rely on a multi-valued decision diagram representation of the database. Specifically, our representation can accommodate multiple item attributes and various constraint types, including a number of non-monotone constraints. To evaluate the applicability of our approach, we develop an MDD-based prefix-projection algorithm and compare its performance against a typical generate-and-check variant, as well as a state-of-the-art constraint-based sequential pattern mining algorithm. Results show that our approach is competitive with or superior to these other methods in terms of scalability and efficiency. Comment: AAAI201

arXiv.org e-Print Archive

University of Toronto Research Repository

Association for the Advancement of Artificial Intelligence: AAAI Publications

HANDLING MISSING ATTRIBUTE VALUES IN DECISION TABLES USING VALUED TOLERANCE APPROACH

Author: Vasudevan Supriya
Publication venue: 'Paleontological Institute at The University of Kansas'
Publication date: 01/01/2008

Rule induction is one of the key areas in data mining, as it is applied to a large amount of real-life data. However, in such real-life data, the information is incompletely specified most of the time. To induce rules from these incomplete data, more powerful algorithms are necessary. This research work mainly focuses on a probabilistic approach based on the valued tolerance relation. This thesis is divided into two parts. The first part describes the implementation of the valued tolerance relation. The induced rules are then evaluated based on the error rate due to incorrectly classified and unclassified examples. The second part compares the rules induced by the previously implemented MLEM2 algorithm with the rules induced by the valued tolerance based approach, which was implemented as part of this research. Hence, through this thesis, the error rates for the MLEM2 algorithm and the valued tolerance based approach are compared and the results are documented.

KU ScholarWorks

ACon: A learning-based approach to deal with uncertainty in contextual requirements at runtime

Authors: Damian Daniela; Franch Gutiérrez Javier; Knauss Alessia; Müller Haussi A.; Rook Angela; Thomo Alex
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

Context: Runtime uncertainty, such as an unpredictable operational environment and failure of the sensors that gather environmental data, is a well-known challenge for adaptive systems. Objective: To correctly execute requirements that depend on context, the system needs up-to-date knowledge about the context relevant to such requirements. Techniques to cope with uncertainty in contextual requirements are currently underrepresented. In this paper we present ACon (Adaptation of Contextual requirements), a data-mining approach to deal with runtime uncertainty affecting contextual requirements. Method: ACon uses feedback loops to maintain up-to-date knowledge about contextual requirements based on current context information in which contextual requirements are valid at runtime. Upon detecting that contextual requirements are affected by runtime uncertainty, ACon analyses and mines contextual data to (re-)operationalize context and therefore update the information about contextual requirements. Results: We evaluate ACon in an empirical study of an activity scheduling system used by a crew of 4 rowers in a wild and unpredictable environment using a complex monitoring infrastructure. Our study focused on evaluating the data mining part of ACon and analysed the sensor data collected onboard from 46 sensors and 90,748 measurements per sensor. Conclusion: ACon is an important step in dealing with uncertainty affecting contextual requirements at runtime while considering end-user interaction. ACon supports systems in analysing the environment to adapt contextual requirements and complements existing requirements monitoring approaches by keeping the requirements monitoring specification up-to-date. Consequently, it avoids manual analysis that is usually costly in today’s complex system environments. Peer Reviewed. Postprint (author's final draft)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

On the role of pre and post-processing in environmental data mining

Authors: Athanasiadis Ioannis; Comas Joaquim; Gibert Karina; Holmes Geoffrey; Izquierdo Joaquin; Sanchez-Marre Miquel
Publication venue: International Environmental Modelling and Software Society
Publication date: 01/01/2008

The quality of discovered knowledge depends heavily on data quality. Unfortunately, real data tend to contain noise, uncertainty, errors, redundancies, or even irrelevant information. The more complex the reality to be analyzed, the higher the risk of getting low-quality data. Knowledge Discovery from Databases (KDD) offers a global framework for preparing data in the right form to perform correct analyses. On the other hand, the quality of decisions taken upon KDD results depends not only on the quality of the results themselves, but also on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex, and environmental users particularly require clarity in their results. This paper provides some details about how this can be achieved and discusses the role of pre- and post-processing in the whole process of Knowledge Discovery in environmental systems.

Research Commons@Waikato