13 research outputs found

    Evaluation and optimization of frequent association rule based classification

    Deriving useful and interesting rules from a data mining system is an essential task. Problems such as the discovery of random and coincidental patterns, patterns with no significant value, and the generation of a large volume of rules from a database commonly occur. Work on sustaining the interestingness of rules generated by data mining algorithms is being actively examined and developed. In this paper, we present a systematic way to evaluate the association rules discovered by frequent itemset mining algorithms, combining common data mining and statistical interestingness measures, and outline an appropriate sequence for their use. The experiments are performed using a number of real-world datasets that represent diverse characteristics of data/items, and a detailed evaluation of the rule sets is provided. Empirical results show that with a proper combination of data mining and statistical analysis, the framework is capable of eliminating a large number of non-significant, redundant and contradictory rules while preserving relatively valuable rules with high accuracy and coverage when used for classification. Moreover, the results reveal important characteristics of mining frequent itemsets, and the impact of the confidence measure on the classification task.

    Irrelevant feature and rule removal for structural associative classification

    In the classification task, the presence of irrelevant features can significantly degrade the performance of classification algorithms, in terms of additional processing time, more complex models and the likelihood that the models generalize poorly due to overfitting. Practical applications of association rule mining often suffer from an overwhelming number of generated rules, many of which are not interesting or not useful for the application in question. Removing rules comprised of irrelevant features can significantly improve the overall performance. In this paper, we explore and compare the use of a feature selection measure to filter out unnecessary and irrelevant features/attributes prior to association rule generation. The experiments are performed using a number of real-world datasets that represent diverse characteristics of data items. Empirical results confirm that by utilizing feature subset selection prior to association rule generation, a large number of rules with irrelevant features can be eliminated. More importantly, the results reveal that removing rules that hold irrelevant features improves the accuracy rate while retaining the rule coverage rate of structural associative classification.

    Quality and interestingness of association rules derived from data mining of relational and semi-structured data

    Deriving useful and interesting rules from a data mining system is an essential task. Problems such as the discovery of random and coincidental patterns, patterns with no significant value, and the generation of a large volume of rules from a database commonly occur. Work on sustaining the interestingness of rules generated by data mining algorithms is being actively examined and developed. As data mining techniques are data-driven, it is beneficial to affirm the rules using a statistical approach. It is important to establish the ways in which the existing statistical measures and constraint parameters can be effectively utilized, and the sequence of their usage. In this thesis, a systematic way is presented to evaluate the association rules discovered by frequent, closed and maximal itemset mining algorithms, and by frequent subtree mining algorithms, including the rules based on induced, embedded and disconnected subtrees. With reference to frequent subtree mining, a new direction is also explored, based on the DSM approach, which is capable of preserving all information from a tree-structured database in a flat data format, consequently enabling the direct application of a wider range of data mining analyses/techniques to tree-structured data. The implications of this approach were investigated, and it was found that basing rules on disconnected subtrees can be useful in terms of increasing the accuracy and the coverage rate of the rule set. A strategy is developed that combines data mining and statistical measurement techniques such as sampling, redundancy and contradiction checks, and correlation and regression analysis to evaluate the rules. This framework is then applied to real-world datasets that represent diverse characteristics of data/items. Empirical results show that with a proper combination of data mining and statistical analysis, the proposed framework is capable of eliminating a large number of non-significant, redundant and contradictory rules while preserving relatively valuable high-accuracy rules. Moreover, the results reveal the important characteristics of, and differences between, mining frequent, closed or maximal itemsets and mining frequent subtrees, including the rules based on induced, embedded and disconnected subtrees, as well as the impact of the confidence measure on the prediction and classification task.

    Front-Line Physicians' Satisfaction with Information Systems in Hospitals

    Day-to-day operations management in hospital units is difficult due to continuously varying situations, the several actors involved and the vast number of information systems in use. The aim of this study was to describe front-line physicians' satisfaction with the existing information systems needed to support day-to-day operations management in hospitals. A cross-sectional survey was used, and data chosen with stratified random sampling were collected in nine hospitals. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65% (n = 111). The physicians reported that information systems support their decision making to some extent, but that they do not improve access to information, nor are they tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and that they would prefer a single information system for accessing important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision-making process. Peer reviewed.

    Interestingness of association rules using symmetrical tau and logistic regression

    While association rule mining is one of the most popular data mining techniques, it usually results in many rules, some of which are not considered interesting or significant for the application at hand. In this paper, we take a systematic approach to verifying the discovered rules and provide a rigorous statistical approach supporting this framework. The proposed strategy combines data mining and statistical measurement techniques, including redundancy analysis, sampling and multivariate statistical analysis, to discard the non-significant rules. A real-world dataset is used to demonstrate how the proposed unified framework can discard many of the redundant or non-significant rules and still preserve the high accuracy of the rule set as a whole.

    Dispelling illusions of truth: exploring the factors that lead to inflated truth judgements

    Judging the truth of incoming information is one of the most challenging and important tasks that people face every day. How do people decide what is true and what is not? When constructing truth judgements, people use both declarative information and the subtler cues that accompany information processing. These subtle, non-content-based cues that make information feel truer are termed “truth effects”. This thesis uses trivia statements to investigate the robustness of two such non-probative truth effects, driven by repetition (the illusory truth effect) and concrete language (the linguistic concreteness effect). Neither concreteness nor repetition provides substantive evidence, yet people believe repeated statements more than new ones, and concretely worded statements feel truer than their abstract counterparts. Truth effects can have direct implications in our digital world, where information may be spurious, and communicators can enlist subtle cues to persuade the addressee without detection. Throughout the thesis I apply open methods that have the potential to increase the quality, replicability, and transparency of research. In Chapter 2, I set out to replicate and extend the linguistic concreteness effect. Across two experiments I did not observe an effect larger than the smallest effect size of interest. Therefore the remainder of the thesis focuses on the illusory truth effect. Chapter 3 uses systematic mapping to synthesise and catalogue the entire illusory truth literature in terms of methods, findings, and transparency. The results reveal a lack of standardisation in the methodology employed, and of transparency in reporting. I also find that greater diversity of stimuli and participants is required for generalisability. In Chapter 4, my final study used a longitudinal design to test whether the delay between repetitions moderates illusory truth. Contrary to previous claims, I find that across four intervals (immediately, one day, one week, one month) the effect diminishes as delay increases. This thesis contributes to knowledge by providing an overview of the current state of truth effects research. It demonstrates that there is considerable cause to doubt the existence of a linguistic concreteness effect, and by implication, there is reason to be sceptical about other truth effects based on subtle manipulations. In contrast, this thesis establishes confidence that the illusory truth effect is robust but reduces with time. This finding has implications for the mechanisms thought to underlie truth effects. Overall, the results suggest that when truth effects research uses rigorous, transparent, and unbiased methods, it paints a different picture from that of the existing literature.

    Attention Restraint, Working Memory Capacity, and Mind Wandering: Do Emotional Valence or Intentionality Matter?

    Attention restraint appears to mediate the relationship between working memory capacity (WMC) and mind wandering (Kane et al., 2016). Prior work has identified two dimensions of mind wandering: emotional valence and intentionality. However, less is known about how WMC and attention restraint correlate with these dimensions. The current study examined the relationship between WMC, attention restraint, and mind wandering by emotional valence and intentionality. A confirmatory factor analysis demonstrated that WMC and attention restraint were strongly correlated, but only attention restraint was related to overall mind wandering, consistent with prior findings. However, when examining the emotional valence of mind wandering, attention restraint and WMC were related to negatively and positively valenced, but not neutral, mind wandering. Attention restraint was also related to intentional but not unintentional mind wandering. These results suggest that WMC and attention restraint predict some, but not all, types of mind wandering.

    Sustainable Development of Real Estate

    This monograph reviews in detail the research, theoretical and practical tasks of the sustainable real estate development process, and presents particular examples. The concept of a modern real estate development model and of the developer is discussed, and the peculiarities of the development of the built environment and real estate objects are analyzed, as are assessment methods, models, and the management of real estate and investments aimed at increasing the value of the object. The theoretical and practical analyses presented in the monograph indicate that intelligent and augmented reality technologies allow business managers to achieve better results in work quality and to organize a creative team of developers that can deliver higher-quality products for society. The edition presents knowledge on the economic, legal, technological, technical, organizational, social, cultural, ethical, psychological, environmental and management aspects that are important for the development of real estate: publicly accepted sustainable development principles, urban development and aesthetic values, territory planning, public participation and heritage protection. It is acknowledged that economic crises are inevitable, and the methods provided should help to reduce possible losses. References to the most recent international scientific literature are presented in the monograph. The monograph is intended for researchers and for MSc and PhD students of construction economics and real estate development. The book may also be useful for other researchers, for MSc and PhD students of economics, management and other specialities, and for business specialists in real estate. The publication of the monograph was funded by the European Social Fund under project No. VP1-2.2-ŠMM-07-K-02-060, Development and Implementation of Joint Master’s Study Programme “Sustainable Development of the Built Environment”.

    Latent variable methods for visualization through time
