13,845 research outputs found

    Classification Using Association Rules

    This research investigates the use of an unsupervised learning technique, association rules, to make class predictions. Using association rules to make class predictions is a growing area of focus within data mining research. Research to date has focused predominantly on balanced datasets or synthesized imbalanced datasets, and concerns have been raised that algorithms using association rules for classification do not perform well on imbalanced datasets. This research comprehensively evaluates the accuracy of a number of association rule classifiers in predicting home loan sales in an Irish retail banking context. The experiments test three associative classifier algorithms (CBA, CMAR and SPARCCC) against two benchmark algorithms (conditional inference trees and random forests) on a naturally imbalanced dataset. The results show that the benchmark tree-based algorithms outperform the associative classifier models across a range of balanced accuracy measures. This research contributes to the growing body of research on extending association rules to make class predictions.
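
    As a rough illustration of the idea in this abstract, the sketch below mines single-item class association rules from a toy table, predicts with the highest-confidence matching rule, and scores the result with balanced accuracy. It is not CBA, CMAR or SPARCCC. The toy data, the thresholds and the function names `mine_rules`, `predict` and `balanced_accuracy` are all assumptions made for illustration.

```python
from collections import Counter

def mine_rules(rows, labels, min_support=0.2, min_confidence=0.6):
    """Mine rules of the form (feature, value) -> class from categorical rows."""
    n = len(rows)
    item_counts = Counter()   # occurrences of each (feature, value) item
    joint_counts = Counter()  # occurrences of an item together with a class
    for row, label in zip(rows, labels):
        for item in row.items():
            item_counts[item] += 1
            joint_counts[(item, label)] += 1
    rules = []
    for (item, label), joint in joint_counts.items():
        support = joint / n
        confidence = joint / item_counts[item]
        if support >= min_support and confidence >= min_confidence:
            rules.append((confidence, support, item, label))
    # Rank by confidence, then support (a simplified precedence order).
    rules.sort(key=lambda r: (r[0], r[1]), reverse=True)
    return rules

def predict(rules, row, default):
    """Return the class of the first (best-ranked) rule that matches the row."""
    for _, _, (feature, value), label in rules:
        if row.get(feature) == value:
            return label
    return default

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)

# Hypothetical, imbalanced toy data: 2 "loan" rows against 6 "none" rows.
rows = [
    {"income": "high", "saver": "yes"}, {"income": "high", "saver": "yes"},
    {"income": "high", "saver": "no"}, {"income": "low", "saver": "yes"},
    {"income": "low", "saver": "no"}, {"income": "low", "saver": "no"},
    {"income": "low", "saver": "yes"}, {"income": "low", "saver": "no"},
]
labels = ["loan", "loan", "none", "none", "none", "none", "none", "none"]

rules = mine_rules(rows, labels)
default = Counter(labels).most_common(1)[0][0]  # majority class as fallback
predictions = [predict(rules, row, default) for row in rows]
score = balanced_accuracy(labels, predictions)

# A classifier that always predicts the majority class reaches 75% plain
# accuracy on this data but only 0.5 balanced accuracy.
baseline = balanced_accuracy(labels, ["none"] * len(labels))
```

    Balanced accuracy averages recall over the classes, which is why a majority-class baseline scores poorly under it even when plain accuracy looks high on an imbalanced dataset.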

    Big data analytics for preventive medicine

    © 2019, Springer-Verlag London Ltd., part of Springer Nature. Medical data is among the most rewarding and yet most complicated data to analyze. How can healthcare providers use modern data analytics tools and technologies to analyze and create value from complex data? Data analytics promises to efficiently discover valuable patterns by analyzing large amounts of unstructured, heterogeneous, non-standard and incomplete healthcare data. It not only forecasts but also supports decision making, and it is increasingly regarded as a breakthrough in the ongoing effort to improve the quality of patient care and reduce healthcare costs. The aim of this study is to provide a comprehensive and structured overview of the extensive research on data analytics methods for disease prevention. The review first introduces disease prevention and its challenges, followed by traditional prevention methodologies. We summarize state-of-the-art data analytics algorithms used for classification of disease, clustering (unusually high incidence of a particular disease), anomaly detection (detection of disease) and association, together with their respective advantages, drawbacks and guidelines for selecting a specific model, followed by a discussion of recent developments and successful applications of disease prevention methods. The article concludes with open research challenges and recommendations.

    Empirical models, rules, and optimization

    This paper considers supply decisions by firms in a dynamic setting with adjustment costs and compares the behavior of an optimal control model to that of a rule-based system which relaxes the assumption that agents are explicit optimizers. In our approach, the economic agent uses believably simple rules in coping with complex situations. We estimate rules using an artificially generated sample obtained by running repeated simulations of a dynamic optimal control model of a firm's hiring/firing decisions. We show that (i) agents using heuristics can behave as if they were seeking rationally to maximize their dynamic returns; (ii) the approach requires fewer behavioral assumptions relative to dynamic optimization and the assumptions made are based on economically intuitive theoretical results linking rule adoption to uncertainty; (iii) the approach delineates the domain of applicability of maximization hypotheses and describes the behavior of agents in situations of economic disequilibrium. The approach adopted uses concepts from fuzzy control theory. An agent, instead of optimizing, follows Fuzzy Associative Memory (FAM) rules which, given input and output data, can be estimated and used to approximate any non-linear dynamic process. Empirical results indicate that the fuzzy rule-based system performs extremely well in approximating optimal dynamic behavior in situations with limited noise. Keywords: decision-making, econometric models, TMD.

    k-Nearest Neighbour Classifiers: 2nd Edition (with Python examples)

    Perhaps the most straightforward classifier in the arsenal of machine learning techniques is the Nearest Neighbour Classifier -- classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance because poor run-time performance is less of a problem these days, given the computational power that is available. This paper presents an overview of techniques for Nearest Neighbour classification, focusing on mechanisms for assessing similarity (distance), computational issues in identifying nearest neighbours, and mechanisms for reducing the dimension of the data. This paper is the second edition of a paper previously published as a technical report. Sections on similarity measures for time-series, retrieval speed-up and intrinsic dimensionality have been added. An Appendix is included providing access to Python code for the key methods. Comment: 22 pages, 15 figures: An updated edition of an older tutorial on kN
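
    The procedure described in this abstract can be sketched in a few lines of Python. This is a minimal brute-force version assuming Euclidean distance and a simple majority vote; it is not the code from the paper's Appendix, and the names `euclidean` and `knn_predict` and the toy points are hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Classify `query` by majority vote among its k nearest training examples."""
    # Brute force: rank every training example by its distance to the query.
    ranked = sorted(range(len(train_X)),
                    key=lambda i: distance(train_X[i], query))
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_y = ["a", "a", "a", "b", "b", "b"]

near_a = knn_predict(train_X, train_y, (1.1, 1.0))
near_b = knn_predict(train_X, train_y, (4.0, 4.0))
```

    The `distance` parameter is left pluggable because the choice of similarity measure is one of the main topics the abstract lists. The full scan over the training set is the run-time cost that retrieval speed-up techniques aim to reduce.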

    An academic review: applications of data mining techniques in finance industry

    With the development of Internet techniques, data volumes are doubling every two years, faster than predicted by Moore’s Law. Big Data Analytics has become particularly important for enterprise business. Modern computational technologies provide effective tools to help understand vast accumulations of data and to use this information to gain insights into the finance industry. Because the finance industry has no physical products to manufacture, data has become the most valuable asset of financial organisations in their search for actionable business insights. Data mining techniques help here by giving access to the right information at the right time. The finance industry uses these techniques in areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering, marketing and prediction of price movements, to name a few. This work surveys the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared to loan prediction, money laundering and time series prediction. Owing to the dynamics, uncertainty and variety of the data, nonlinear mapping techniques have been studied more deeply than linear techniques. The review also finds that hybrid methods are the most accurate in prediction, closely followed by neural network techniques. This survey offers a guide to the applications of data mining techniques in the finance industry and a summary of methodologies for researchers in this area. In particular, it gives a good overview of data mining techniques in computational finance for beginners who want to work in the field.

    Intelligent system for associative pattern identification in data

    Master's in Information Systems Management. The output of statistical tools consists of numerical results, and interpreting and understanding what has been generated falls to the person analysing those results. This task is often complicated by several factors, one of which is that the interpreter cannot extract from the results what is relevant for evaluating the formulated model and so cannot judge whether it is valid, which may lead to the use of models that are unreasonable and unfounded. With this in mind, a small system with associative data mining techniques was developed in a Linux environment. The system generates a report for each model that analyses the factors most relevant to model building, guiding the interpreter in deciding whether to validate and use the model or to reject it. The goals of this work were to learn the Python language as applied to data, to study data mining and its existing techniques and methods in depth, and to survey machine learning tools, so as to produce as a final product a system offering some of these techniques. The proposed work was completed with the creation of the system. Methods were formulated to produce a multiple linear regression model, a logistic regression model, a linear correlation model and an association rule model. For three of the models, the methods were built on machine learning libraries. For association rules, a method was written from scratch based on the FP-Growth algorithm.
    For many people statistics is difficult, and the output of analytic tools can be hard to understand. With this in mind, the project investigated the possibility of creating a system that builds several models for users and provides guidelines on the most important values to examine for each model. The goals of this project were to develop knowledge of data mining, learn how to use Python for data analysis, survey the existing machine learning libraries for Python, and use some data mining techniques to create a small system for associative models. The system can perform a linear regression, a logistic regression, a correlation coefficient calculation and an association rule mining algorithm. Each method produces an output containing its numerical results, together with guidelines that give general ideas, state the assumptions of the method and interpret the most important statistical values, to make all the methods easier to understand. The system was developed in Python. Three of the methods are based on machine learning algorithms. The association rule mining algorithm, FP-growth, was implemented from scratch. The system was built to run on Linux.
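
    To make the association rule output of such a system concrete, the sketch below finds frequent itemsets and derives rules with their support and confidence. It uses plain level-wise enumeration rather than FP-growth, which reaches the same itemsets through a compressed prefix tree, and it is not the thesis code. The function names `frequent_itemsets` and `rules_from`, the thresholds and the toy transactions are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset meeting min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= min_support:
                frequent[frozenset(candidate)] = count
                found = True
        if not found:
            # No frequent itemset of this size, so none can exist at a larger
            # size: every subset of a frequent itemset is itself frequent.
            break
    return frequent

def rules_from(frequent, n, min_confidence):
    """Derive (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                # The antecedent is a subset of a frequent itemset, so it is
                # guaranteed to be present in `frequent`.
                confidence = count / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent,
                                  count / n, confidence))
    return rules

# Hypothetical toy transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

frequent = frequent_itemsets(transactions, min_support=0.4)
rules = rules_from(frequent, len(transactions), min_confidence=0.7)
```

    Support is the fraction of transactions containing the whole itemset, and confidence is that count divided by the count of the antecedent alone. These are the kinds of values a per-model report would need to explain to the reader.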