Testing Interestingness Measures in Practice: A Large-Scale Analysis of Buying Patterns

Author: Amer-Yahia Sihem
Kirchgessner Martin
Leroy Vincent
Mishra Shashwat
Publication date: 15/03/2016

Understanding customer buying patterns is of great interest to the retail industry and has been shown to benefit a wide variety of goals, ranging from managing stocks to implementing loyalty programs. Association rule mining is a common technique for extracting correlations such as "people in the South of France buy rosé wine" or "customers who buy pâté also buy salted butter and sour bread." Unfortunately, sifting through a high number of buying patterns is not useful in practice, because of the predominance of popular products in the top rules. As a result, a number of "interestingness" measures (over 30) have been proposed to rank rules. However, there is no agreement on which measures are more appropriate for retail data. Moreover, since pattern mining algorithms output thousands of association rules for each product, the ability of an analyst to rely on ranking measures to identify the most interesting ones is crucial. In this paper, we develop CAPA (Comparative Analysis of PAtterns), a framework that provides analysts with the ability to compare the outcome of interestingness measures applied to buying patterns in the retail industry. We report on how we used CAPA to compare 34 measures applied to over 1,800 stores of Intermarché, one of the largest food retailers in France.
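
To make the idea of an "interestingness" measure concrete, the following is a minimal Python sketch of three standard measures (support, confidence and lift) for a single association rule. It is not taken from the paper and is not part of CAPA: the transactions, item names and function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical toy baskets (illustrative only, not data from the paper).
transactions = [
    {"pate", "salted butter", "sour bread"},
    {"pate", "salted butter"},
    {"salted butter", "milk"},
    {"milk", "sour bread"},
    {"pate", "salted butter", "sour bread", "milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent.

    Assumes the antecedent occurs in at least one transaction.
    """
    return (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))

def lift(antecedent, consequent, transactions):
    """Confidence divided by the baseline support of the consequent.

    A value above 1 means the consequent is more frequent among baskets
    containing the antecedent than among baskets overall.
    """
    return (confidence(antecedent, consequent, transactions)
            / support(consequent, transactions))

rule_a = ({"pate"}, {"salted butter"})
rule_b = ({"milk"}, {"sour bread"})
for antecedent, consequent in (rule_a, rule_b):
    print(sorted(antecedent), "->", sorted(consequent),
          "confidence:", confidence(antecedent, consequent, transactions),
          "lift:", lift(antecedent, consequent, transactions))
```

In this toy data the first rule has confidence 1 partly because salted butter appears in four of the five baskets, and its lift of 5/4 corrects for that popularity. With more rules, different measures can produce different rankings, which is the kind of disagreement the abstract describes.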

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

arules - A Computational Environment for Mining Association Rules and Frequent Item Sets

Author: Bettina Grün
Kurt Hornik
Michael Hahsler

Mining frequent itemsets and association rules is a popular and well researched approach for discovering interesting relationships between variables in large databases. The R package arules presented in this paper provides a basic infrastructure for creating and manipulating input data sets and for analyzing the resulting itemsets and rules. The package also includes interfaces to two fast mining algorithms, the popular C implementations of Apriori and Eclat by Christian Borgelt. These algorithms can be used to mine frequent itemsets, maximal frequent itemsets, closed frequent itemsets and association rules.

Research Papers in Economics

A survey on utilization of data mining approaches for dermatological (skin) diseases prediction

Author: Adibi N
Ahmadzadeh MR
Barati E
Mohammadi A
Saraee MH
Publication venue: Cyber Journals
Publication date: 01/03/2011

Due to recent technological advances, large volumes of medical data are being collected. These data contain valuable information, so data mining techniques can be used to extract useful patterns. This paper introduces data mining and its various techniques and surveys the available literature on medical data mining. We focus mainly on the application of data mining to skin diseases. A categorization is provided based on the different data mining techniques, and the utility of the various methodologies is highlighted. Generally, association mining is suitable for extracting rules; it has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we summarize the different uses of classification in dermatology. It is one of the most important methods for the diagnosis of erythemato-squamous diseases. Methods used in this area include neural networks, genetic algorithms and fuzzy classification. Clustering is a useful method in medical image mining. The purpose of clustering techniques is to find structure in the given data by finding similarities between data items according to their characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we investigate some challenges that exist in mining skin data.

University of Salford Institutional Repository

Mining High Impact Combinations of Conditions from the Medical Expenditure Panel Survey

Author: Mohan Arjun
Publication venue: ScholarWorks@UMass Amherst
Publication date: 14/11/2023

The condition of multimorbidity — the presence of two or more medical conditions in an individual — is a growing phenomenon worldwide. In the United States, multimorbid patients represent more than a third of the population and the trend is steadily increasing in an already aging population. There is thus a pressing need to understand the patterns in which multimorbidity occurs, and to better understand the nature of the care that is required to be provided to such patients. In this thesis, we use data from the Medical Expenditure Panel Survey (MEPS) from the years 2011 to 2015 to identify combinations of multiple chronic conditions (MCCs). We first quantify the significant heterogeneity observed in these combinations and how often they are observed across the five years. Next, using two criteria associated with each combination -- (a) the annual prevalence and (b) the annual median expenditure -- along with the concept of non-dominated Pareto fronts, we determine the degree of impact each combination has on the healthcare system. Our analysis reveals that combinations of four or more conditions are often mixtures of diseases that belong to different clinically meaningful groupings such as the metabolic disorders (diabetes, hypertension, hyperlipidemia); musculoskeletal conditions (osteoarthritis, spondylosis, back problems etc.); respiratory disorders (asthma, COPD etc.); heart conditions (atherosclerosis, myocardial infarction); and mental health conditions (anxiety disorders, depression etc.). Next, we use unsupervised learning techniques such as association rule mining and hierarchical clustering to visually explore the strength of the relationships/associations between different conditions and condition groupings. 
This interactive framework gives epidemiologists and clinicians (in particular primary care physicians) a systematic approach to understanding the relationships between conditions and to building a strategy with regard to screening, diagnosis and treatment over the longer term, especially for individuals at risk of further complications. The findings from this study aim to create a foundation for future work where a more holistic view of multimorbidity is possible.
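
The non-dominated (Pareto) front mentioned in the abstract can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the thesis's actual method: the combination names and numbers are made up, the function name `pareto_front` is hypothetical, and it assumes that higher prevalence and higher median expenditure both mean higher impact. It computes only the first front, whereas the abstract speaks of fronts in the plural.

```python
# Hypothetical condition combinations: (annual prevalence, annual median expenditure).
# The values are invented for illustration and do not come from MEPS.
combinations = {
    "A": (0.10, 2000),
    "B": (0.05, 9000),
    "C": (0.04, 3000),
    "D": (0.08, 5000),
}

def pareto_front(points):
    """Return the names of points that no other point dominates.

    A point dominates another if it is at least as large on both criteria
    and strictly larger on at least one of them.
    """
    front = []
    for name, (prevalence, cost) in points.items():
        dominated = any(
            p2 >= prevalence and c2 >= cost and (p2 > prevalence or c2 > cost)
            for other, (p2, c2) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)

print(pareto_front(combinations))
```

Here "C" is dominated by "B", which is both more prevalent and more expensive, so it drops out of the front. "A", "B" and "D" each win on at least one criterion against every other combination, so none of them can be ranked below another without choosing a trade-off between the two criteria.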

ScholarWorks@UMass Amherst

An Internet-enabled Knowledge Discovery Process

Author: Anand SS
Buchner AG
Hughes John
Mulvenna Maurice
Publication venue: 'PIDCC- Revista em Propriedade Intelectual - Direito Contemporaneo e Constituicao'
Publication date: 06/05/1999

Ulster University's Research Portal

Essays in Database Marketing (データベース･マーケテイングにおけるエッセイ)

Author: Kholod Marina Viktorovna
Publication date: 07/09/2011

Doctor of Economics (Management

Tohoku University Repository (TOUR) / 東北大学機関リポジトリ

Rule Induction-Based Knowledge Discovery for Energy Efficiency

Author: Armour S
Chen Q
Fan Z
Kaleshi D
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

Rule induction is a practical approach to knowledge discovery. Provided that a problem is defined, rule induction returns the knowledge that addresses the goal of that problem as if-then rules. The primary goals of knowledge discovery are prediction and description. Knowledge represented as rules is easy to understand, which helps users make decisions. This paper presents the potential of rule induction for energy efficiency. In particular, three rule induction techniques are applied to derive knowledge from a dataset of thousands of Irish electricity customers’ time-series power consumption records, socio-demographic details, and other information, in order to address the following four problems: 1) discovering mathematically interesting knowledge that could be found useful; 2) estimating power consumption features for customers, so that personalized tariffs can be assigned; 3) targeting a subgroup of customers with high potential for peak demand shifting; and 4) identifying customer attitudes that dominate energy conservation.

Keele Research Repository

Crossref

Explore Bristol Research