Mining for Useful Association Rules Using the ATMS
Association rule mining has achieved a great deal in the area of knowledge discovery in databases. In recent years, the quality of the extracted association rules has drawn increasing attention from researchers in the data mining community. One major concern is the size of the extracted rule set. Very often tens of thousands of association rules are extracted, many of which are redundant and thus useless. In this paper, we first analyze the redundancy problem in association rules and then propose a novel ATMS-based method for extracting non-redundant association rules.
Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules
Association rules are among the most widely employed data analysis methods in
the field of Data Mining. An association rule is a form of partial implication
between two sets of binary variables. In the most common approach, association
rules are parameterized by a lower bound on their confidence, which is the
empirical conditional probability of their consequent given the antecedent,
and/or by some other parameter bounds such as "support" or deviation from
independence. We study here notions of redundancy among association rules from
a fundamental perspective. We see each transaction in a dataset as an
interpretation (or model) in the propositional logic sense, and consider
existing notions of redundancy, that is, of logical entailment, among
association rules, of the form "any dataset in which this first rule holds must
obey also that second rule, therefore the second is redundant". We discuss
several existing alternative definitions of redundancy between association
rules and provide new characterizations and relationships among them. We show
that the main alternatives we discuss correspond actually to just two variants,
which differ in the treatment of full-confidence implications. For each of
these two notions of redundancy, we provide a sound and complete deduction
calculus, and we show how to construct complete bases (that is,
axiomatizations) of absolutely minimum size in terms of the number of rules. We
explore finally an approach to redundancy with respect to several association
rules, and fully characterize its simplest case of two partial premises. Comment: LMCS accepted paper
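To make the abstract's definition of confidence concrete (the empirical conditional probability of the consequent given the antecedent), the following is a minimal sketch and not code from the paper. The function names `support` and `confidence`, the toy `transactions` list, and the convention of returning 0.0 when no transaction contains the antecedent are all assumptions made for illustration.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(itemset: Iterable[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(antecedent: Iterable[str],
               consequent: Iterable[str],
               transactions: List[Transaction]) -> float:
    """Empirical conditional probability of the consequent given the antecedent."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    covered = sum(1 for t in transactions if a <= t)
    if covered == 0:
        # Assumed convention: a rule whose antecedent never occurs gets 0.0.
        return 0.0
    return sum(1 for t in transactions if both <= t) / covered


# Hypothetical toy dataset: each transaction is a set of items.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
)]

print(support({"a"}, transactions))            # 3 of 4 transactions contain "a"
print(confidence({"a"}, {"b"}, transactions))  # 2 of the 3 "a" transactions contain "b"
```

One simple instance of the entailment-style redundancy the abstract describes: in any dataset, the confidence of a rule cannot increase when items are added to its consequent, so a rule {a} → {b, c} that meets a confidence threshold makes {a} → {b} redundant at that threshold.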
An information-driven framework for image mining
Image mining systems that can automatically extract semantically meaningful information (knowledge) from image data are increasingly in demand. The fundamental challenge in image mining is to determine how the low-level pixel representation contained in a raw image or image sequence can be processed to identify high-level spatial objects and relationships. To meet this challenge, we propose an efficient information-driven framework for image mining. We distinguish four levels of information: the Pixel Level, the Object Level, the Semantic Concept Level, and the Pattern and Knowledge Level. High-dimensional indexing schemes and retrieval techniques are also included in the framework to support the flow of information among the levels. We believe this framework represents the first step towards capturing the different levels of information present in image data and addressing the issues and challenges of discovering useful patterns/knowledge from each level.