Search CORE

3,396 research outputs found

Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules

Author: A Freitas
C C Aggarwal P S Y
D Gunopulos R Khardon, H Mannila, S Sal
Georg Gottlob
J-F Boulicaut A Bykowski, C Rigotti: Fr
J-L Guigues V Duquenne:
José Balcázar
R Agrawal T Imielinski, A Swam
R Dechter J Pearl:
R Khardon D Roth
T Calders B Goethals:
T Eiter G Gottlob
Publication venue: 'Logical Methods in Computer Science e.V.'
Publication date: 01/01/2009
Field of study

Association rules are among the most widely employed data analysis methods in the field of Data Mining. An association rule is a form of partial implication between two sets of binary variables. In the most common approach, association rules are parameterized by a lower bound on their confidence, which is the empirical conditional probability of their consequent given the antecedent, and/or by some other parameter bounds such as "support" or deviation from independence. We study here notions of redundancy among association rules from a fundamental perspective. We see each transaction in a dataset as an interpretation (or model) in the propositional logic sense, and consider existing notions of redundancy, that is, of logical entailment, among association rules, of the form "any dataset in which this first rule holds must obey also that second rule, therefore the second is redundant". We discuss several existing alternative definitions of redundancy between association rules and provide new characterizations and relationships among them. We show that the main alternatives we discuss correspond actually to just two variants, which differ in the treatment of full-confidence implications. For each of these two notions of redundancy, we provide a sound and complete deduction calculus, and we show how to construct complete bases (that is, axiomatizations) of absolutely minimum size in terms of the number of rules. We explore finally an approach to redundancy with respect to several association rules, and fully characterize its simplest case of two partial premises.Comment: LMCS accepted pape

arXiv.org e-Print Archive

CiteSeerX

Crossref

Episciences.org

Quantitative Redundancy in Partial Implications

Author: Balcázar José L.
Publication venue
Publication date: 01/01/2015
Field of study

We survey the different properties of an intuitive notion of redundancy, as a function of the precise semantics given to the notion of partial implication. The final version of this survey will appear in the Proceedings of the Int. Conf. Formal Concept Analysis, 2015.Comment: Int. Conf. Formal Concept Analysis, 201

arXiv.org e-Print Archive

UPCommons. Portal del coneixement obert de la UPC

The Bases of Association Rules of High Confidence

Author: Adaricheva Kira
Cabot-Miller Justin
Nation J. B.
Segal Oren
Sharafudinov Anuar
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 05/08/2018
Field of study

We develop a new approach for distributed computing of the association rules of high confidence in a binary table. It is derived from the D-basis algorithm in K. Adaricheva and J.B. Nation (TCS 2017), which is performed on multiple sub-tables of a table given by removing several rows at a time. The set of rules is then aggregated using the same approach as the D-basis is retrieved from a larger set of implications. This allows to obtain a basis of association rules of high confidence, which can be used for ranking all attributes of the table with respect to a given fixed attribute using the relevance parameter introduced in K. Adaricheva et al. (Proceedings of ICFCA-2015). This paper focuses on the technical implementation of the new algorithm. Some testing results are performed on transaction data and medical data.Comment: Presented at DTMN, Sydney, Australia, July 28, 201

arXiv.org e-Print Archive

Crossref

Relative Entailment Among Probabilistic Implications

Author: Atserias Albert
Balcázar José L.
Piceno Marie Ely
Publication venue
Publication date: 01/01/2019
Field of study

We study a natural variant of the implicational fragment of propositional logic. Its formulas are pairs of conjunctions of positive literals, related together by an implicational-like connective; the semantics of this sort of implication is defined in terms of a threshold on a conditional probability of the consequent, given the antecedent: we are dealing with what the data analysis community calls confidence of partial implications or association rules. Existing studies of redundancy among these partial implications have characterized so far only entailment from one premise and entailment from two premises, both in the stand-alone case and in the case of presence of additional classical implications (this is what we call "relative entailment"). By exploiting a previously noted alternative view of the entailment in terms of linear programming duality, we characterize exactly the cases of entailment from arbitrary numbers of premises, again both in the stand-alone case and in the case of presence of additional classical implications. As a result, we obtain decision algorithms of better complexity; additionally, for each potential case of entailment, we identify a critical confidence threshold and show that it is, actually, intrinsic to each set of premises and antecedent of the conclusion

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Episciences.org

Objective novelty of association rules: measuring the confidence boost

Author: Balcázar Navarro José Luis
Publication venue
Publication date: 01/01/2010
Field of study

On sait bien que la confiance des régles d’association n’est pas vraiment satisfaisant comme mésure d’interêt. Nous proposons, au lieu de la substituer par des autres mésures (soit, en l’employant de façon conjointe a des autres mésures), évaluer la nouveauté de chaque régle par comparaison de sa confiance par rapport á des régles plus fortes qu’on trouve au même ensemble de données. C’est á dire, on considère un seuil “relative” de confiance au lieu du seuil absolute habituel. Cette idée se précise avec la magnitude du “confidence boost”, mésurant l’increment rélative de confiance prés des régles plus fortes. Nous prouvons que nôtre proposte peut remplacer la “confidence width” et le blockage de régles employés a des publications précedentes.Postprint (author’s final draft

UPCommons. Portal del coneixement obert de la UPC

Closed-set-based discovery of representative association rules

Author: Balcázar Navarro José Luis
Gómez Pérez Domingo
Tîrnauca Cristina
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/01/2020
Field of study

The output of an association rule miner is often huge in practice. This is why several concise lossless representations have been proposed, such as the “essential” or “representative” rules. A previously known algorithm for mining representative rules relies on an incorrect mathematical claim, and can be seen to miss part of its intended output; in previous work, two of the authors of the present paper have offered a complete but, often, somewhat slower alternative. Here, we extend this alternative to the case of closure-based redundancy. The empirical validation shows that, in this way, we can improve on the original time efficiency, without sacrificing completeness.Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC