17 research outputs found

    Rule Learning: From Local Patterns to Global Models

    Get PDF
    In many areas of daily life (e.g. in e-commerce or social networks), massive amounts of data are collected and stored in databases for future use. Even though the specific information contained in the collected data may already be interesting, more general insights into the data are usually more useful. A data analysis should therefore aim at discovering such pieces of knowledge, but human inspection becomes less and less feasible as the databases grow ever larger. To this end, the KDD process (short for "Knowledge Discovery in Databases") provides the tools for a semi-automatic data analysis. Data mining, the main component of the KDD process, searches the stored facts for regularities that represent pieces of knowledge. Usually, these regularities are formulated either as local patterns, which describe only local characteristics of the data, or as global models, which explain the data as a whole. In our work, we concentrate on local patterns and global models that can be used to predict a feature of interest, or class attribute, for future, unseen data. Predictive local patterns may be used to obtain global predictions in two ways: the integrative approach treats the local patterns as building blocks from which a global model is constructed, while the decoding approach aggregates the predictions of the local patterns into a single global prediction. Although both approaches are promising, the question of how local patterns may be employed for global modelling has not yet been answered satisfactorily. We therefore consider three important aspects of this question in this work.
    The first aspect is how a set of local patterns may be employed to obtain optimal global predictions. The LeGo framework (an acronym for "from local patterns to global models") provides an approach to this question. It divides the data mining process into three consecutive steps: local pattern discovery generates a set of local patterns, pattern set discovery selects a smaller subset from this set, and global modelling employs the reduced pattern set to build a global model. Many methods are available for each step, so we employ a selection of methods for each step and empirically evaluate their performance with respect to this first aspect.
    The second aspect is how a set of local patterns may be utilised to obtain optimal class probabilities. Class probabilities are often more useful than a plain prediction because they can serve as a confidence measure for the prediction (e.g. in voting schemes). We divide this aspect into two subtasks: probability estimation and probability aggregation. Probability estimation calculates class probabilities given a single local pattern; for this task, we consider basic probability estimation methods as well as shrinkage, a technique for smoothing the basic probability estimates. Furthermore, we examine the effect of the local pattern discovery on the quality of the probability estimates. Probability aggregation decodes the probability estimates of multiple patterns into a single estimate; for this purpose, we evaluate the performance of a selection of aggregation methods.
    The third aspect is how a set of local patterns may be transformed into a compact and understandable model. Local pattern sets are usually hard to interpret, and using them for prediction requires additional machinery (e.g. voting schemes). Both issues can be resolved at once if the local patterns are employed to obtain a global model. To this end, we introduce rule stacking, a novel approach to global modelling. Rule stacking advances the standard stacking approach in two respects: the meta data generation and an additional retransformation of the meta model. In this way, we obtain a compressed and interpretable global model that is directly applicable to future data.
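
    The three LeGo steps can be pictured as a simple pipeline. The following Python sketch is only an illustration under assumptions of our own (the Pattern class, the quality-based top-k selection, and the function parameters are hypothetical placeholders, not the methods evaluated in this work):

        # Hypothetical sketch of the LeGo pipeline; representation and learners are assumptions.
        from dataclasses import dataclass
        from typing import Callable, List, Sequence

        @dataclass
        class Pattern:
            """A local pattern: a coverage test, a predicted class, and a quality score."""
            covers: Callable[[dict], bool]
            predicted_class: str
            quality: float               # e.g. precision of the pattern on the training data

        def lego_pipeline(examples: Sequence[dict], labels: Sequence[str],
                          mine_local_patterns, build_global_model, k: int = 50):
            # Step 1: local pattern discovery
            patterns: List[Pattern] = mine_local_patterns(examples, labels)
            # Step 2: pattern set discovery -- here simply the k best patterns by quality
            pattern_set = sorted(patterns, key=lambda p: p.quality, reverse=True)[:k]
            # Step 3: global modelling on the reduced pattern set
            return build_global_model(pattern_set, examples, labels)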

    Pairwise Naive Bayes Classifier

    Get PDF
    Class binarizations are effective methods that break a multi-class problem down into several two-class (binary) problems in order to improve weak learners. This paper analyzes the effects these methods have when a Naive Bayes learner is chosen as the base classifier. We consider the known unordered and pairwise class binarizations and propose an alternative approach for a pairwise calculation of a modified Naive Bayes classifier.
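
    As a point of reference for the pairwise setting, the following sketch shows the standard one-vs-one (pairwise) class binarization with a Naive Bayes base learner in scikit-learn; it illustrates the plain decomposition only, not the modified pairwise Naive Bayes calculation proposed in the paper, and the dataset is chosen merely for illustration:

        # Pairwise (one-vs-one) binarization with Naive Bayes base classifiers.
        from sklearn.datasets import load_iris
        from sklearn.model_selection import cross_val_score
        from sklearn.multiclass import OneVsOneClassifier
        from sklearn.naive_bayes import GaussianNB

        X, y = load_iris(return_X_y=True)

        # One binary Naive Bayes model is trained for every pair of classes;
        # their votes are combined into a single multi-class prediction.
        pairwise_nb = OneVsOneClassifier(GaussianNB())
        print(cross_val_score(pairwise_nb, X, y, cv=5).mean())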

    Paarweiser Naive Bayes Klassifizierer (Pairwise Naive Bayes Classifier)

    No full text

    Rule Stacking: An approach for compressing an ensemble of rule sets into a single classifier

    No full text
    In this paper, we present an approach for compressing a rule-based pairwise classifier ensemble into a single rule set that can be directly used for classification. The key idea is to re-encode the training examples using information about which of the original rules cover the example, and to use these re-encoded examples for training a rule-based meta-level classifier. We not only show that this approach is more accurate than using the same classifier at the base level (which could have been expected for such a variant of stacking), but also demonstrate that the resulting meta-level rule set can be straightforwardly translated back into a rule set at the base level. Our key result is that the rule sets obtained in this way are of comparable complexity to those of the original rule learner, but considerably more accurate.
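
    The core mechanics of this approach, re-encoding examples by rule coverage and translating a meta-level rule back into base-level conditions, can be sketched as follows. This is a simplified illustration under our own assumptions (string rule bodies, lambda coverage tests, and a meta rule given as a list of rule indices), not the paper's implementation:

        # Coverage-based re-encoding and back-translation of meta-level rules (illustrative).
        from typing import Callable, List, Sequence, Tuple

        BaseRule = Tuple[str, Callable[[dict], bool]]   # (readable rule body, coverage test)

        def reencode(examples: Sequence[dict], rules: Sequence[BaseRule]) -> List[List[int]]:
            """Meta-level data: one binary feature per base rule (1 = the rule covers the example)."""
            return [[int(test(x)) for _, test in rules] for x in examples]

        def retranslate(meta_rule: List[int], rules: Sequence[BaseRule]) -> str:
            """Expand a meta-level rule, given as indices of required base rules, into base-level conditions."""
            return " AND ".join(rules[i][0] for i in meta_rule)

        # Two hypothetical base rules and one meta-level rule that requires both of them.
        rules: List[BaseRule] = [
            ("outlook = sunny", lambda x: x["outlook"] == "sunny"),
            ("humidity <= 75",  lambda x: x["humidity"] <= 75),
        ]
        print(reencode([{"outlook": "sunny", "humidity": 80}], rules))   # [[1, 0]]
        print(retranslate([0, 1], rules))    # outlook = sunny AND humidity <= 75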

    A Study of Probability Estimation Techniques for Rule Learning

    No full text
    Rule learning is known for its descriptive and therefore comprehensible classification models, which also yield good class predictions. However, in some application areas, we also need good class probability estimates. For other classification models, such as decision trees, a variety of techniques for obtaining good probability estimates have been proposed and evaluated. However, so far there has been no systematic empirical study of how these techniques can be adapted to probabilistic rules and how they affect probability-based rankings. In this paper, we apply several basic methods for the estimation of class membership probabilities to classification rules. We also study the effect of a shrinkage technique for merging the probability estimates of rules with those of their generalizations.
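
    To make the two ingredients concrete, the following sketch shows two widely used basic estimates (the Laplace and m-estimate) for a single rule, together with a simple shrinkage step towards the rule's generalization. The linear weighting used here is an assumption for illustration, not the shrinkage scheme evaluated in the paper:

        # Basic probability estimates for a rule and an illustrative shrinkage step.
        def laplace(covered_pos: int, covered_total: int, n_classes: int = 2) -> float:
            """Laplace estimate of P(class | rule fires)."""
            return (covered_pos + 1) / (covered_total + n_classes)

        def m_estimate(covered_pos: int, covered_total: int, prior: float, m: float = 2.0) -> float:
            """m-estimate: interpolates between the rule's precision and the class prior."""
            return (covered_pos + m * prior) / (covered_total + m)

        def shrink(rule_estimate: float, general_estimate: float, weight: float = 0.7) -> float:
            """Shrinkage (illustrative): smooth a rule's estimate with that of its generalization."""
            return weight * rule_estimate + (1.0 - weight) * general_estimate

        # A rule covering 8 positive out of 10 examples, with a class prior of 0.4,
        # and a (hypothetical) generalization covering 40 positive out of 80 examples.
        p_rule = m_estimate(8, 10, prior=0.4)
        p_general = m_estimate(40, 80, prior=0.4)
        print(shrink(p_rule, p_general))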