386 research outputs found

    Categorization of interestingness measures for knowledge extraction

    Finding interesting association rules is an important and active research field in data mining. The algorithms of the Apriori family are based on two rule extraction measures, support and confidence. Although these two measures have the virtue of being algorithmically fast, they generate a prohibitive number of rules, most of which are redundant and irrelevant. It is therefore necessary to use further measures that filter out uninteresting rules. Many survey studies of interestingness measures have since been carried out from several points of view. Various studies have sought to identify "good" properties of rule extraction measures, and these properties have been assessed on 61 measures. The purpose of this paper is twofold: first, to extend the number of measures and properties studied and to formalize the properties proposed in the literature; second, in the light of this formal study, to categorize the studied measures. The paper thus identifies categories of measures in order to help users efficiently select one or more appropriate measures during the knowledge extraction process. Evaluating the properties on the 61 measures enabled us to identify 7 classes of measures, which we obtained using two different clustering techniques. Comment: 34 pages, 4 figures
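As a reading aid for the abstract above, here is a minimal sketch of the two Apriori rule extraction measures it names, support and confidence. The toy transactions and the function names are assumptions made for illustration and do not come from the paper.

```python
from fractions import Fraction

def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def confidence(transactions, antecedent, consequent):
    """support(A and B) / support(A) for the rule A -> B.

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    return support(transactions, both) / support(transactions, a)

# Hypothetical toy data: four market baskets.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support(transactions, {"bread", "milk"})          # 2 of 4 baskets
rule_confidence = confidence(transactions, {"bread"}, {"milk"})  # 2 of 3 bread baskets
print(rule_support, rule_confidence)
```

Exact fractions are used so that the values can be compared without floating-point tolerance. Both measures need only counts, which is why they are cheap to compute and also why they alone leave many redundant rules unfiltered.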

    Analysis of monotonicity properties of some rule interestingness measures

    One of the crucial problems in the field of knowledge discovery is the development of good interestingness measures for evaluating discovered patterns. In this paper, we consider quantitative, objective interestingness measures for "if..., then..." association rules. We focus on three popular interestingness measures, namely the rule interest function of Piatetsky-Shapiro, the gain measure of Fukuda et al., and the dependency factor used by Pawlak. We verify whether they satisfy the valuable property M of monotonic dependency on the number of objects that satisfy, or fail to satisfy, the premise or the conclusion of a rule, as well as the property of hypothesis symmetry (HS). Moreover, analytically and through experiments, we show an interesting relationship between those measures and two other commonly used measures, rule support and anti-support.
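The three measures named in the abstract above can be sketched from raw counts. The formulas below are the commonly cited forms and are an assumption here, since the abstract does not state them: rule interest as `ab - a*b/n`, gain as `ab - theta*a` with a user-chosen constant `theta`, and the dependency factor as `(conf - P(B)) / (conf + P(B))`. The argument names (`n`, `a`, `b`, `ab`) and the example counts are hypothetical.

```python
from fractions import Fraction

def rule_interest(n, a, b, ab):
    """Piatetsky-Shapiro rule interest: observed co-occurrences minus the
    count expected if premise and conclusion were independent."""
    return Fraction(ab) - Fraction(a * b, n)

def gain(a, ab, theta):
    """Gain in the form usually attributed to Fukuda et al.; theta is a
    constant chosen by the user (assumed to lie between 0 and 1)."""
    return Fraction(ab) - Fraction(theta) * a

def dependency_factor(n, a, b, ab):
    """Dependency factor in the form usually attributed to Pawlak."""
    conf = Fraction(ab, a)   # confidence of the rule
    p_b = Fraction(b, n)     # relative frequency of the conclusion
    return (conf - p_b) / (conf + p_b)

# Hypothetical counts: n objects in total, a satisfy the premise,
# b satisfy the conclusion, ab satisfy both.
n, a, b, ab = 10, 5, 4, 3

ri = rule_interest(n, a, b, ab)
g = gain(a, ab, Fraction(1, 2))
eta = dependency_factor(n, a, b, ab)
print(ri, g, eta)
```

All three values are positive for these counts, which indicates that the premise and conclusion co-occur more often than independence (or the chosen threshold) would predict.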

    Channels’ Confirmation and Predictions’ Confirmation: From the Medical Test to the Raven Paradox

    After long arguments between positivism and falsificationism, the verification of universal hypotheses was replaced with the confirmation of uncertain major premises. Unfortunately, Hempel proposed the Raven Paradox. Then, Carnap used the increment of logical probability as the confirmation measure. Since then, many confirmation measures have been proposed. Among them, measure F, proposed by Kemeny and Oppenheim, possesses the symmetries and asymmetries proposed by Eells and Fitelson, the monotonicity proposed by Greco et al., and the normalizing property suggested by many researchers. Based on semantic information theory, a measure b* similar to F is derived from the medical test. Like the likelihood ratio, measures b* and F can only indicate the quality of channels or testing means, not the quality of probability predictions. Furthermore, it is still not easy to use b*, F, or another measure to clarify the Raven Paradox. For this reason, a measure c* similar to the correct rate is derived. Measure c* supports the Nicod Criterion and undermines the Equivalence Condition, and hence can be used to eliminate the Raven Paradox. One example indicates that measures F and b* are helpful for diagnosing infection with the Novel Coronavirus, whereas most popular confirmation measures are not. Another example reveals that none of the popular confirmation measures can explain why a black raven confirms "Ravens are black" more strongly than a piece of chalk does. Measures F, b*, and c* indicate that having fewer counterexamples is more important than having more positive examples, and hence are compatible with Popper's falsification thought.
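To make the medical-test reading of measure F concrete, the sketch below uses the standard Kemeny-Oppenheim form, F = [P(e|h) - P(e|not h)] / [P(e|h) + P(e|not h)]. This formula is supplied here as an assumption, since the abstract does not spell it out, and the sensitivity and specificity values are invented for illustration. Measures b* and c* are not sketched because the abstract does not define them.

```python
from fractions import Fraction

def kemeny_oppenheim_f(p_e_given_h, p_e_given_not_h):
    """Kemeny-Oppenheim confirmation measure, ranging over [-1, 1].

    p_e_given_h:     P(e | h), e.g. the sensitivity of a test
    p_e_given_not_h: P(e | not h), e.g. the false-positive rate
    """
    return (p_e_given_h - p_e_given_not_h) / (p_e_given_h + p_e_given_not_h)

# Hypothetical test: sensitivity 0.9, specificity 0.8.
sensitivity = Fraction(9, 10)
false_positive_rate = 1 - Fraction(8, 10)

f_positive = kemeny_oppenheim_f(sensitivity, false_positive_rate)
print(f_positive)
```

F depends only on the two conditional probabilities and not on the prior P(h), which is consistent with the abstract's remark that it reflects the quality of the channel or testing means rather than of a probability prediction.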

    Feeling the landscape: six psychological studies into landscape experience

    The six studies in this dissertation examine a number of practical and theoretical questions concerning the experience of landscape. Landscape experience is defined as a dynamic process, the result of interactions between culturally and biologically determined general determinants of experience. The studies test several different psychological theories, and together they demonstrate the importance of psychological research into landscape experience. It is the application of methodologies and theoretical perspectives from psychology that has made it possible to arrive at the insights into the interaction between people and landscape that result from this study.

    A mathematical theory of evidence for G.L.S. Shackle

    Evidence Theory is a branch of mathematics that concerns the combination of empirical evidence in an individual's mind in order to construct a coherent picture of reality. Designed to deal with unexpected empirical evidence suggesting new possibilities, evidence theory has a lot in common with Shackle's idea of decision-making as a creative act. This essay investigates this connection in detail, pointing to the usefulness of evidence theory to formalise and extend Shackle's decision theory. In order to ease a proper framing of the issues involved, evidence theory is not only compared with Shackle's ideas but also with additive and sub-additive probability theories. Furthermore, the presentation of evidence theory does not refer to the original version only, but takes account of its most recent developments, too.

    Macroeconomic Applications of Bayesian Model Averaging

    Bayesian Model Averaging (BMA) is a common econometric tool to assess the uncertainty regarding model specification and parameter inference and is widely applied in fields where no strong theoretical guidelines are present. Its major advantage over single-equation models is the combination of evidence from a large number of specifications. The three papers included in this thesis all investigate model structures in the BMA model space. The first contribution evaluates how priors can be chosen to enforce model structures in the presence of interaction terms and multicollinearity. This is linked to a discussion in the Journal of Applied Econometrics on whether being a Sub-Saharan African country makes a difference for growth modelling. The second essay is concerned with clusters of different models in the model space. We apply Latent Class Analysis to the set of models sampled from BMA and identify different subsets (kinds) of models for two well-known growth data sets. The last paper focuses on the application of "jointness", which tries to find bivariate relationships between regressors in BMA. Accordingly, this approach attempts to identify substitutes and complements by linking the econometric discussion on this subject to the field of Machine Learning. (author's abstract)