    Specious rules: an efficient and effective unifying method for removing misleading and uninformative patterns in association rule mining

    We present theoretical analysis and a suite of tests and procedures for addressing a broad class of redundant and misleading association rules we call \emph{specious rules}. Specious dependencies, also known as \emph{spurious}, \emph{apparent}, or \emph{illusory associations}, refer to a well-known phenomenon where marginal dependencies are merely products of interactions with other variables and disappear when conditioned on those variables. The most extreme example is Yule-Simpson's paradox, where two variables present positive dependence in the marginal contingency table but negative dependence in all partial tables defined by different levels of a confounding factor. It is accepted wisdom that in data of any nontrivial dimensionality it is infeasible to control for all of the exponentially many possible confounds of this nature. In this paper, we consider the problem of specious dependencies in the context of statistical association rule mining. We define specious rules and show they offer a unifying framework which covers many types of previously proposed redundant or misleading association rules. After theoretical analysis, we introduce practical algorithms for detecting and pruning out specious association rules efficiently under many key goodness measures, including mutual information and exact hypergeometric probabilities. We demonstrate that the procedure greatly reduces the number of associations discovered, providing an elegant and effective solution to the problem of association mining discovering large numbers of misleading and redundant rules.
    Comment: Note: This is a corrected version of the paper published in SDM'17. In the equation on page 4, the range of the sum has been corrected.
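
The Yule-Simpson reversal described in this abstract can be checked by hand on a small stratified table. The sketch below is only an illustration: the counts, the arm labels "A" and "B", and the helper names are assumptions chosen for the example and do not come from the paper, and it does not implement the paper's pruning algorithms.

```python
# Illustrative counts only (assumed for this example, not taken from the paper).
# Each stratum of the confounder maps an arm to (successes, total).
strata = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, total):
    """Success rate for one cell pair."""
    return successes / total


def risk_difference(table):
    """P(success | B) - P(success | A) for one 2x2 table."""
    return rate(*table["B"]) - rate(*table["A"])


def marginal_table(strata):
    """Collapse the partial tables by summing over the confounder."""
    out = {}
    for table in strata.values():
        for arm, (s, n) in table.items():
            s0, n0 = out.get(arm, (0, 0))
            out[arm] = (s0 + s, n0 + n)
    return out


marginal = marginal_table(strata)
marginal_diff = risk_difference(marginal)
partial_diffs = {name: risk_difference(t) for name, t in strata.items()}

print("marginal difference:", round(marginal_diff, 4))
for name, diff in partial_diffs.items():
    print(f"partial difference ({name}):", round(diff, 4))
```

With these counts the marginal difference is positive while both partial differences are negative, which is the sign reversal the abstract calls the most extreme case of a specious dependency.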

    Economic, demographic, and institutional determinants of life insurance consumption across countries.

    [Dataset available: http://hdl.handle.net/10411/12892]

    An Incremental Construction of Deep Neuro Fuzzy System for Continual Learning of Non-stationary Data Streams

    Existing fuzzy neural networks (FNNs) are mostly developed under a shallow network configuration, which has lower generalization power than deep structures. This paper proposes a novel self-organizing deep FNN, namely DEVFNN. Fuzzy rules can be automatically extracted from data streams or removed if they play a limited role during their lifespan. The structure of the network can be deepened on demand by stacking additional layers using a drift detection method which not only detects the covariate drift, variations of input space, but also accurately identifies the real drift, dynamic changes of both feature space and target space. DEVFNN is developed under the stacked generalization principle via the feature augmentation concept where a recently developed algorithm, namely gClass, drives the hidden layer. It is equipped with an automatic feature selection method which controls activation and deactivation of input attributes to induce varying subsets of input features. A deep network simplification procedure is put forward using the concept of hidden layer merging to prevent uncontrollable growth of the dimensionality of the input space, which arises from the feature augmentation approach to building a deep network structure. DEVFNN works in a sample-wise fashion and is compatible with data stream applications. The efficacy of DEVFNN has been thoroughly evaluated using seven datasets with non-stationary properties under the prequential test-then-train protocol. It has been compared with four popular continual learning algorithms and its shallow counterpart, against which DEVFNN demonstrates improved classification accuracy. Moreover, it is also shown that the concept drift detection method is an effective tool to control the depth of the network structure, while the hidden layer merging scenario is capable of simplifying the network complexity of a deep network with negligible compromise of generalization performance.
    Comment: This paper has been published in IEEE Transactions on Fuzzy Systems.
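
The prequential test-then-train protocol named in this abstract scores each incoming sample before the model is allowed to learn from it. A minimal sketch follows, assuming a toy stream and a hypothetical stand-in learner (`MajorityClass`); it is not DEVFNN and none of these names come from the paper.

```python
class MajorityClass:
    """Hypothetical stand-in learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = {}

    def predict(self, x):
        # No prediction is possible before any label has been seen.
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)

    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


def prequential_accuracy(model, stream):
    """Test on each sample first, then train on it, one sample at a time."""
    correct = 0
    seen = 0
    for x, y in stream:
        if model.predict(x) == y:  # test step
            correct += 1
        model.learn(x, y)          # train step
        seen += 1
    return correct / seen if seen else 0.0


# Toy stream of (features, label) pairs, assumed for illustration.
stream = [((0,), "a"), ((1,), "a"), ((2,), "b"), ((3,), "a")]
acc = prequential_accuracy(MajorityClass(), stream)
print("prequential accuracy:", acc)
```

Because every sample is used for testing before training, no separate hold-out set is needed, which is why the protocol suits sample-wise stream learners.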

    Geographic range size and evolutionary age in birds

    Together with patterns of speciation and extinction, post-speciation transformations in the range sizes of individual species determine the form of contemporary species-range-size distributions. However, the methodological problems associated with tracking the dynamics of a species' range size over evolutionary time have precluded direct study of such range-size transformations, although indirect evidence has led to several models being proposed describing the form that they might take. Here, we use independently derived molecular data to estimate ages of species in six monophyletic groups of birds, and examine the relationship between species age and global geographic range size. We present strong evidence that avian range sizes are not static over evolutionary time. In addition, it seems that, with the regular exception of certain taxa (for example island endemics and some threatened species), range-size transformations are non-random in birds. In general, range sizes appear to expand relatively rapidly post-speciation; subsequently, and perhaps more gradually, they then decline as species age. We discuss these results with reference to the various models of range-size dynamics that have been proposed.

    The Education of Real Estate Salespeople and the Value of the Firm

    In order to protect the public, most states require salespeople and brokers to meet specific licensing requirements, typically in the form of classroom instruction and/or successful completion of an examination. Frequently, however, real estate brokers require their sales staff to undertake education that exceeds these minimum requirements. In this study, we derive a theoretical model that shows how optimally timed, firm-provided education that exceeds legal minimums can increase staff productivity, reduce litigation risks and perhaps raise and/or maximize the expected value of the firm.