Specious rules: an efficient and effective unifying method for removing misleading and uninformative patterns in association rule mining
We present theoretical analysis and a suite of tests and procedures for
addressing a broad class of redundant and misleading association rules we call
\emph{specious rules}. Specious dependencies, also known as \emph{spurious},
\emph{apparent}, or \emph{illusory associations}, refer to a well-known
phenomenon where marginal dependencies are merely products of interactions with
other variables and disappear when conditioned on those variables.
The most extreme example is Yule-Simpson's paradox where two variables
present positive dependence in the marginal contingency table but negative in
all partial tables defined by different levels of a confounding factor. It is
accepted wisdom that in data of any nontrivial dimensionality it is infeasible
to control for all of the exponentially many possible confounds of this nature.
In this paper, we consider the problem of specious dependencies in the context
of statistical association rule mining. We define specious rules and show they
offer a unifying framework which covers many types of previously proposed
redundant or misleading association rules. After theoretical analysis, we
introduce practical algorithms for detecting and pruning out specious
association rules efficiently under many key goodness measures, including
mutual information and exact hypergeometric probabilities. We demonstrate that
the procedure greatly reduces the number of associations discovered, providing
an elegant and effective solution to the problem of association mining
discovering large numbers of misleading and redundant rules. Comment: Note: This is a corrected version of the paper published in SDM'17.
In the equation on page 4, the range of the sum has been correcte
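The Yule-Simpson reversal described in this abstract can be checked with a few lines of arithmetic. The sketch below is not taken from the paper: the counts are illustrative, patterned on the well-known kidney-stone treatment example, and the names `counts`, `marginal_rate`, and `dependence_sign` are hypothetical. It shows treatment B looking better in the marginal table while treatment A is better in every partial table defined by the confounder (stone size).

```python
# Illustrative counts (an assumption for this sketch, not data from the paper):
# counts[stratum][treatment] = (successes, total)
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, total):
    """Success rate within one cell of a contingency table."""
    return successes / total


def marginal_rate(treatment):
    """Success rate for a treatment after pooling all strata."""
    successes = sum(counts[z][treatment][0] for z in counts)
    total = sum(counts[z][treatment][1] for z in counts)
    return successes / total


def dependence_sign(rate_b, rate_a):
    """+1 if B is associated with more success than A, -1 if less, 0 if equal."""
    if rate_b > rate_a:
        return 1
    if rate_b < rate_a:
        return -1
    return 0


# Sign of the dependence between "treatment is B" and "success", pooled.
marginal_sign = dependence_sign(marginal_rate("B"), marginal_rate("A"))

# Sign of the same dependence inside each level of the confounder.
partial_signs = {
    z: dependence_sign(rate(*counts[z]["B"]), rate(*counts[z]["A"]))
    for z in counts
}

print("marginal:", marginal_sign)
print("partial:", partial_signs)
```

The marginal sign is positive while both partial signs are negative, which is the pattern the abstract calls the most extreme form of a specious dependency.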
Economic, demographic, and institutional determinants of life insurance consumption across countries.
[Dataset available: http://hdl.handle.net/10411/12892]
An Incremental Construction of Deep Neuro Fuzzy System for Continual Learning of Non-stationary Data Streams
Existing FNNs are mostly developed under a shallow network configuration
having lower generalization power than that of deep structures. This paper
proposes a novel self-organizing deep FNN, namely DEVFNN. Fuzzy rules can be
automatically extracted from data streams or removed if they play a limited role
during their lifespan. The structure of the network can be deepened on demand
by stacking additional layers using a drift detection method which not only
detects the covariate drift, variations of input space, but also accurately
identifies the real drift, dynamic changes of both feature space and target
space. DEVFNN is developed under the stacked generalization principle via the
feature augmentation concept where a recently developed algorithm, namely
gClass, drives the hidden layer. It is equipped with an automatic feature
selection method which controls activation and deactivation of input attributes
to induce varying subsets of input features. A deep network simplification
procedure is put forward using the concept of hidden layer merging to prevent
uncontrollable growth of the input-space dimensionality due to the nature of
the feature augmentation approach in building a deep network structure. DEVFNN
works in a sample-wise fashion and is compatible with data stream
applications. The efficacy of DEVFNN has been thoroughly evaluated using seven
datasets with non-stationary properties under the prequential test-then-train
protocol. It has been compared with four popular continual learning algorithms
and its shallow counterpart, where DEVFNN demonstrates an improvement in
classification accuracy. Moreover, it is also shown that the concept drift
detection method is an effective tool to control the depth of network structure
while the hidden layer merging scenario is capable of simplifying the network
complexity of a deep network with negligible compromise of generalization
performance. Comment: This paper has been published in IEEE Transactions on Fuzzy System
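The prequential test-then-train protocol mentioned in this abstract can be sketched in a few lines: each sample is first used to score the current model and only afterwards to update it. The sketch below is a generic illustration and not DEVFNN; `MajorityClassLearner`, `prequential_accuracy`, and the toy stream are hypothetical stand-ins chosen for brevity.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy online learner (an assumption for this sketch): predicts the most frequent label seen so far."""

    def __init__(self):
        self.class_counts = Counter()

    def predict(self, features):
        # Before any training there is nothing to predict from.
        if not self.class_counts:
            return None
        return self.class_counts.most_common(1)[0][0]

    def learn(self, features, label):
        self.class_counts[label] += 1


def prequential_accuracy(stream, learner):
    """Test-then-train: score each sample before the learner is updated with it."""
    correct = 0
    seen = 0
    for features, label in stream:
        if learner.predict(features) == label:  # test first
            correct += 1
        learner.learn(features, label)          # then train
        seen += 1
    return correct / seen if seen else 0.0


# Toy stream whose label distribution shifts from "a" to "b" partway through.
stream = [((0,), "a"), ((1,), "a"), ((2,), "a"), ((3,), "b"), ((4,), "b")]
accuracy = prequential_accuracy(stream, MajorityClassLearner())
print(accuracy)
```

Because every sample is scored before it is learned, the accuracy reflects performance on unseen data throughout the stream, and the drop after the label shift shows why a learner that cannot adapt to drift is penalized under this protocol.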
Geographic range size and evolutionary age in birds
Together with patterns of speciation and extinction, post-speciation transformations in the range sizes of individual species determine the form of contemporary species-range-size distributions. However, the methodological problems associated with tracking the dynamics of a species' range size over evolutionary time have precluded direct study of such range-size transformations, although indirect evidence has led to several models being proposed describing the form that they might take. Here, we use independently derived molecular data to estimate ages of species in six monophyletic groups of birds, and examine the relationship between species age and global geographic range size. We present strong evidence that avian range sizes are not static over evolutionary time. In addition, it seems that, with the regular exception of certain taxa (for example island endemics and some threatened species), range-size transformations are non-random in birds. In general, range sizes appear to expand relatively rapidly post-speciation; subsequently, and perhaps more gradually, they then decline as species age. We discuss these results with reference to the various models of range-size dynamics that have been proposed.
The Education of Real Estate Salespeople and the Value of the Firm
In order to protect the public, most states require salespeople and brokers to meet specific licensing requirements, typically in the form of classroom instruction and/or successful completion of an examination. Frequently, however, real estate brokers require their sales staff to undertake education that exceeds these minimum requirements. In this study, we derive a theoretical model that shows how optimally timed, firm-provided education that exceeds legal minimums can increase staff productivity, reduce litigation risks, and perhaps raise and/or maximize the expected value of the firm.