10,770 research outputs found
Recommended from our members
Incremental learning of independent, overlapping, and graded concept descriptions with an instance-based process framework
Supervised learning algorithms make several simplifying assumptions concerning the characteristics of the concept descriptions to be learned. For example, concepts are often assumed to be (1) defined with respect to the same set of relevant attributes, (2) disjoint in instance space, and (3) have uniform instance distributions. While these assumptions constrain the learning task, they unfortunately limit an algorithm's applicability. We believe that supervised learning algorithms should learn attribute relevancies independently for each concept, allow instances to be members of any subset of concepts, and represent graded concept descriptions. This paper introduces a process framework for instance-based learning algorithms that exploit only specific instance and performance feedback information to guide their concept learning processes. We also introduce Bloom, a specific instantiation of this framework. Bloom is a supervised, incremental, instance-based learning algorithm that learns relative attribute relevancies independently for each concept, allows instances to be members of any subset of concepts, and represents graded concept memberships. We describe empirical evidence to support our claims that Bloom can learn independent, overlapping, and graded concept descriptions
A study on incremental mining of frequent patterns
Data generated from both the offline and online sources are incremental in nature. Changes in the underlying database occur due to the incremental data. Mining frequent patterns are costly in changing databases, since it requires scanning the database from the start. Thus, mining of growing databases has been a great concern. To mine the growing databases, a new Data Mining technique called Incremental Mining has emerged. The Incremental Mining uses previous mining result to get the desired knowledge by reducing mining costs in terms of time and space. This state of the art paper focuses on Incremental Mining approaches and identifies suitable approaches which are the need of real world problem.Keywords: Data Mining, Frequent Pattern, Incremental Mining, Frequent Pattern Minung, High Utility Mining, Constraint Mining
A review of associative classification mining
Associative classification mining is a promising approach in data mining that utilizes the
association rule discovery techniques to construct classification systems, also known as
associative classifiers. In the last few years, a number of associative classification algorithms
have been proposed, i.e. CPAR, CMAR, MCAR, MMAC and others. These algorithms
employ several different rule discovery, rule ranking, rule pruning, rule prediction and rule
evaluation methods. This paper focuses on surveying and comparing the state-of-the-art associative
classification techniques with regards to the above criteria. Finally, future directions in associative
classification, such as incremental learning and mining low-quality data sets, are also
highlighted in this paper
Query-driven learning for predictive analytics of data subspace cardinality
Fundamental to many predictive analytics tasks is the ability to estimate the cardinality (number of data items) of multi-dimensional data subspaces, defined by query selections over datasets. This is crucial for data analysts dealing with, e.g., interactive data subspace explorations, data subspace visualizations, and in query processing optimization. However, in many modern data systems, predictive analytics may be (i) too costly money-wise, e.g., in clouds, (ii) unreliable, e.g., in modern Big Data query engines, where accurate statistics are difficult to obtain/maintain, or (iii) infeasible, e.g., for privacy issues. We contribute a novel, query-driven, function estimation model of analyst-defined data subspace cardinality. The proposed estimation model is highly accurate in terms of prediction and accommodating the well-known selection queries: multi-dimensional range and distance-nearest neighbors (radius) queries. Our function estimation model: (i) quantizes the vectorial query space, by learning the analysts’ access patterns over a data space, (ii) associates query vectors with their corresponding cardinalities of the analyst-defined data subspaces, (iii) abstracts and employs query vectorial similarity to predict the cardinality of an unseen/unexplored data subspace, and (iv) identifies and adapts to possible changes of the query subspaces based on the theory of optimal stopping. The proposed model is decentralized, facilitating the scaling-out of such predictive analytics queries. The research significance of the model lies in that (i) it is an attractive solution when data-driven statistical techniques are undesirable or infeasible, (ii) it offers a scale-out, decentralized training solution, (iii) it is applicable to different selection query types, and (iv) it offers a performance that is superior to that of data-driven approaches
- …