
    A hybrid heuristic approach for attribute-oriented mining

    We present a hybrid heuristic algorithm, clusterAOI, that generates a more interesting generalised table than obtained via attribute-oriented induction (AOI). AOI tends to overgeneralise as it uses a fixed global static threshold to cluster and generalise attributes irrespective of their features, and does not evaluate intermediate interestingness. In contrast, clusterAOI uses attribute features to dynamically recalculate new attribute thresholds and applies heuristics to evaluate cluster quality and intermediate interestingness. Experimental results show improved interestingness, better output pattern distribution and expressiveness, and improved runtime. © 2013 Elsevier B.V

    Rough Sets Clustering and Markov model for Web Access Prediction

    Discovering user access patterns from web access logs is increasingly important for building adaptive web servers that respond to individual users’ behavior. The variety of user behaviors in accessing information also grows, which has a great impact on network utilization. In this paper, we present a rough set clustering approach to cluster web transactions from web access logs and use a Markov model for next access prediction. Using this approach, users can effectively mine web log records to discover and predict access patterns. We perform experiments using real web trace logs collected from www.dusit.ac.th servers. To improve its prediction ratio, the model includes a rough set scheme in which a similarity measure computes the similarity between two sequences using the upper approximation.

    A Survey of Parallel Data Mining

    With the fast, continuous increase in the number and size of databases, parallel data mining is a natural and cost-effective approach to tackle the problem of scalability in data mining. Recently there has been considerable research on parallel data mining. However, most projects focus on the parallelization of a single kind of data mining algorithm/paradigm. This paper surveys parallel data mining with a broader perspective. More precisely, we discuss the parallelization of data mining algorithms of four knowledge discovery paradigms, namely rule induction, instance-based learning, genetic algorithms and neural networks. Using the lessons learned from this discussion, we also derive a set of heuristic principles for designing efficient parallel data mining algorithms.