371 research outputs found

    Expressive generalized itemsets

    Get PDF
    Generalized itemset mining is a powerful tool to discover multiple-level correlations among the analyzed data. A taxonomy is used to aggregate data items into higher-level concepts and to discover frequent recurrences among data items at different granularity levels. However, since traditional high-level itemsets may also represent the knowledge covered by their lower-level frequent descendant itemsets, the expressiveness of high-level itemsets can be rather limited. To overcome this issue, this article proposes two novel itemset types, called Expressive Generalized Itemset (EGI) and Maximal Expressive Generalized Itemset (Max-EGI), in which the frequency of occurrence of a high-level itemset is evaluated only on the portion of data not yet covered by any of its frequent descendants. Specifically, EGI s represent, at a high level of abstraction, the knowledge associated with sets of infrequent itemsets, while Max-EGIs compactly represent all the infrequent descendants of a generalized itemset. Furthermore, we also propose an algorithm to discover Max-EGIs at the top of the traditionally mined itemsets. Experiments, performed on both real and synthetic datasets, demonstrate the effectiveness, efficiency, and scalability of the proposed approac

    Efficient Closed Pattern Mining in the Presence of Tough Block Constraints

    Get PDF
    In recent years, various constrained frequent pattern mining problem formulations and associated algorithms have been developed that enable the user to specify various itemsetbased constraints that better capture the underlying application requirements and characteristics. In this paper we introduce a new class of block constraints that determine the significance of an itemset pattern by considering the dense block that is formed by the pattern's items and its associated set of transactions. Block constraints provide a natural framework by which a number of important problems can be specified and make it possible to solve numerous problems on binary and real-valued datasets. However, developing computationally e#cient algorithms to find these block constraints poses a number of challenges as unlike the di#erent itemset-based constraints studied earlier, these block constraints are tough as they are neither anti-monotone, monotone, nor convertible. To overcome this problem, we introduce a new class of pruning methods that can be used to significantly reduce the overall search space and make it possible to develop computationally e#cient block constraint mining algorithms. We present an algorithm called CBMiner that takes advantage of these pruning methods to develop an algorithm for finding the closed itemsets that satisfy the block constraints. Our extensive performance study shows that CBMiner generates more concise result set and can be order(s) of magnitude faster than the traditional frequent closed itemset mining algorithms

    Discover, recycle and reuse frequent patterns in association rule mining

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Mining subjectively interesting patterns in rich data

    Get PDF

    Colossal Trajectory Mining: A unifying approach to mine behavioral mobility patterns

    Get PDF
    Spatio-temporal mobility patterns are at the core of strategic applications such as urban planning and monitoring. Depending on the strength of spatio-temporal constraints, different mobility patterns can be defined. While existing approaches work well in the extraction of groups of objects sharing fine-grained paths, the huge volume of large-scale data asks for coarse-grained solutions. In this paper, we introduce Colossal Trajectory Mining (CTM) to efficiently extract heterogeneous mobility patterns out of a multidimensional space that, along with space and time dimensions, can consider additional trajectory features (e.g., means of transport or activity) to characterize behavioral mobility patterns. The algorithm is natively designed in a distributed fashion, and the experimental evaluation shows its scalability with respect to the involved features and the cardinality of the trajectory dataset

    Exploring Data Hierarchies to Discover Knowledge in Different Domains

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen

    Enhancing web marketing by using ontology

    Get PDF
    The existence of the Web has a major impact on people\u27s life styles. Online shopping, online banking, email, instant messenger services, search engines and bulletin boards have gradually become parts of our daily life. All kinds of information can be found on the Web. Web marketing is one of the ways to make use of online information. By extracting demographic information and interest information from the Web, marketing knowledge can be augmented by applying data mining algorithms. Therefore, this knowledge which connects customers to products can be used for marketing purposes and for targeting existing and potential customers. The Web Marketing Project with Ontology Support has the purpose to find and improve marketing knowledge. In the Web Marketing Project, association rules about marketing knowledge have been derived by applying data mining algorithms to existing Web users\u27 data. An ontology was used as a knowledge backbone to enhance data mining for marketing. The Raising Method was developed by taking advantage of the ontology. Data are preprocessed by Raising before being fed into data mining algorithms. Raising improves the quality of the set of mined association rules by increasing the average support value. Also, new rules have been discovered after applying Raising. This dissertation thoroughly describes the development and analysis of the Raising method. Moreover, a new structure, called Intersection Ontology, is introduced to represent customer groups on demand. Only needed customer nodes are created. Such an ontology is used to simplify the marketing knowledge representation. Finally, some additional ontology usages are mentioned. By integrating an ontology into Web marketing, the marketing process support has been greatly improved
    corecore