
    Mining frequent itemsets: a perspective from operations research

    Many papers on frequent itemsets have been published, and several contests have been held in this field. Most of these papers focus on speed and introduce ad hoc algorithms and data structures. In this paper we place most of the algorithms in one framework, using classical Operations Research paradigms such as backtracking, depth-first and breadth-first search, and branch-and-bound. We also present experimental results in which the different algorithms are implemented under similar designs. Keywords: data mining; operations research; frequent itemsets
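The depth-first backtracking paradigm named in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's framework: the function name `frequent_itemsets`, the transaction-id-set layout, and the absolute `min_support` threshold are all assumptions made for the example.

```python
from typing import Dict, FrozenSet, List, Sequence, Set


def frequent_itemsets(
    transactions: Sequence[Set[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Depth-first backtracking over itemsets with support-based pruning.

    Hypothetical helper for illustration; min_support is an absolute count.
    """
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    # Only frequent single items can occur in a frequent itemset.
    items = sorted(i for i, tids in tidsets.items() if len(tids) >= min_support)
    result: Dict[FrozenSet[str], int] = {}

    def extend(prefix: List[str], prefix_tids: Set[int], start: int) -> None:
        for k in range(start, len(items)):
            tids = prefix_tids & tidsets[items[k]]
            if len(tids) < min_support:
                continue  # prune: no superset of this candidate is frequent
            prefix.append(items[k])
            result[frozenset(prefix)] = len(tids)
            extend(prefix, tids, k + 1)  # go one level deeper
            prefix.pop()  # backtrack

    extend([], set(range(len(transactions))), 0)
    return result
```

Calling `frequent_itemsets([{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}], 2)` reports the three single items and the three pairs, and prunes `{a, b, c}` because only one transaction contains it. A breadth-first variant would instead generate all candidates of size k before any of size k + 1.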

    Constraining the Search Space in Temporal Pattern Mining

    Agents in dynamic environments have to deal with complex situations that include various temporal interrelations of actions and events. Discovering frequent patterns in such scenes can help to create rules for predicting future activities or situations. We present the algorithm MiTemP, which learns frequent patterns based on a time-interval-based relational representation. Additionally, the problem has been transferred to a purely relational association rule mining task that can be handled by WARMR. The two approaches are compared in a number of experiments. The experiments show the advantage of avoiding the creation of impossible or redundant patterns with MiTemP. While fewer patterns have to be explored on average with MiTemP, more frequent patterns are found at an earlier refinement level.

    A new technique for intelligent web personal recommendation

    Personal recommendation systems are now very important in web applications because of the huge volume of information available on the World Wide Web and the need to save users' time and provide them with the information, knowledge, and items they want. The most popular recommendation systems are collaborative filtering systems, which suffer from problems such as cold start, privacy, user identification, and scalability. In this thesis, we suggest a new method that solves the cold start problem while taking the privacy issue into consideration. The method is shown to perform very well in comparison with alternative methods, while having better properties regarding user privacy. The cold start problem covers the situation in which a recommendation system does not have sufficient information about a new user's preferences (the user cold start problem), as well as the case of items newly added to the system (the item cold start problem); in both cases the system is unable to provide recommendations. Some systems use users' demographic data as a basis for generating recommendations in such cases (e.g. the Triadic Aspect method), but this solves only the user cold start problem and enforces user's privacy. Some systems use users' 'stereotypes' to generate recommendations, but stereotypes often do not reflect the actual preferences of individual users. Other systems use 'filterbots', injecting pseudo-users or bots into the system and treating them as existing users, but this leads to poor accuracy. We propose the active node method, which uses previous and recent users' browsing targets and browsing patterns to infer preferences and generate recommendations (node recommendations, in which a single suggestion is given, and batch recommendations, in which a set of possible target nodes is shown to the user at once).
We compare the active node method with three alternative methods (Triadic Aspect Method, Naïve Filterbots Method, and MediaScout Stereotype Method), using a dataset collected from online web news to generate recommendations with our method and with the three alternatives. We calculated the levels of novelty, coverage, and precision in these experiments and found that, compared to the alternative methods, our method achieves higher levels of novelty in batch recommendation and higher levels of coverage and precision in node recommendation. Further, we develop a variant of the active node method that incorporates semantic structure elements. A further experimental evaluation with real data and users showed that semantic node recommendation with the active node method achieved higher levels of novelty than non-semantic node recommendation, and semantic batch recommendation achieved higher levels of coverage and precision than non-semantic batch recommendation.

    Twitter data analysis by means of Strong Flipping Generalized Itemsets

    Twitter data has recently been used for a wide variety of advanced analyses. Analyzing Twitter data poses new challenges because the data distribution is intrinsically sparse, owing to the large number of messages posted every day using a wide vocabulary. To address this issue, generalized itemsets (sets of items at different abstraction levels) can be effectively mined and used to discover interesting multiple-level correlations among data supplied with taxonomies. Each generalized itemset is characterized by a correlation type (positive, negative, or null) according to the strength of the correlation among its items. This paper presents a novel data mining approach that supports different and interesting targeted analyses, such as topic trend analysis and context-aware service profiling, by analyzing Twitter posts. We aim at discovering contrasting situations by means of generalized itemsets. Specifically, we focus on comparing itemsets discovered at different abstraction levels, and we select large subsets of specific (descendant) itemsets that show correlation type changes with respect to their common ancestor. To this aim, a novel kind of pattern, namely the Strong Flipping Generalized Itemset (SFGI), is extracted from Twitter messages and contextual information supplied with taxonomy hierarchies. Each SFGI consists of a frequent generalized itemset X and the set of its descendants showing a correlation type change with respect to X. Experiments performed on both real and synthetic datasets demonstrate the effectiveness of the proposed approach in discovering interesting and hidden knowledge from Twitter data.