Mining frequent itemsets: a perspective from operations research
Many papers on frequent itemsets have been published, and several contests in this field have been held. In the majority of the papers the focus is on speed, and ad hoc algorithms and data structures have been introduced. In this paper we put most of the algorithms into one framework, using classical Operations Research paradigms such as backtracking, depth-first and breadth-first search, and branch-and-bound. Moreover, we present experimental results where the different algorithms are implemented under similar designs.
Keywords: data mining; operations research; frequent itemsets
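As a concrete illustration of the backtracking / depth-first paradigm this abstract refers to, the following sketch mines frequent itemsets from a toy transaction database. The data and support threshold are invented for illustration and are not taken from the paper:

```python
# Toy transaction database; items are single letters.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def dfs_mine(transactions, min_support):
    """Depth-first search over the itemset lattice with the classic
    anti-monotone pruning rule: if an itemset is infrequent, none of
    its supersets can be frequent, so that branch is cut off."""
    items = sorted(set().union(*transactions))
    frequent = {}

    def expand(prefix, start):
        for i in range(start, len(items)):
            candidate = prefix | {items[i]}
            s = support(candidate, transactions)
            if s >= min_support:
                frequent[frozenset(candidate)] = s
                expand(candidate, i + 1)  # backtrack after exploring
    expand(set(), 0)
    return frequent

result = dfs_mine(transactions, min_support=0.6)
for itemset, s in sorted(result.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), s)
```

A breadth-first (Apriori-style) variant would instead generate all candidates of size k before moving to size k + 1; the pruning rule is the same.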
Constraining the Search Space in Temporal Pattern Mining
Agents in dynamic environments have to deal with complex situations, including various temporal interrelations of actions and events. Discovering frequent patterns in such scenes can be useful for creating prediction rules, which can be used to predict future activities or situations. We present the algorithm MiTemP, which learns frequent patterns based on a time-interval-based relational representation. Additionally, the problem has also been transferred to a pure relational association rule mining task, which can be handled by WARMR. The two approaches are compared in a number of experiments. The experiments show the advantage of avoiding the creation of impossible or redundant patterns with MiTemP: while fewer patterns have to be explored on average with MiTemP, more frequent patterns are found at an earlier refinement level.
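The temporal interrelations mentioned above can be illustrated with a small classifier over time intervals. This is a generic sketch covering a subset of Allen's thirteen interval relations, which interval-based pattern representations commonly build on; it is not MiTemP's actual representation:

```python
def interval_relation(a, b):
    """Classify the temporal relation between intervals a = (a0, a1)
    and b = (b0, b1). Only a subset of Allen's thirteen relations is
    distinguished here; the rest fall through to "other"."""
    a0, a1 = a
    b0, b1 = b
    if a1 < b0:
        return "before"
    if a1 == b0:
        return "meets"
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 > b0 and a1 < b1:
        return "during"
    if a0 < b0 < a1 < b1:
        return "overlaps"
    return "other"

print(interval_relation((0, 2), (3, 5)))  # before
print(interval_relation((0, 3), (3, 5)))  # meets
print(interval_relation((1, 4), (0, 5)))  # during
```

A temporal pattern then constrains which of these relations must hold between pairs of events, which is what makes some candidate patterns impossible (e.g. mutually inconsistent relations) and therefore prunable.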
A new technique for intelligent web personal recommendation
Personal recommendation systems nowadays are very important in web applications
because of the available huge volume of information on the World Wide Web, and the
necessity to save users’ time, and provide appropriate desired information, knowledge,
items, etc. The most popular recommendation systems are collaborative filtering systems,
which suffer from certain problems such as cold-start, privacy, user identification, and
scalability. In this thesis, we suggest a new method to solve the cold start problem taking
into consideration the privacy issue. The method is shown to perform very well in
comparison with alternative methods, while having better properties regarding user privacy.
The cold start problem covers the situation when recommendation systems do not have
sufficient information about a new user's preferences (the user cold start problem), as well
as the case of newly added items to the system (the item cold start problem), in which case
the system will not be able to provide recommendations. Some systems use users'
demographic data as a basis for generating recommendations in such cases (e.g. the
Triadic Aspect method), but this solves only the user cold start problem and infringes on users'
privacy. Some systems use users' 'stereotypes' to generate recommendations, but
stereotypes often do not reflect the actual preferences of individual users. Other
systems use 'filterbots', injecting pseudo-users or bots into the system and treating
them as real users, but this leads to poor accuracy.
We propose the active node method, which uses previous and recent users' browsing targets
and browsing patterns to infer preferences and generate recommendations (node
recommendations, in which a single suggestion is given, and batch recommendations, in
which a set of possible target nodes is shown to the user at once). We compared the active
node method with three alternative methods (the Triadic Aspect Method, the Naïve Filterbots
Method, and the MediaScout Stereotype Method), using a dataset collected from online
web news to generate recommendations based on our method and on the three
alternative methods. We calculated the levels of novelty, coverage, and precision in these
experiments, and found that our method achieves higher levels of novelty in batch
recommendation and higher levels of coverage and precision in node
recommendation compared to these alternative methods. Further, we developed a variant of
the active node method that incorporates semantic structure elements. A further
experimental evaluation with real data and users showed that semantic node
recommendation with the active node method achieved higher levels of novelty than non-semantic
node recommendation, and semantic batch recommendation achieved higher levels
of coverage and precision than non-semantic batch recommendation.
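The abstract does not give implementation details, but the core idea of inferring recommendations from browsing patterns, with a single node recommendation versus a batch of candidates, can be sketched with a toy transition-count model. All session data and node names below are hypothetical, and this is not the thesis's actual active node algorithm:

```python
from collections import Counter, defaultdict

# Hypothetical browsing sessions: each list is one user's sequence of
# visited nodes (e.g. news pages). Names are illustrative only.
sessions = [
    ["home", "politics", "economy"],
    ["home", "sports", "politics"],
    ["home", "politics", "sports"],
    ["politics", "economy"],
]

# Count observed transitions current_node -> next_node.
transitions = defaultdict(Counter)
for session in sessions:
    for cur, nxt in zip(session, session[1:]):
        transitions[cur][nxt] += 1

def node_recommendation(current):
    """Node recommendation: a single suggestion, the most frequent
    successor of the current node (None if the node is unseen)."""
    succ = transitions[current]
    return max(succ, key=succ.get) if succ else None

def batch_recommendation(current, k=3):
    """Batch recommendation: the k most frequent successors, shown
    to the user at once."""
    return [n for n, _ in transitions[current].most_common(k)]

print(node_recommendation("home"))       # 'politics'
print(batch_recommendation("politics"))  # ['economy', 'sports']
```

Novelty, coverage, and precision as measured in the thesis would then be computed by comparing such suggestions against held-out browsing data.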
Twitter data analysis by means of Strong Flipping Generalized Itemsets
Twitter data have recently been used to perform a large variety of advanced analyses. Analysis of Twitter data imposes new challenges because the data distribution is intrinsically sparse, due to the large number of messages posted every day using a wide vocabulary. To address this issue, generalized itemsets (sets of items at different abstraction levels) can be effectively mined and used to discover interesting multiple-level correlations among data supplied with taxonomies. Each generalized itemset is characterized by a correlation type (positive, negative, or null) according to the strength of the correlation among its items. This paper presents a novel data mining approach to supporting different and interesting targeted analyses, such as topic trend analysis and context-aware service profiling, by analyzing Twitter posts. We aim at discovering contrasting situations by means of generalized itemsets. Specifically, we focus on comparing itemsets discovered at different abstraction levels, and we select large subsets of specific (descendant) itemsets that show correlation type changes with respect to their common ancestor. To this aim, a novel kind of pattern, namely the Strong Flipping Generalized Itemset (SFGI), is extracted from Twitter messages and contextual information supplied with taxonomy hierarchies. Each SFGI consists of a frequent generalized itemset X and the set of its descendants showing a correlation type change with respect to X. Experiments performed on both real and synthetic datasets demonstrate the effectiveness of the proposed approach in discovering interesting and hidden knowledge from Twitter data.
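The notion of a correlation type change between an ancestor itemset and a descendant can be illustrated with a small sketch. Lift is used here as one common correlation measure, and the taxonomy ('drink' generalizing 'coffee' and 'tea') and transactions are invented; this is not the paper's actual SFGI extraction procedure:

```python
def lift(itemset, transactions):
    """Lift of a 2-itemset (x, y): P(x, y) / (P(x) * P(y))."""
    n = len(transactions)
    x, y = itemset
    px = sum(x in t for t in transactions) / n
    py = sum(y in t for t in transactions) / n
    pxy = sum(x in t and y in t for t in transactions) / n
    return pxy / (px * py) if px and py else 0.0

def correlation_type(l, eps=0.1):
    """Positive / negative / null depending on how far lift is from 1."""
    if l > 1 + eps:
        return "positive"
    if l < 1 - eps:
        return "negative"
    return "null"

# Illustrative transactions tagged with a two-level taxonomy:
# the generalized item 'drink' covers 'coffee' and 'tea'.
transactions = [
    {"coffee", "drink", "cake"},
    {"coffee", "drink", "cake"},
    {"tea", "drink"},
    {"tea", "drink"},
    {"cake"},
]

ancestor = correlation_type(lift(("drink", "cake"), transactions))
descendant = correlation_type(lift(("coffee", "cake"), transactions))
flipped = ancestor != descendant
print(ancestor, descendant, flipped)  # negative positive True
```

Here the generalized pair (drink, cake) is negatively correlated while its specialization (coffee, cake) is positively correlated, so the descendant "flips" the ancestor's correlation type, which is the kind of contrast an SFGI is meant to capture.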