How to find frequent patterns?
An improved version of DF, the depth-first implementation of Apriori, is presented. Given a database of (e.g., supermarket) transactions, the DF algorithm builds a so-called trie that contains all frequent itemsets, i.e., all itemsets that are contained in at least `minsup' transactions with `minsup' a given threshold value. In the trie, there is a one-to-one correspondence between the paths and the frequent itemsets. The new version, called DF+, differs from DF in that its data structure representing the database is borrowed from the FP-growth algorithm. So it combines the compact FP-growth data structure with the efficient trie-building method in DF.
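The correspondence between trie paths and frequent itemsets can be illustrated with a small sketch. It is not the DF or DF+ procedure from the abstract: it is a plain depth-first enumeration over transaction-id sets, and the names `build_trie` and `paths`, as well as the nested-dict layout, are assumptions made only for this illustration.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Set, Tuple


def build_trie(transactions: Iterable[Set[str]], minsup: int) -> Dict[str, dict]:
    """Return a nested-dict trie in which every path from the root is one frequent itemset."""
    # Map each item to the ids of the transactions that contain it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    # Only items that are frequent on their own can appear in a frequent itemset.
    frequent_items = sorted(i for i, tids in tidsets.items() if len(tids) >= minsup)

    def extend(candidates: List[str], prefix_tids: Optional[Set[int]]) -> Dict[str, dict]:
        node: Dict[str, dict] = {}
        for idx, item in enumerate(candidates):
            # Transactions containing the current prefix plus this item.
            tids = tidsets[item] if prefix_tids is None else prefix_tids & tidsets[item]
            if len(tids) >= minsup:
                # Go deeper first, using only items later in the fixed order,
                # so each itemset is generated exactly once.
                node[item] = {
                    "support": len(tids),
                    "children": extend(candidates[idx + 1:], tids),
                }
        return node

    return extend(frequent_items, None)


def paths(trie: Dict[str, dict], prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], int]]:
    """Yield (itemset, support) for every path in the trie."""
    for item, node in trie.items():
        itemset = prefix + (item,)
        yield itemset, node["support"]
        yield from paths(node["children"], itemset)


if __name__ == "__main__":
    baskets = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]
    for itemset, support in paths(build_trie(baskets, minsup=2)):
        print(itemset, support)
```

Because an itemset can only be frequent if all of its subsets are, an infrequent node is never extended, so the trie contains frequent itemsets only.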
An Efficient Genetic Algorithm for Discovering Diverse-Frequent Patterns
Exhaustive search on large datasets is infeasible for several reasons.
Recently developed techniques have made pattern set mining feasible through a
general solver that supports heuristic search, but they have long execution
times and are limited to small datasets. In this paper, we investigate an
approach that uses a genetic algorithm to mine a diverse set of frequent
patterns. We propose a fast heuristic search algorithm that outperforms
state-of-the-art methods on a standard set of benchmarks and is capable of
producing satisfactory results within a short period of time. Our proposed
algorithm uses a relative encoding scheme for the patterns and an effective
twin removal technique to ensure diversity throughout the search.
Comment: 2015 International Conference on Electrical Engineering and
Information Communication Technology (ICEEICT)
Identifying Patient Groups based on Frequent Patterns of Patient Samples
Grouping patients meaningfully can give insights into the different types of
patients, their needs, and their priorities. Finding meaningful groups is,
however, very challenging, as background knowledge is often required to
determine what a useful grouping is. In this paper we propose an approach that
is able to find groups of patients based on a small sample of positive examples
given by a domain expert. The approach therefore requires very limited effort
from the domain expert. It groups patients based on the activities and
diagnostic/billing codes within their health pathways. To define such a
grouping efficiently from the sample of patients, frequent patterns of
activities are discovered and used to measure the similarity between the care
pathways of other patients and those of the patients in the sample group. This
approach results in an insightful definition of the group. The proposed
approach is evaluated using several datasets obtained from a large university
medical center. The evaluation shows F1-scores of around 0.7 for grouping
kidney injury and around 0.6 for diabetes.
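One way to picture pattern-based similarity is sketched below. The abstract does not specify the measure, so everything here is an assumption made for illustration: pathways are treated as unordered sets of activity labels, patterns are limited to one or two activities, and the score is simply the fraction of the sample's frequent patterns found in a patient's pathway. The function names and activity labels are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set


def frequent_patterns(sample: Iterable[List[str]], minsup: int, max_size: int = 2) -> Set[FrozenSet[str]]:
    """Activity sets (up to max_size activities) occurring in at least minsup sample pathways."""
    counts: Counter = Counter()
    for pathway in sample:
        activities = sorted(set(pathway))  # ignore order and repetition (an assumption)
        for size in range(1, max_size + 1):
            counts.update(combinations(activities, size))
    return {frozenset(p) for p, c in counts.items() if c >= minsup}


def similarity(pathway: List[str], patterns: Set[FrozenSet[str]]) -> float:
    """Fraction of the sample's frequent patterns that are contained in this pathway."""
    if not patterns:
        return 0.0
    activities = set(pathway)
    return sum(1 for p in patterns if p <= activities) / len(patterns)


if __name__ == "__main__":
    # Hypothetical activity labels for a small expert-provided sample.
    sample = [
        ["lab", "dialysis", "consult"],
        ["lab", "dialysis"],
        ["lab", "consult", "dialysis"],
    ]
    patterns = frequent_patterns(sample, minsup=2)
    print(similarity(["lab", "dialysis", "xray"], patterns))
```

Patients whose score exceeds a chosen threshold would then be assigned to the group; the threshold is a further free parameter in this sketch.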
Classification and Target Group Selection Based Upon Frequent Patterns
In this technical report, two new algorithms based upon frequent patterns are proposed. One algorithm is a classification method. The other is an algorithm for target group selection. In both algorithms, the collection of frequent patterns in the training set is constructed first. Choosing an appropriate data structure allows us to keep the full collection of frequent patterns in memory. The classification method uses this collection directly. Target group selection is a known problem in direct marketing. Our selection algorithm is also based upon the collection of frequent patterns.
Keywords: classification; association rules; frequent item sets; target group selection