17,702 research outputs found
An efficient parallel method for mining frequent closed sequential patterns
Mining frequent closed sequential pattern (FCSPs) has attracted a great deal of research attention, because it is an important task in sequences mining. In recently, many studies have focused on mining frequent closed sequential patterns because, such patterns have proved to be more efficient and compact than frequent sequential patterns. Information can be fully extracted from frequent closed sequential patterns. In this paper, we propose an efficient parallel approach called parallel dynamic bit vector frequent closed sequential patterns (pDBV-FCSP) using multi-core processor architecture for mining FCSPs from large databases. The pDBV-FCSP divides the search space to reduce the required storage space and performs closure checking of prefix sequences early to reduce execution time for mining frequent closed sequential patterns. This approach overcomes the problems of parallel mining such as overhead of communication, synchronization, and data replication. It also solves the load balance issues of the workload between the processors with a dynamic mechanism that re-distributes the work, when some processes are out of work to minimize the idle CPU time.Web of Science5174021739
Algorithms for Extracting Frequent Episodes in the Process of Temporal Data Mining
An important aspect in the data mining process is the discovery of patterns having a great influence on the studied problem. The purpose of this paper is to study the frequent episodes data mining through the use of parallel pattern discovery algorithms. Parallel pattern discovery algorithms offer better performance and scalability, so they are of a great interest for the data mining research community. In the following, there will be highlighted some parallel and distributed frequent pattern mining algorithms on various platforms and it will also be presented a comparative study of their main features. The study takes into account the new possibilities that arise along with the emerging novel Compute Unified Device Architecture from the latest generation of graphics processing units. Based on their high performance, low cost and the increasing number of features offered, GPU processors are viable solutions for an optimal implementation of frequent pattern mining algorithmsFrequent Pattern Mining, Parallel Computing, Dynamic Load Balancing, Temporal Data Mining, CUDA, GPU, Fermi, Thread
Efficient chain structure for high-utility sequential pattern mining
High-utility sequential pattern mining (HUSPM) is an emerging topic in data mining, which considers both utility and sequence factors to derive the set of high-utility sequential patterns (HUSPs) from the quantitative databases. Several works have been presented to reduce the computational cost by variants of pruning strategies. In this paper, we present an efficient sequence-utility (SU)-chain structure, which can be used to store more relevant information to improve mining performance. Based on the SU-Chain structure, the existing pruning strategies can also be utilized here to early prune the unpromising candidates and obtain the satisfied HUSPs. Experiments are then compared with the state-of-the-art HUSPM algorithms and the results showed that the SU-Chain-based model can efficiently improve the efficiency performance than the existing HUSPM algorithms in terms of runtime and number of the determined candidates
Recursion Aware Modeling and Discovery For Hierarchical Software Event Log Analysis (Extended)
This extended paper presents 1) a novel hierarchy and recursion extension to
the process tree model; and 2) the first, recursion aware process model
discovery technique that leverages hierarchical information in event logs,
typically available for software systems. This technique allows us to analyze
the operational processes of software systems under real-life conditions at
multiple levels of granularity. The work can be positioned in-between reverse
engineering and process mining. An implementation of the proposed approach is
available as a ProM plugin. Experimental results based on real-life (software)
event logs demonstrate the feasibility and usefulness of the approach and show
the huge potential to speed up discovery by exploiting the available hierarchy.Comment: Extended version (14 pages total) of the paper Recursion Aware
Modeling and Discovery For Hierarchical Software Event Log Analysis. This
Technical Report version includes the guarantee proofs for the proposed
discovery algorithm
Temporalized logics and automata for time granularity
Suitable extensions of the monadic second-order theory of k successors have
been proposed in the literature to capture the notion of time granularity. In
this paper, we provide the monadic second-order theories of downward unbounded
layered structures, which are infinitely refinable structures consisting of a
coarsest domain and an infinite number of finer and finer domains, and of
upward unbounded layered structures, which consist of a finest domain and an
infinite number of coarser and coarser domains, with expressively complete and
elementarily decidable temporal logic counterparts.
We obtain such a result in two steps. First, we define a new class of
combined automata, called temporalized automata, which can be proved to be the
automata-theoretic counterpart of temporalized logics, and show that relevant
properties, such as closure under Boolean operations, decidability, and
expressive equivalence with respect to temporal logics, transfer from component
automata to temporalized ones. Then, we exploit the correspondence between
temporalized logics and automata to reduce the task of finding the temporal
logic counterparts of the given theories of time granularity to the easier one
of finding temporalized automata counterparts of them.Comment: Journal: Theory and Practice of Logic Programming Journal Acronym:
TPLP Category: Paper for Special Issue (Verification and Computational Logic)
Submitted: 18 March 2002, revised: 14 Januari 2003, accepted: 5 September
200
- …