A network algorithm to discover sequential patterns

Author: Cavique Luís
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

This paper addresses the discovery of sequential patterns in very large databases. Most existing algorithms use lattice structures in the search space, which is computationally very demanding, and their output contains a large number of rules. The aim of this work is to create a fast algorithm for the discovery of sequential patterns with low time complexity. We also define tools that simplify the work of the end user by offering a new visualization of the sequences, bypassing the analysis of thousands of association rules.

Repositório Aberto da Universidade Aberta

iWAP: A Single Pass Approach for Web Access Sequential Pattern Mining

Author: Byeong-Soo Jeong
Chowdhury Farhan Ahmed
Nafisah Islam
Tarannum Shaila Zaman
Publication venue: GSTF Journal on Computing (JoC)
Publication date: 28/08/2014

With the explosive growth of data available on the World Wide Web, web usage mining has become essential for improving website design, analyzing system performance and network communications, understanding user reaction and motivation, and building adaptive websites. Web Access Pattern mining (WAP-mine) is a sequential pattern mining technique for discovering frequent web log access sequences. It first stores the frequent part of the original web access sequence database in a prefix tree called the WAP-tree and then mines the frequent sequences from that tree according to a user-given minimum support threshold. Therefore, this method is not applicable to incremental and interactive mining. In this paper, we propose an algorithm, improved Web Access Pattern (iWAP) mining, to find web access patterns from web logs more efficiently than the WAP-mine algorithm. Our proposed approach can discover all web access sequential patterns with a single pass over the web log database. Moreover, it is applicable to interactive and incremental mining, which the earlier algorithm does not support. The experimental and performance studies show that the proposed algorithm is in general an order of magnitude faster than the existing WAP-mine algorithm.
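
As an illustration of the prefix-tree idea described in this abstract, here is a minimal Python sketch of the WAP-mine style construction: one pass to find frequent events, a second pass to insert the filtered sequences into a counted prefix tree. It is not the iWAP algorithm itself, whose details the abstract does not give. The names `Node`, `frequent_events`, `build_prefix_tree`, and the sample `logs` are hypothetical, and support is assumed to be an absolute count of sequences.

```python
from collections import Counter


class Node:
    """One node of a counted prefix tree (hypothetical helper class)."""

    def __init__(self, label):
        self.label = label
        self.count = 0
        self.children = {}


def frequent_events(sequences, min_support):
    """Return events that occur in at least min_support sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(set(seq))  # count each event once per sequence
    return {event for event, c in counts.items() if c >= min_support}


def build_prefix_tree(sequences, min_support):
    """Insert the frequent part of every sequence into a prefix tree."""
    keep = frequent_events(sequences, min_support)
    root = Node(None)
    for seq in sequences:
        node = root
        for event in seq:
            if event not in keep:
                continue  # drop infrequent events before insertion
            if event not in node.children:
                node.children[event] = Node(event)
            node = node.children[event]
            node.count += 1
    return root


# Illustrative web access sequences (assumed data, one list per session).
logs = [
    ["a", "b", "d", "a", "c"],
    ["e", "a", "e", "b", "c", "a", "c"],
    ["b", "a", "b", "f", "a", "e", "c"],
    ["a", "f", "b", "a", "c", "f", "c"],
]
tree = build_prefix_tree(logs, min_support=3)
```

Because the tree is filtered by the support threshold before it is built, changing the threshold requires rebuilding it, which is the limitation for interactive and incremental mining that the abstract points out.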

GSTF Digital Library (GSTF-DL): Open Journal Systems (Global Science and Technology Forum)

Dynamic load balancing for the distributed mining of molecular structures

Author: Berthold M.R.
Di Fatta Giuseppe
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention in recent years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, for which no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening data set, where we were able to show close-to-linear speedup in a network of workstations. The proposed approach also allows for dynamic resource aggregation in a non-dedicated computational environment. These features make it suitable for large-scale, multi-domain, heterogeneous environments, such as computational grids.
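
The receiver-initiated idea mentioned in this abstract (an idle worker asks a busy peer for work, rather than busy workers pushing work out) can be sketched as a single-process simulation in Python. This is only a toy under stated assumptions: the function name `simulate`, the "ask the most loaded peer and take half" policy, and the round-robin scheduling are all hypothetical and are not taken from the paper, which uses a peer-to-peer framework across workstations.

```python
from collections import deque


def simulate(initial_tasks):
    """Simulate receiver-initiated balancing over per-worker task queues.

    initial_tasks: one list of tasks per worker.
    Returns (tasks completed per worker, number of work transfers).
    """
    queues = [deque(tasks) for tasks in initial_tasks]
    done = [0] * len(queues)
    transfers = 0
    while any(queues):
        for i, queue in enumerate(queues):
            if not queue:
                # The idle worker (the receiver) initiates the request.
                donor = max(range(len(queues)), key=lambda j: len(queues[j]))
                if len(queues[donor]) > 1:
                    # Assumed policy: the donor hands over half of its tasks.
                    for _ in range(len(queues[donor]) // 2):
                        queue.append(queues[donor].pop())
                    transfers += 1
            if queue:
                queue.popleft()  # process one task this round
                done[i] += 1
    return done, transfers


# All work starts on worker 0; workers 1 and 2 start idle.
done, transfers = simulate([list(range(8)), [], []])
```

A receiver-initiated scheme fits an irregular search tree because no worker needs to predict its own workload in advance; requests are triggered only when a worker actually runs out of tasks.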

KOPS - The Institutional Repository of the University of Konstanz

Central Archive at the University of Reading

Crossref

Mining Target-Oriented Sequential Patterns with Time-Intervals

Author: Chueh Hao-En
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 05/09/2010

A target-oriented sequential pattern is a sequential pattern with a concerned itemset at the end of the pattern. A time-interval sequential pattern is a sequential pattern with time intervals between every pair of successive itemsets. In this paper we present an algorithm to discover target-oriented sequential patterns with time intervals. To this end, the original sequences are reversed so that the last itemsets are placed at the front of the sequences. The contrasts between the reversed sequences and the concerned itemset are then used to exclude irrelevant sequences. Clustering analysis is used with a typical sequential pattern mining algorithm to extract the sequential patterns with time intervals between successive itemsets. Finally, the discovered time-interval sequential patterns are reversed again to the original order for searching the target patterns. Comment: 11 pages, 9 tables.
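
The reverse-and-filter preprocessing described in this abstract can be sketched in Python as follows. This is a minimal sketch, not the paper's algorithm: it covers only reversal, exclusion of sequences that do not end in the target itemset, and computation of the time intervals, and it omits the clustering and mining steps. The representation of a sequence as a list of `(timestamp, itemset)` pairs and the names `reverse_and_filter`, `time_intervals`, and `restore_order` are assumptions made for illustration.

```python
def reverse_and_filter(sequences, target):
    """Reverse each sequence and keep those whose (originally last)
    itemset contains every item of the target itemset."""
    target = frozenset(target)
    kept = []
    for seq in sequences:
        rev = list(reversed(seq))
        if rev and target <= frozenset(rev[0][1]):
            kept.append(rev)
    return kept


def time_intervals(seq):
    """Absolute time gaps between successive itemsets of a sequence."""
    return [abs(b[0] - a[0]) for a, b in zip(seq, seq[1:])]


def restore_order(rev_seq):
    """Reverse a sequence back to its original chronological order."""
    return list(reversed(rev_seq))


# Illustrative data: each sequence is a list of (timestamp, itemset) pairs.
sequences = [
    [(1, {"a"}), (4, {"b"}), (9, {"x"})],
    [(2, {"a"}), (3, {"c"})],
    [(0, {"b"}), (5, {"x", "y"})],
]
kept = reverse_and_filter(sequences, target={"x"})
```

Here the second sequence is excluded because its last itemset does not contain the target item `x`; the remaining reversed sequences start with the target, so a prefix-oriented miner could then be applied to them.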

arXiv.org e-Print Archive

Crossref