An Algorithm for Generating Non-Redundant Sequential Rules for Medical Time Series Data

Author: K. Pazhanikumar, Dr. S. Arumugaperumal
Publication venue: Auricle Global Society of Education and Research
Publication date: 30/11/2017

In this paper, an algorithm for generating non-redundant sequential rules for medical time series data is designed. This study continues my previous study, "An Algorithm for Mining Closed Weighted Sequential Patterns with Flexing Time Interval for Medical Time Series Data" [25]. In that work, the weight of each sequence was calculated from the time intervals between its itemsets. Candidate sequences were then generated with flexible time intervals, and frequent sequential patterns were computed using the proposed support measure. Next, the frequent sequential patterns underwent a closure check that retained only the closed sequential patterns with flexible time intervals. Finally, the methodology was shown to produce the necessary sequential patterns. It constructed closed sequential patterns that were 23.2% fewer than the sequential patterns. In this study, sequential rules are generated from the closed sequential patterns based on the confidence value of each rule. The resulting closed sequential rules are then subjected to a non-redundancy check, which produces the final set of non-redundant weighted closed sequential rules with flexible time intervals. This study produces non-redundant sequential rules that are 172.37% fewer than the sequential rules.
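
The rule-generation step in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes sequences of single items (no itemsets, weights, or time intervals), takes confidence as the support of the whole pattern divided by the support of the antecedent, and uses one common redundancy criterion (a rule is dropped when another rule with the same support and confidence has an antecedent contained in its antecedent and a consequent containing its consequent). All function names are hypothetical.

```python
from fractions import Fraction

def is_subsequence(pattern, sequence):
    """Return True if pattern's items appear in sequence in the same order."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)

def support(pattern, database):
    """Count the sequences in database that contain pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)

def generate_rules(patterns, database, min_conf):
    """Split each pattern into antecedent -> consequent; keep confident rules."""
    rules = []
    for pattern in patterns:
        sup_rule = support(pattern, database)
        for cut in range(1, len(pattern)):
            antecedent, consequent = pattern[:cut], pattern[cut:]
            sup_antecedent = support(antecedent, database)
            if sup_antecedent == 0:
                continue  # pattern never occurs; no rule can be formed
            conf = Fraction(sup_rule, sup_antecedent)
            if conf >= min_conf:
                rules.append((antecedent, consequent, sup_rule, conf))
    return rules

def remove_redundant(rules):
    """Drop rules covered by another rule (assumed redundancy criterion)."""
    kept = []
    for i, (a1, c1, s1, f1) in enumerate(rules):
        redundant = any(
            j != i and s1 == s2 and f1 == f2
            and (a1, c1) != (a2, c2)
            and is_subsequence(a2, a1)   # other antecedent is more general
            and is_subsequence(c1, c2)   # other consequent says at least as much
            for j, (a2, c2, s2, f2) in enumerate(rules)
        )
        if not redundant:
            kept.append((a1, c1, s1, f1))
    return kept
```

For instance, with the toy database `[("a","b","c"), ("a","b","c"), ("a","d")]`, the rule `a -> b` has the same support and confidence as `a -> b, c`, so the filter keeps only the latter.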

International Journal on Future Revolution in Computer Science & Communication Engineering

MINING TOP-K FREQUENT SEQUENTIAL PATTERN IN ITEM INTERVAL EXTENDED SEQUENCE DATABASE

Author: Nguyen Thang Truong
Tran Anh The
Tran Duong Huy
Vu Thi Duc
Publication venue: 'Publishing House for Science and Technology, Vietnam Academy of Science and Technology'
Publication date: 23/11/2018

Abstract. Frequent sequential pattern mining in item interval extended sequence databases (iSDB) has been an interesting task in recent years. Unlike classic frequent sequential pattern mining, pattern mining in iSDB also considers the item interval between successive items; thus, it may extract more meaningful sequential patterns in real life. Most previous algorithms for frequent sequential pattern mining in iSDB need a minimum support threshold (minsup) to perform the mining. However, it is not easy for users to provide an appropriate threshold in practice. A minsup value that is too high leads to missing valuable patterns, while one that is too low may generate too many useless patterns. To address this problem, we propose an algorithm, TopKWFP: top-k weighted frequent sequential pattern mining in item interval extended sequence databases. Our algorithm does not require a fixed minsup value; the minsup value is raised dynamically during the mining process.
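
The idea of a threshold that rises during mining can be sketched as follows. This is not TopKWFP itself: it assumes plain sequences of single items, ignores weights and item intervals, and uses a simple prefix-growth search. The function name `top_k_patterns` is hypothetical. A min-heap holds the best k patterns found so far; once it is full, its smallest support becomes the internal minsup, and branches below that value are pruned.

```python
import heapq

def top_k_patterns(database, k):
    """Return up to k (support, pattern) pairs with the highest support."""
    heap = []    # min-heap of (support, pattern), at most k entries
    minsup = 1   # internal threshold, raised as better patterns are found

    def grow(prefix, projected):
        nonlocal minsup
        # Count each item once per sequence in the projected suffixes.
        counts = {}
        for sid, pos in projected:
            for item in set(database[sid][pos:]):
                counts[item] = counts.get(item, 0) + 1
        # Visit the most frequent extensions first so minsup rises quickly.
        for item, sup in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
            if sup < minsup:
                break  # remaining extensions are even less frequent
            pattern = prefix + (item,)
            heapq.heappush(heap, (sup, pattern))
            if len(heap) > k:
                heapq.heappop(heap)
            if len(heap) == k:
                minsup = heap[0][0]
            # Project each sequence past the first occurrence of item.
            next_projected = []
            for sid, pos in projected:
                seq = database[sid]
                if item in seq[pos:]:
                    next_projected.append((sid, seq.index(item, pos) + 1))
            grow(pattern, next_projected)

    grow((), [(sid, 0) for sid in range(len(database))])
    return sorted(heap, key=lambda entry: (-entry[0], entry[1]))
```

Because support can only shrink as a pattern is extended, pruning an extension whose support is below the current minsup cannot discard a top-k pattern. Ties at the k-th position are broken arbitrarily in this sketch.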

Vietnam Academy of Science and Technology: Journals Online

Mining Traversal Patterns from Weighted Traversals and Graph

Author: 이성대
Publication venue: 한국해양대학교
Publication date: 01/08/2007

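Many real-world problems can be modeled as a graph and transactions that traverse it. For example, the link structure of web pages can be represented as a graph, and a user's path through those pages can be modeled as a transaction traversing that graph. Finding important and valuable patterns in such traversal transactions is a meaningful task. Previous research on finding these patterns proposed algorithms that simply find frequent patterns, without considering weights on the traversals or the graph. The limitation of such algorithms is that they have difficulty discovering more reliable and accurate patterns. This thesis proposes two methods that mine patterns while taking into account weights attached to traversals or to the vertices of the graph. The first method mines frequent traversal patterns when the traversal information carries weights. Weights attached to traversals can be, for example, the travel time between two cities or the time taken to move from one page to another when visiting a website. To mine more accurate traversal patterns, this thesis uses confidence intervals from statistics: a confidence interval is computed from the weights attached to each edge across all traversals, and only traversals that fall within the confidence interval are accepted as valid. Applying this method makes it possible to mine more reliable traversal patterns. The thesis also presents a method for determining priority among patterns using the mined patterns and the graph information, and an algorithm for improving performance. The second method mines weighted frequent traversal patterns when weights are attached to the vertices of the graph. Vertex weights can be, for example, the amount of information in, or the importance of, each document in a website. In this problem, deciding whether a traversal pattern is frequent requires considering the weights of the visited vertices together with the pattern's frequency of occurrence. To this end, the thesis proposes an algorithm that uses the vertex weights to keep, rather than remove at each mining step, candidate patterns that may still become frequent later. An algorithm that reduces the number of candidate patterns is also proposed to improve performance. The two proposed methods were compared and analyzed through various experiments in terms of execution time and the number of patterns generated. This thesis has proposed new methods for mining frequent traversal patterns when traversals carry weights and when the vertices of the graph carry weights. Applying the proposed methods to areas such as web mining should enable efficient restructuring of websites, faster access to web documents, and construction of personalized web documents for individual users.

Abstract
Chapter 1 Introduction: 1.1 Overview; 1.2 Motivations; 1.3 Approach; 1.4 Organization of Thesis
Chapter 2 Related Works: 2.1 Itemset Mining; 2.2 Weighted Itemset Mining; 2.3 Traversal Mining; 2.4 Graph Traversal Mining
Chapter 3 Mining Patterns from Weighted Traversals on Unweighted Graph: 3.1 Definitions and Problem Statements; 3.2 Mining Frequent Patterns (3.2.1 Augmentation of Base Graph; 3.2.2 In-Mining Algorithm; 3.2.3 Pre-Mining Algorithm; 3.2.4 Priority of Patterns); 3.3 Experimental Results
Chapter 4 Mining Patterns from Unweighted Traversals on Weighted Graph: 4.1 Definitions and Problem Statements; 4.2 Mining Weighted Frequent Patterns (4.2.1 Pruning by Support Bounds; 4.2.2 Candidate Generation; 4.2.3 Mining Algorithm); 4.3 Estimation of Support Bounds (4.3.1 Estimation by All Vertices; 4.3.2 Estimation by Reachable Vertices); 4.4 Experimental Results
Chapter 5 Conclusions and Further Works
Reference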
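
The first method in this thesis abstract, which accepts only traversals whose edge weights fall within an interval, can be sketched roughly as below. The exact interval used in the thesis is not given in the abstract, so this sketch assumes an interval of the form mean ± z · standard deviation per edge. The data layout (a traversal as a list of `(edge, weight)` pairs) and the function names are likewise assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

def edge_intervals(traversals, z=1.0):
    """Compute an assumed interval (mean +/- z * std) for every edge."""
    weights = defaultdict(list)
    for traversal in traversals:
        for edge, weight in traversal:
            weights[edge].append(weight)
    intervals = {}
    for edge, values in weights.items():
        center, spread = mean(values), pstdev(values)
        intervals[edge] = (center - z * spread, center + z * spread)
    return intervals

def valid_traversals(traversals, z=1.0):
    """Keep traversals whose every edge weight lies inside its edge's interval."""
    intervals = edge_intervals(traversals, z)
    return [
        traversal for traversal in traversals
        if all(intervals[edge][0] <= weight <= intervals[edge][1]
               for edge, weight in traversal)
    ]
```

With five traversals over a single edge weighted 10, 11, 9, 10 and 60, the mean is 20 and the standard deviation is about 20, so with `z=1.0` the traversal weighted 60 is rejected and the other four are kept. Frequent-pattern mining would then run on the kept traversals only.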

한국해양대학교(KMOU)

Hierarchies of Weighted Closed Partially-Ordered Patterns for Enhancing Sequential Data Analysis

Author: Braud Agnès
Le Ber Florence
Nica Cristina
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 13/06/2017

Discovering sequential patterns in sequence databases is an important data mining task. Recently, hierarchies of closed partially-ordered patterns (cpo-patterns), built directly using Relational Concept Analysis (RCA), have been proposed to simplify the interpretation step by highlighting how cpo-patterns relate to each other. However, there are practical cases (e.g. choosing interesting navigation paths in the obtained hierarchies) when these hierarchies are still insufficient for the expert. To address these cases, we propose to extract hierarchies of more informative cpo-patterns, namely weighted cpo-patterns (wcpo-patterns), by extending the RCA-based approach. These wcpo-patterns capture and explicitly show not only the order on itemsets but also their different influence on the analysed sequences. We illustrate how the proposed wcpo-patterns can enhance sequential data analysis on a toy example.

INRIA a CCSD electronic archive server

Feature-based time-series analysis

Author: Fulcher Ben D.
Publication date: 01/10/2017

This work presents an introduction to feature-based time-series analysis. The time series as a data type is first described, along with an overview of the interdisciplinary time-series analysis literature. I then summarize the range of feature-based representations for time series that have been developed to aid interpretable insights into time-series structure. Particular emphasis is given to emerging research that facilitates wide comparison of feature-based representations that allow us to understand the properties of a time-series dataset that make it suited to a particular feature-based representation or analysis algorithm. The future of time-series analysis is likely to embrace approaches that exploit machine learning methods to partially automate human learning to aid understanding of the complex dynamical patterns in the time series we measure from the world.

Comment: 28 pages, 9 figures
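
As a minimal illustration of what a feature-based representation means, the sketch below maps a time series of any length to a short, fixed-length feature vector. The three features chosen here (mean, standard deviation, lag-1 autocorrelation) are an assumption made for illustration and are not taken from the work itself, which surveys far larger feature sets.

```python
from statistics import mean, pstdev

def lag1_autocorrelation(series):
    """Lag-1 autocorrelation; returns 0.0 for a constant series."""
    center = mean(series)
    denominator = sum((x - center) ** 2 for x in series)
    if denominator == 0:
        return 0.0
    numerator = sum(
        (a - center) * (b - center) for a, b in zip(series, series[1:])
    )
    return numerator / denominator

def feature_vector(series):
    """Represent a time series by three interpretable summary features."""
    return {
        "mean": mean(series),
        "std": pstdev(series),
        "lag1_autocorr": lag1_autocorrelation(series),
    }
```

Series of different lengths can then be compared, clustered, or classified through their feature vectors rather than their raw values.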

arXiv.org e-Print Archive

Crossref

Sequential Pattern Mining with Multidimensional Interval Items

Author: Chen Bob
Peng Weiming
Song Jihua
Publication venue: 'Mechanical Engineering Faculty in Slavonski Brod'
Publication date: 01/01/2022

In real sequential pattern mining scenarios, the interval information between two itemsets is very important. However, although existing algorithms can effectively mine frequent subsequence sets, they ignore the interval information. This paper aims to mine sequential patterns with multidimensional interval items in sequence databases. To address this problem, the paper defines and specifies the interval event problem in the sequential pattern mining task. An interval event items framework is then proposed to handle multidimensional interval event items. Moreover, the MII-Prefixspan algorithm is introduced for mining sequential patterns with multidimensional interval event items. This algorithm adds the processing of interval event items to the mining process. With these methods, the mined sequence patterns yield richer information that is better aligned with actual needs. The scheme is applied to a real website behaviour analysis task to obtain more valuable information for web optimization and to provide more valuable sequence pattern information for practical problems. This work also opens a new pathway toward more efficient sequential pattern mining tasks.
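
One simple way to make interval information visible to a sequential pattern miner is to turn the time gap between consecutive events into an explicit item. The sketch below does only that, and it is not MII-Prefixspan: it assumes one-dimensional intervals and timestamped events, and the bin boundaries, the `I<n>` label scheme, and the function names are hypothetical.

```python
from bisect import bisect_right

def interval_label(gap, boundaries):
    """Map a time gap to a bin label such as 'I0', 'I1', ... (assumed scheme)."""
    return f"I{bisect_right(boundaries, gap)}"

def with_interval_items(events, boundaries):
    """Interleave interval items between timestamped events.

    events: list of (timestamp, item) pairs sorted by timestamp.
    boundaries: ascending gap thresholds separating the bins.
    """
    sequence = []
    for index, (timestamp, item) in enumerate(events):
        if index > 0:
            gap = timestamp - events[index - 1][0]
            sequence.append(interval_label(gap, boundaries))
        sequence.append(item)
    return sequence
```

For example, `with_interval_items([(0, "home"), (5, "search"), (65, "cart")], [10, 60])` gives `["home", "I0", "search", "I2", "cart"]`, which a standard sequence miner can process so that patterns distinguish quick transitions from slow ones.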

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia