14,888 research outputs found

    An Algorithm for Generating Non-Redundant Sequential Rules for Medical Time Series Data

    In this paper, an algorithm for generating non-redundant sequential rules for medical time series data is designed. This study continues my previous study, “An Algorithm for Mining Closed Weighted Sequential Patterns with Flexing Time Interval for Medical Time Series Data” [25]. In that work, the weight of each sequence was calculated from the time intervals between its itemsets. Candidate sequences were then generated with flexible time intervals, and frequent sequential patterns were computed with the proposed support measure. The frequent sequential patterns were next subjected to a closure check, which retained only the closed sequential patterns with flexible time intervals. Finally, the methodology was shown to produce the necessary sequential patterns: it yielded 23.2% fewer closed sequential patterns than sequential patterns. In this study, sequential rules are generated from the closed sequential patterns based on the confidence value of each rule. The closed sequential rules are then subjected to a non-redundancy check, which produces the final set of non-redundant weighted closed sequential rules with flexible time intervals. This study produces 172.37% fewer non-redundant sequential rules than sequential rules.
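The confidence step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook definition of rule confidence, support(pattern) / support(antecedent), and ignores the weights and time intervals the paper uses. The function name `sequential_rules` and the input layout are hypothetical, not taken from the paper.

```python
def sequential_rules(supports, min_conf):
    """Split each pattern into antecedent => consequent and keep confident rules.

    supports: dict mapping a pattern (tuple of items) to its support count.
    Assumption: confidence = support(whole pattern) / support(antecedent).
    """
    rules = []
    for pattern, sup in supports.items():
        for cut in range(1, len(pattern)):
            antecedent, consequent = pattern[:cut], pattern[cut:]
            # With closed patterns only, some prefixes may be absent; skip them.
            if antecedent not in supports:
                continue
            conf = sup / supports[antecedent]
            if conf >= min_conf:
                rules.append((antecedent, consequent, conf))
    return rules


if __name__ == "__main__":
    demo = {("a",): 4, ("a", "b"): 3, ("a", "b", "c"): 2}
    for ante, cons, conf in sequential_rules(demo, 0.7):
        print(ante, "=>", cons, round(conf, 2))
```

A redundancy check would then run over the returned rules; its exact criterion is not stated in the abstract, so it is left out here.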

    MINING TOP-K FREQUENT SEQUENTIAL PATTERN IN ITEM INTERVAL EXTENDED SEQUENCE DATABASE

    Abstract. Frequent sequential pattern mining in item interval extended sequence databases (iSDB) has been an interesting task in recent years. Unlike classic frequent sequential pattern mining, pattern mining in iSDB also considers the item interval between successive items; thus, it may extract more meaningful sequential patterns in real-life applications. Most previous algorithms for frequent sequential pattern mining in iSDB need a minimum support threshold (minsup) to perform the mining. However, it is not easy for users to provide an appropriate threshold in practice: too high a minsup value leads to missing valuable patterns, while too low a value may generate too many useless patterns. To address this problem, we propose an algorithm, TopKWFP (top-k weighted frequent sequential pattern mining in item interval extended sequence database). Our algorithm does not require a fixed minsup value; instead, minsup is raised dynamically during the mining process.
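The idea of a threshold that rises during mining can be sketched generically. The code below is not TopKWFP: it assumes candidate patterns arrive as (pattern, support) pairs and keeps the k best in a min-heap, so that the smallest retained support acts as the current minsup. The function name `top_k_with_rising_minsup` is hypothetical.

```python
import heapq


def top_k_with_rising_minsup(pattern_supports, k):
    """Return the k highest-support patterns and the final threshold.

    pattern_supports: iterable of (pattern, support) pairs.
    The threshold starts at 0 and rises once k patterns are held.
    """
    heap = []   # min-heap of (support, pattern)
    minsup = 0
    for pattern, support in pattern_supports:
        # Once k patterns are held, anything at or below minsup is pruned.
        if len(heap) >= k and support <= minsup:
            continue
        heapq.heappush(heap, (support, pattern))
        if len(heap) > k:
            heapq.heappop(heap)      # drop the weakest pattern
        if len(heap) == k:
            minsup = heap[0][0]      # smallest retained support
    ranked = sorted(heap, key=lambda entry: (-entry[0], entry[1]))
    return [(p, s) for s, p in ranked], minsup


if __name__ == "__main__":
    stream = [("a", 5), ("ab", 3), ("b", 7), ("abc", 1), ("bc", 4)]
    print(top_k_with_rising_minsup(stream, 3))
```

In a real miner the rising threshold would also prune the search space, since extensions of a pattern cannot exceed its support under an anti-monotone support measure. Whether that property holds for a weighted measure depends on its definition.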

    Mining Traversal Patterns from Weighted Traversals and Graph

    Many real-world problems can be modeled as a graph and transactions that traverse it. For example, the link structure of web pages can be represented as a graph, and a user's path through those pages can be modeled as a transaction traversing that graph. Finding important and valuable patterns in such traversal transactions is therefore worthwhile. Previous studies proposed algorithms that find only frequent patterns, without considering weights on the traversals or on the graph. The limitation of such algorithms is that they have difficulty discovering more reliable and accurate patterns. This thesis proposes two methods that discover patterns by taking into account the weights attached to traversals or to the vertices of the graph. The first method discovers frequent traversal patterns when the traversal information carries weights. Weights that can be attached to a graph traversal include the travel time between two cities or the time taken to move from one page to another when visiting a website. To mine more accurate traversal patterns, this thesis uses confidence intervals from statistics: a confidence interval is computed from the weights attached to each edge over all traversals, and only traversals that fall within the confidence interval are accepted as valid. Applying this method makes it possible to mine more reliable traversal patterns. A method for determining priority among patterns using the mined patterns and the graph information, and an algorithm for improving performance, are also presented. The second method discovers weighted frequent traversal patterns when weights are attached to the vertices of the graph. Weights that can be attached to vertices include the amount of information in, or the importance of, each document within a website. In this problem, determining frequent traversal patterns requires considering both the occurrence frequency of a pattern and the weights of the visited vertices. To this end, this thesis proposes an algorithm that uses vertex weights to retain, rather than remove at each mining step, candidate patterns that may later become frequent. An algorithm that reduces the number of candidate patterns to improve performance is also proposed. For the two proposed methods, execution time, the number of generated patterns, and other measures were compared and analyzed through various experiments. This thesis has proposed new methods for discovering frequent traversal patterns when traversals are weighted and when graph vertices are weighted. Applying the proposed methods to fields such as web mining should enable efficient restructuring of websites, faster access to web documents, and the construction of personalized web documents for individual users.
    Contents: Abstract; Chapter 1 Introduction: 1.1 Overview, 1.2 Motivations, 1.3 Approach, 1.4 Organization of Thesis; Chapter 2 Related Works: 2.1 Itemset Mining, 2.2 Weighted Itemset Mining, 2.3 Traversal Mining, 2.4 Graph Traversal Mining; Chapter 3 Mining Patterns from Weighted Traversals on Unweighted Graph: 3.1 Definitions and Problem Statements, 3.2 Mining Frequent Patterns (3.2.1 Augmentation of Base Graph, 3.2.2 In-Mining Algorithm, 3.2.3 Pre-Mining Algorithm, 3.2.4 Priority of Patterns), 3.3 Experimental Results; Chapter 4 Mining Patterns from Unweighted Traversals on Weighted Graph: 4.1 Definitions and Problem Statements, 4.2 Mining Weighted Frequent Patterns (4.2.1 Pruning by Support Bounds, 4.2.2 Candidate Generation, 4.2.3 Mining Algorithm), 4.3 Estimation of Support Bounds (4.3.1 Estimation by All Vertices, 4.3.2 Estimation by Reachable Vertices), 4.4 Experimental Results; Chapter 5 Conclusions and Further Works; References

    Hierarchies of Weighted Closed Partially-Ordered Patterns for Enhancing Sequential Data Analysis

    Discovering sequential patterns in sequence databases is an important data mining task. Recently, hierarchies of closed partially-ordered patterns (cpo-patterns), built directly using Relational Concept Analysis (RCA), have been proposed to simplify the interpretation step by highlighting how cpo-patterns relate to each other. However, there are practical cases (e.g. choosing interesting navigation paths in the obtained hierarchies) when these hierarchies are still insufficient for the expert. To address these cases, we propose to extract hierarchies of more informative cpo-patterns, namely weighted cpo-patterns (wcpo-patterns), by extending the RCA-based approach. These wcpo-patterns capture and explicitly show not only the order on itemsets but also their different influence on the analysed sequences. We illustrate how the proposed wcpo-patterns can enhance sequential data analysis on a toy example.

    Feature-based time-series analysis

    This work presents an introduction to feature-based time-series analysis. The time series as a data type is first described, along with an overview of the interdisciplinary time-series analysis literature. I then summarize the range of feature-based representations for time series that have been developed to aid interpretable insights into time-series structure. Particular emphasis is given to emerging research that facilitates wide comparison of feature-based representations that allow us to understand the properties of a time-series dataset that make it suited to a particular feature-based representation or analysis algorithm. The future of time-series analysis is likely to embrace approaches that exploit machine learning methods to partially automate human learning to aid understanding of the complex dynamical patterns in the time series we measure from the world. Comment: 28 pages, 9 figures

    Sequential Pattern Mining with Multidimensional Interval Items

    In real sequential pattern mining scenarios, the interval information between two itemsets is very important. However, although existing algorithms can effectively mine frequent subsequence sets, they ignore the interval information. This paper aims to mine sequential patterns with multidimensional interval items in sequence databases. To address this problem, the paper defines and specifies the interval event problem in the sequential pattern mining task. An interval event items framework is then proposed to handle multidimensional interval event items. Moreover, the MII-Prefixspan algorithm is introduced for mining sequential patterns with multidimensional interval event items; it adds the processing of interval event items to the mining process. With these methods, the mined sequence patterns yield richer information that is more in line with actual needs. The scheme is applied to a real website behaviour analysis task to obtain more valuable information for web optimization and to provide more valuable sequence pattern information for practical problems. This work also opens a new pathway toward more efficient sequential pattern mining tasks.