
    Mining High Utility Itemsets with Regular Occurrence

    High utility itemset mining (HUIM) plays an important role in the data mining community and in a wide range of applications. In retail, for example, it is used to find sets of sold products that yield high profit, low cost, and so on. These itemsets can help improve marketing strategies and guide promotions and advertisements. However, because HUIM considers only the utility values of items and itemsets, it may not capture customers' product-buying behavior, such as "regular purchases of sets of products having a high profit margin". To address this issue, the occurrence behavior of itemsets (in terms of regularity) was investigated together with their utility values. The problem of mining high utility itemsets with regular occurrence (MHUIR) was then considered: finding sets of co-occurring items with high utility values and regular occurrence in a database. An efficient single-pass algorithm, called MHUIRA, was introduced. A new modified utility-list structure, called NUL, was designed to maintain utility values and occurrence information efficiently and to speed up the computation of itemset utilities. Experimental studies on real and synthetic datasets, together with complexity analyses, show the efficiency of MHUIRA combined with NUL in terms of time and space usage for mining interesting itemsets under regularity and utility constraints.
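
    As a rough illustration of the two measures the abstract combines, the sketch below computes the utility and the regularity of one itemset by brute force. It is not MHUIRA and does not use the NUL structure. It assumes that utility is quantity times unit profit summed over supporting transactions, and that regularity is the largest gap (in transaction IDs) between consecutive occurrences, counting the gaps to the start and end of the database. The function name, the toy database, and the profit table are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Transaction = Dict[str, int]  # item -> purchased quantity (assumed layout)


def utility_and_regularity(
    db: List[Transaction], profit: Dict[str, int], itemset: Set[str]
) -> Tuple[int, int]:
    """Return (total utility, maximum gap between occurrences) of itemset.

    Brute-force scan for illustration only; both definitions are assumptions.
    """
    total_utility = 0
    last_tid = 0  # transaction IDs start at 1, so 0 marks "before the database"
    max_gap = 0
    for tid, tx in enumerate(db, start=1):
        if all(item in tx for item in itemset):
            # Utility of the itemset in this transaction: quantity * unit profit.
            total_utility += sum(tx[item] * profit[item] for item in itemset)
            max_gap = max(max_gap, tid - last_tid)
            last_tid = tid
    # Also count the gap between the last occurrence and the end of the database.
    max_gap = max(max_gap, len(db) - last_tid)
    return total_utility, max_gap


# Hypothetical toy data: five transactions and unit profits per item.
db = [
    {"a": 2, "b": 1},
    {"b": 3},
    {"a": 1, "b": 2, "c": 1},
    {"c": 2},
    {"a": 1, "b": 1},
]
profit = {"a": 5, "b": 2, "c": 1}

result = utility_and_regularity(db, profit, {"a", "b"})
print(result)  # (28, 2)
```

    Under these assumptions, an itemset would qualify when its utility meets a minimum utility threshold and its maximum gap stays at or below a regularity threshold.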

    Misleading Generalized Itemset discovery

    Frequent generalized itemset mining is a data mining technique used to discover a high-level view of interesting knowledge hidden in the analyzed data. By exploiting a taxonomy, patterns can be extracted at any level of abstraction. However, the mined set may include misleading high-level patterns. This paper proposes a novel generalized itemset type, the Misleading Generalized Itemset (MGI). Each MGI represents a frequent generalized itemset X together with the set E of its low-level frequent descendants whose correlation type contrasts with that of X. So that experts can analyze the misleading high-level correlations separately and act on that knowledge, MGIs are extracted only if the low-level descendant itemsets that represent contrasting correlations cover almost the same portion of data as the high-level (misleading) ancestor. An algorithm that mines MGIs on top of traditional generalized itemsets is also proposed. Experiments on both real and synthetic datasets demonstrate the effectiveness and efficiency of the proposed approach.

    Mining Traversal Patterns from Weighted Traversals and Graph

    Many real-world problems can be modeled as a graph together with transactions that traverse it. For example, the link structure of web pages can be represented as a graph, and a user's path through those pages can be modeled as a transaction traversing that graph. Discovering important and valuable patterns from such traversal transactions is a meaningful task. Previous work on finding these patterns proposed algorithms that find only frequently occurring patterns, without considering the weights of the traversals or of the graph. The limitation of these algorithms is that they have difficulty discovering more reliable and accurate patterns. This thesis proposes two methods for mining patterns that take into account the weights attached to traversals or to the vertices of the graph. The first method mines frequent traversal patterns when the traversal information carries weights. Such weights can be, for example, the travel time between two cities, or the time taken to move from one page to another when visiting a website. To mine more accurate traversal patterns, the thesis uses confidence intervals from statistics: a confidence interval is computed from the weights attached to each edge over all traversals, and only traversals that fall within the interval are accepted as valid. Applying this method makes it possible to mine more reliable traversal patterns. The thesis also presents a method for determining priorities among patterns, using the mined patterns and the graph information, and an algorithm for improving performance. The second method mines weighted frequent traversal patterns when weights are assigned to the vertices of the graph. Such weights can be, for example, the amount of information in each document of a website or its importance. In this problem, determining the frequent traversal patterns requires considering both the occurrence frequency of a pattern and the weights of the visited vertices. To this end, the thesis proposes an algorithm that uses the vertex weights to keep, rather than remove at each mining step, candidate patterns that may still become frequent later. An algorithm that reduces the number of candidate patterns to improve performance is also proposed. The two proposed methods were compared and analyzed through a variety of experiments in terms of execution time and the number of generated patterns. Applying the proposed methods to fields such as web mining should enable efficient restructuring of websites, faster access to web documents, and the construction of personalized web documents for each user.
    Contents: Abstract. Chapter 1 Introduction 1.1 Overview 1.2 Motivations 1.3 Approach 1.4 Organization of Thesis. Chapter 2 Related Works 2.1 Itemset Mining 2.2 Weighted Itemset Mining 2.3 Traversal Mining 2.4 Graph Traversal Mining. Chapter 3 Mining Patterns from Weighted Traversals on Unweighted Graph 3.1 Definitions and Problem Statements 3.2 Mining Frequent Patterns 3.2.1 Augmentation of Base Graph 3.2.2 In-Mining Algorithm 3.2.3 Pre-Mining Algorithm 3.2.4 Priority of Patterns 3.3 Experimental Results. Chapter 4 Mining Patterns from Unweighted Traversals on Weighted Graph 4.1 Definitions and Problem Statements 4.2 Mining Weighted Frequent Patterns 4.2.1 Pruning by Support Bounds 4.2.2 Candidate Generation 4.2.3 Mining Algorithm 4.3 Estimation of Support Bounds 4.3.1 Estimation by All Vertices 4.3.2 Estimation by Reachable Vertices 4.4 Experimental Results. Chapter 5 Conclusions and Further Works. References
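
    The first method in this thesis record filters weighted traversals with a statistical interval computed per edge. The sketch below shows one possible reading of that idea, not the thesis's algorithm. It assumes an interval of the form mean plus or minus z times the population standard deviation of an edge's observed weights, and it treats a traversal as valid only if every one of its edge weights lies inside the interval for that edge. The interval formula, the default z value, the function names, and the sample data are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Edge = Tuple[str, str]
Step = Tuple[str, str, float]  # (source vertex, target vertex, observed weight)
Traversal = List[Step]


def edge_intervals(traversals: List[Traversal], z: float = 1.96) -> Dict[Edge, Tuple[float, float]]:
    """Compute an assumed interval mean +/- z * stdev for each edge's weights."""
    weights: Dict[Edge, List[float]] = defaultdict(list)
    for traversal in traversals:
        for src, dst, weight in traversal:
            weights[(src, dst)].append(weight)
    intervals = {}
    for edge, observed in weights.items():
        centre = mean(observed)
        spread = z * pstdev(observed)
        intervals[edge] = (centre - spread, centre + spread)
    return intervals


def valid_traversals(traversals: List[Traversal], z: float = 1.96) -> List[Traversal]:
    """Keep traversals whose every edge weight lies inside that edge's interval."""
    intervals = edge_intervals(traversals, z)
    kept = []
    for traversal in traversals:
        if all(intervals[(s, d)][0] <= w <= intervals[(s, d)][1] for s, d, w in traversal):
            kept.append(traversal)
    return kept


# Hypothetical page-to-page dwell times (seconds); the last one is an outlier.
sample = [
    [("A", "B", 10)],
    [("A", "B", 11)],
    [("A", "B", 9)],
    [("A", "B", 10)],
    [("A", "B", 100)],
]
kept = valid_traversals(sample)
print(len(kept))
```

    The surviving traversals could then be passed to any frequent-pattern miner; how the thesis integrates the filter with mining (its In-Mining and Pre-Mining algorithms) is not described in the abstract.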

    Parallel Algorithms for Discovery of Association Rules

    A Safety Support System for Children's Antiloss

    In the recent past, crimes against children and the number of missing children have remained high. A missing child is a tragedy for a family, and feeling that their children are safe is very important to parents. There is therefore an urgent need for safety support systems that prevent crimes against children and guard against loss, particularly when children are on their own, such as on the way to and from school. Thanks to the rapid development of telecommunication and mobile technologies, preventive devices such as child ID kits and family trackers have appeared. However, they are not yet convincing solutions, because they only track the children's current positions and do not notify parents when their children are in potential danger. In this thesis, a data mining framework is introduced in which the children's secure areas and secure paths are learned from their location histories. When the system predicts that a child is potentially unsafe (e.g., in a strange area or on a strange route), automatic reports are sent to the parents. An indoor positioning method using Bluetooth is also proposed. A prototype application for both children and parents, built on the Android platform, incorporates the techniques proposed in this thesis.

    Parallel Mining of Association Rules Using a Lattice Based Approach

    The discovery of interesting patterns from database transactions is one of the major problems in knowledge discovery in databases. One such pattern is the association rule extracted from these transactions. Parallel algorithms are required for mining association rules because of the very large databases used to store the transactions. In this paper we present a parallel algorithm for mining association rules, implemented using a lattice approach. Dynamic Distributed Rule Mining (DDRM) is a lattice-based algorithm that partitions the lattice into sublattices, which are assigned to processors for processing and identification of frequent itemsets. Experimental results show that DDRM uses the processors efficiently and performs better than the prefix-based and partition algorithms, which use a static approach to assign classes to the processors. The DDRM algorithm scales well and shows good speedup.

    Mining association rules for the quality improvement of the production process

    Academics and practitioners have a common interest in the continuing development of methods and computer applications that support or perform knowledge-intensive engineering tasks. Operations management dysfunctions and lost production time are problems of enormous magnitude that affect the performance and quality of industrial systems as well as their cost of production. Association rule mining is a data mining technique used to find useful and valuable information in huge databases. This work develops a better conceptual base for applying association rule mining methods to extract knowledge on operations and information management. The emphasis of the paper is on the improvement of operations processes. The application example details an industrial experiment in which association rule mining is used to analyze the manufacturing process of a fully integrated provider of drilling products. The study reports some new and interesting results from data mining and knowledge discovery techniques applied to a drill production process. Experimental results on real-life data sets show that the proposed approach is useful for finding effective knowledge about the causes of dysfunctions.
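
    For readers unfamiliar with the measures behind association rules, the sketch below computes the standard support and confidence of a rule over a handful of records. It only illustrates those two textbook definitions and is not the paper's method. The production records and event names are hypothetical and are not taken from the study's data.

```python
from typing import FrozenSet, List

Record = FrozenSet[str]  # set of events observed in one production run (assumed layout)


def support(records: List[Record], itemset: FrozenSet[str]) -> float:
    """Fraction of records that contain every item of itemset."""
    return sum(1 for record in records if itemset <= record) / len(records)


def confidence(records: List[Record], antecedent: FrozenSet[str], consequent: FrozenSet[str]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(records, antecedent)
    if base == 0:
        return 0.0  # rule never applies; avoid division by zero
    return support(records, antecedent | consequent) / base


# Hypothetical production-run records.
records = [
    frozenset({"high_temp", "tool_wear", "defect"}),
    frozenset({"high_temp", "defect"}),
    frozenset({"tool_wear"}),
    frozenset({"high_temp", "tool_wear", "defect"}),
    frozenset({"normal"}),
]

rule_support = support(records, frozenset({"high_temp", "defect"}))
rule_confidence = confidence(records, frozenset({"high_temp"}), frozenset({"defect"}))
print(rule_support, rule_confidence)
```

    A rule such as "high_temp implies defect" would be reported only when both values exceed user-chosen minimum thresholds.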