10,074 research outputs found

    Mining Traversal Patterns from Weighted Traversals and Graph

    Many real-world problems can be modeled as a graph and transactions that traverse it. For example, the link structure of web pages can be represented as a graph, and a user's path through those pages can be modeled as a transaction traversing that graph. Discovering important and valuable patterns from such traversal transactions is worthwhile. Previous work on finding these patterns proposed algorithms that look only for frequent patterns, without considering weights on the traversals or on the graph. The limitation of those algorithms is that they have difficulty discovering more reliable and accurate patterns. This thesis proposes two methods that mine patterns while taking into account weights attached to traversals or to graph vertices. The first method mines frequent traversal patterns when the traversal information carries weights. Such weights can be, for example, the travel time between two cities, or the time taken to move from one page to another while visiting a web site. To mine more accurate traversal patterns, the thesis uses confidence intervals from statistics: a confidence interval is derived from the weights attached to each edge over all traversals, and only traversals that fall within the interval are accepted as valid. Applying this method yields more reliable traversal patterns. The thesis also presents a method for prioritizing patterns using the mined patterns and the graph information, along with an algorithm for improving performance. The second method mines weighted frequent traversal patterns when weights are attached to the vertices of the graph. Vertex weights can be, for example, the amount of information in each document of a web site, or its importance. In this problem, deciding which traversal patterns are frequent requires considering both the occurrence frequency of a pattern and the weights of the vertices it visits. To this end, the thesis proposes an algorithm that uses the vertex weights to retain, rather than prune, at each mining step those candidate patterns that may still become frequent later. An algorithm that reduces the number of candidate patterns to improve performance is also proposed. The two proposed methods were compared and analyzed through various experiments in terms of running time and the number of generated patterns. In summary, the thesis proposes new methods for mining frequent traversal patterns when the traversals carry weights and when the vertices of the graph carry weights. Applied to areas such as web mining, the proposed methods could enable efficient restructuring of web sites, faster access to web documents, and personalized web documents for individual users.

    Contents: Abstract. Chapter 1 Introduction: 1.1 Overview; 1.2 Motivations; 1.3 Approach; 1.4 Organization of Thesis. Chapter 2 Related Works: 2.1 Itemset Mining; 2.2 Weighted Itemset Mining; 2.3 Traversal Mining; 2.4 Graph Traversal Mining. Chapter 3 Mining Patterns from Weighted Traversals on Unweighted Graph: 3.1 Definitions and Problem Statements; 3.2 Mining Frequent Patterns (3.2.1 Augmentation of Base Graph; 3.2.2 In-Mining Algorithm; 3.2.3 Pre-Mining Algorithm; 3.2.4 Priority of Patterns); 3.3 Experimental Results. Chapter 4 Mining Patterns from Unweighted Traversals on Weighted Graph: 4.1 Definitions and Problem Statements; 4.2 Mining Weighted Frequent Patterns (4.2.1 Pruning by Support Bounds; 4.2.2 Candidate Generation; 4.2.3 Mining Algorithm); 4.3 Estimation of Support Bounds (4.3.1 Estimation by All Vertices; 4.3.2 Estimation by Reachable Vertices); 4.4 Experimental Results. Chapter 5 Conclusions and Further Works. References.
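
The weight-based filtering step of the first method (keep only traversals whose edge weights lie inside an interval derived from all traversals) can be sketched roughly as follows. This is a minimal illustration and not the thesis's algorithm: the abstract does not say how the interval is constructed, so the band `mean ± z * sample standard deviation` per edge is an assumption, and the names `edge_bands`, `valid_traversals` and the parameter `z` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def edge_bands(traversals, z=1.96):
    """Compute an acceptance band per edge from all observed weights.

    Assumption: the band is mean +/- z * sample standard deviation.
    The source abstract does not give the exact interval formula.
    """
    samples = defaultdict(list)
    for traversal in traversals:
        for edge, weight in traversal:
            samples[edge].append(weight)

    bands = {}
    for edge, weights in samples.items():
        m = mean(weights)
        # A single observation has no spread; use a zero-width band.
        s = stdev(weights) if len(weights) > 1 else 0.0
        bands[edge] = (m - z * s, m + z * s)
    return bands


def valid_traversals(traversals, z=1.96):
    """Keep only traversals whose every edge weight lies inside its band."""
    bands = edge_bands(traversals, z)
    return [
        t for t in traversals
        if all(bands[edge][0] <= w <= bands[edge][1] for edge, w in t)
    ]


# Each traversal is a list of ((from_vertex, to_vertex), weight) pairs,
# e.g. the time spent moving from page A to page B.
sessions = [
    [(("A", "B"), 10)],
    [(("A", "B"), 11)],
    [(("A", "B"), 9)],
    [(("A", "B"), 10)],
    [(("A", "B"), 100)],  # unusually long transition
]
kept = valid_traversals(sessions, z=1.0)
```

A frequent-pattern miner would then run only on `kept`, so that unusually slow or fast transitions do not contribute to pattern support.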

    Extraction of High Utility Itemsets using Utility Pattern with Genetic Algorithm from OLTP System

    Frequent pattern mining plays an important role in analysing vast amounts of data. In practice, however, it cannot meet the challenges of real-world problems, because items differ in various measures. Utility-based data mining is an emerging technique that addresses this: it considers not only the frequency of itemsets but also the utility associated with them. The main objective of utility mining is to extract itemsets with high utility from OLTP systems by taking into account user preferences such as profit, quantity and cost. The proposed approach combines UP-Growth with a Genetic Algorithm: UP-Growth generates Potentially High Utility Itemsets, and the Genetic Algorithm optimizes these to provide the High Utility Itemsets. Compared with the existing algorithm, the proposed approach performs better in terms of memory utilization. DOI: 10.17762/ijritcc2321-8169.15039
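
To make the notion of itemset utility concrete, here is a small sketch using the standard definition: the utility of an itemset is the sum, over the transactions that contain it, of quantity times unit profit for its items. It enumerates candidates by brute force and is neither UP-Growth nor a genetic algorithm; the function names and the sample data are hypothetical.

```python
from itertools import combinations


def itemset_utility(itemset, transactions, profit):
    """Sum quantity * unit profit over transactions containing every item."""
    total = 0
    for quantities in transactions:
        if all(item in quantities for item in itemset):
            total += sum(quantities[item] * profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, profit, min_util):
    """Brute-force enumeration; only practical for a handful of items."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            utility = itemset_utility(candidate, transactions, profit)
            if utility >= min_util:
                result[candidate] = utility
    return result


# Each transaction maps item -> purchased quantity (illustrative data).
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
profit = {"a": 5, "b": 2, "c": 1}  # unit profit per item
found = high_utility_itemsets(transactions, profit, min_util=10)
```

Note that utility is not anti-monotone: a superset can have higher utility than its subsets, which is why algorithms such as UP-Growth first generate potentially high utility candidates using an upper bound rather than pruning on utility directly.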

    XML Schema Clustering with Semantic and Hierarchical Similarity Measures

    With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information from them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic meaning and context of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis.

    Feature Extraction and Duplicate Detection for Text Mining: A Survey

    Text mining, also known as Intelligent Text Analysis, is an important research area. The high dimensionality of the data makes it very difficult to focus on the most appropriate information. Feature extraction is one of the important data reduction techniques for discovering the most important features. Processing massive amounts of data stored in an unstructured form is a challenging task, and several pre-processing methods and algorithms are needed to extract useful features from huge amounts of data. The survey covers different text summarization, classification and clustering methods for discovering useful features, as well as the discovery of query facets, which are multiple groups of words or phrases that explain and summarize the content covered by a query, thereby reducing the time taken by the user. When dealing with a collection of text documents, it is also very important to filter out duplicate data. Once duplicates are deleted, it is recommended to replace the removed duplicates. Hence we also review the literature on duplicate detection and data fusion (removing and replacing duplicates). The survey presents existing text mining techniques for extracting relevant features, detecting duplicates and replacing the duplicate data, so as to deliver fine-grained knowledge to the user.

    Discovering High Utility Itemsets using Hybrid Approach

    Mining high utility itemsets, especially from big transactional databases, is a time-consuming task. Multiple methods are available for mining high utility itemsets from large transactional datasets, but they have significant limitations. Their performance needs to be scrutinized on low-memory systems, and further measures need to be addressed. The proposed algorithm combines High Utility Pattern Mining and Incremental Frequent Pattern Mining. The two algorithms used are Apriori and the existing Parallel UP-Growth for mining high utility itemsets from transactional databases. Information about high utility itemsets is maintained in a data structure called the UP-tree. These algorithms not only scan the incremental database but also collect the support counts of newly generated frequent itemsets. The approach executes quickly because it adds new itemsets to the tree and removes rare itemsets from the utility pattern tree structure, which reduces cost and time. Based on experimental analysis and results, this hybrid of the existing Apriori and UP-Growth algorithms is proposed with the aim of improving performance.