10,074 research outputs found
Mining Traversal Patterns from Weighted Traversals and Graph
Many real-world problems can be modeled as a graph together with transactions that traverse it. For example, the link structure of web pages can be represented as a graph, and the paths users take through those pages can be modeled as transactions traversing that graph. Discovering important and valuable patterns in such traversal transactions is a worthwhile task. Previous work on finding these patterns proposed algorithms that look only for frequently occurring patterns, without considering weights on the traversals or on the graph. The limitation of such algorithms is that they have difficulty discovering more reliable and accurate patterns.
This thesis proposes two methods that discover patterns while taking into account weights attached to the traversals or to the vertices of the graph. The first method discovers frequent traversal patterns when the traversal information carries weights. Examples of weights that can be attached to a graph traversal are the travel time between two cities, or the time taken to move from one page to another when visiting a web site. To mine more accurate traversal patterns, this thesis uses statistical confidence intervals: a confidence interval is computed from the weights attached to each edge over all traversals, and only the traversals that fall within the interval are accepted as valid. Applying this method makes it possible to mine more reliable traversal patterns. The thesis also presents a method for determining priorities among the resulting patterns using the patterns and the graph information, and an algorithm for improving performance.
The second method discovers weighted frequent traversal patterns when weights are attached to the vertices of the graph. Examples of weights that can be attached to a vertex are the amount of information in each document of a web site, or its importance. In this problem, deciding whether a traversal pattern is frequent requires considering both the occurrence frequency of the pattern and the weights of the visited vertices. To this end, the thesis proposes an algorithm that uses the vertex weights to retain, rather than remove, at each mining step the candidate patterns that may still become frequent later. It also proposes an algorithm that reduces the number of candidate patterns to improve performance.
The two proposed methods were compared and analyzed through a variety of experiments, in terms of execution time and the number of patterns generated.
This thesis has proposed new methods for discovering frequent traversal patterns when the traversals carry weights and when the vertices of the graph carry weights. Applying the proposed methods to fields such as web mining should make it possible to restructure web sites efficiently, speed up access to web documents, and build web documents personalized for each user.
Abstract
Chapter 1 Introduction
1.1 Overview
1.2 Motivations
1.3 Approach
1.4 Organization of Thesis
Chapter 2 Related Works
2.1 Itemset Mining
2.2 Weighted Itemset Mining
2.3 Traversal Mining
2.4 Graph Traversal Mining
Chapter 3 Mining Patterns from Weighted Traversals on Unweighted Graph
3.1 Definitions and Problem Statements
3.2 Mining Frequent Patterns
3.2.1 Augmentation of Base Graph
3.2.2 In-Mining Algorithm
3.2.3 Pre-Mining Algorithm
3.2.4 Priority of Patterns
3.3 Experimental Results
Chapter 4 Mining Patterns from Unweighted Traversals on Weighted Graph
4.1 Definitions and Problem Statements
4.2 Mining Weighted Frequent Patterns
4.2.1 Pruning by Support Bounds
4.2.2 Candidate Generation
4.2.3 Mining Algorithm
4.3 Estimation of Support Bounds
4.3.1 Estimation by All Vertices
4.3.2 Estimation by Reachable Vertices
4.4 Experimental Results
Chapter 5 Conclusions and Further Works
Reference
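The idea behind Chapter 3 of the thesis listed above (Mining Patterns from Weighted Traversals on Unweighted Graph) is to count a traversal toward a pattern only when its edge weights are statistically typical. The following is a minimal Python sketch of that idea, not the thesis's In-Mining or Pre-Mining algorithm. It assumes the interval for each edge is the mean plus or minus `z` population standard deviations of that edge's observed weights; the function names, the `z` parameter, and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

# A traversal is (list of vertices, list of weights), one weight per consecutive edge.
# All names below are hypothetical and chosen only for illustration.

def edge_intervals(traversals, z=1.0):
    """Return {edge: (low, high)} as mean +/- z * population std of the edge's weights."""
    weights = defaultdict(list)
    for vertices, edge_weights in traversals:
        for edge, w in zip(zip(vertices, vertices[1:]), edge_weights):
            weights[edge].append(w)
    intervals = {}
    for edge, ws in weights.items():
        m, s = mean(ws), pstdev(ws)
        intervals[edge] = (m - z * s, m + z * s)
    return intervals

def edge_support(traversals, intervals):
    """Count, per edge, the traversals whose weight on that edge lies inside its interval."""
    support = defaultdict(int)
    for vertices, edge_weights in traversals:
        counted = set()  # count each edge at most once per traversal
        for edge, w in zip(zip(vertices, vertices[1:]), edge_weights):
            low, high = intervals[edge]
            if low <= w <= high and edge not in counted:
                support[edge] += 1
                counted.add(edge)
    return dict(support)

traversals = [
    (["A", "B", "C"], [10.0, 5.0]),
    (["A", "B", "C"], [11.0, 5.0]),
    (["A", "B", "C"], [9.0, 5.0]),
    (["A", "B"], [60.0]),  # unusually slow move from A to B
]
intervals = edge_intervals(traversals, z=1.0)
support = edge_support(traversals, intervals)
print(support)
```

In this sample the fourth traversal's A to B weight falls outside the interval, so it does not contribute to the support of that edge. Longer patterns could be handled the same way by requiring every edge of the pattern to be valid within a traversal.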
Extraction of High Utility Itemsets using Utility Pattern with Genetic Algorithm from OLTP System
Frequent pattern mining plays an important role in analysing vast amounts of data. In practice, however, it cannot meet the challenges of real-world problems because items differ in various measures. Hence an emerging technique called utility-based data mining is used in data mining processes. Utility mining considers not only the frequency of itemsets but also the utility associated with them. Its main objective is to extract itemsets with high utility from OLTP systems by considering user preferences such as profit, quantity, and cost. Our proposed approach combines UP-Growth with a Genetic Algorithm: the UP-Growth algorithm generates Potentially High Utility Itemsets, and the Genetic Algorithm optimizes them to produce the High Utility Itemsets. Compared with the existing algorithm, the proposed approach performs better in terms of memory utilization.
DOI: 10.17762/ijritcc2321-8169.15039
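To make the notion of utility concrete, here is a small Python sketch of how the utility of an itemset is commonly computed: purchased quantity times unit profit, summed over the transactions that contain every item of the set. It enumerates itemsets by brute force and is neither UP-Growth nor a genetic algorithm; the function names, profit table, and transactions are hypothetical.

```python
from itertools import combinations

# Hypothetical unit profits and transactions (item -> purchased quantity).
profit = {"a": 5, "b": 2, "c": 1}
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1},
]

def utility(itemset, transactions, profit):
    """Sum quantity * unit profit over transactions containing every item of the set."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total

def high_utility_itemsets(transactions, profit, min_utility):
    """Brute-force enumeration; feasible only for a handful of items."""
    items = sorted(profit)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset, transactions, profit)
            if u >= min_utility:
                result[itemset] = u
    return result

print(high_utility_itemsets(transactions, profit, min_utility=10))
# {('a',): 15, ('a', 'c'): 13}
```

The itemset `('a', 'c')` occurs in only one transaction yet clears the threshold, which illustrates why frequency alone is a poor proxy for utility.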
XML Schema Clustering with Semantic and Hierarchical Similarity Measures
With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information in them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistics and context of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis.
Feature Extraction and Duplicate Detection for Text Mining: A Survey
Text mining, also known as Intelligent Text Analysis, is an important research area. The high dimensionality of the data makes it very difficult to focus on the most appropriate information. Feature extraction is one of the important data reduction techniques for discovering the most important features. Processing massive amounts of data stored in an unstructured form is a challenging task, and several pre-processing methods and algorithms are needed to extract useful features from such data. The survey covers different text summarization, classification, and clustering methods for discovering useful features. It also covers the discovery of query facets, which are multiple groups of words or phrases that explain and summarize the content covered by a query, thereby reducing the time taken by the user. When dealing with a collection of text documents, it is also very important to filter out duplicate data. Once duplicates are deleted, it is recommended to replace the removed duplicates. Hence we also review the literature on duplicate detection and data fusion (removing and replacing duplicates). The survey presents existing text mining techniques for extracting relevant features, detecting duplicates, and replacing duplicate data, so as to deliver fine-grained knowledge to the user.
Discovering High Utility Itemsets using Hybrid Approach
Mining high utility itemsets, especially from big transactional databases, is a time-consuming task. Multiple methods are available for mining high utility itemsets from large transactional datasets, but they have some consequential limitations. In terms of performance, these methods need to be scrutinized on low-memory systems, as well as to address further measures. The proposed algorithm combines High Utility Pattern Mining and Incremental Frequent Pattern Mining. The two algorithms used are Apriori and the existing Parallel UP-Growth for mining high utility itemsets from transactional databases. The information about high utility itemsets is maintained in a data structure called the UP tree. These algorithms not only scan the incremental database but also collect the support counts of newly generated frequent itemsets. The approach executes quickly because it adds new itemsets to the utility pattern tree and removes rare itemsets from it, which reduces cost and time. Based on experimental analysis and results, this hybrid approach with the existing Apriori and UP-Growth is proposed with the aim of improving performance.