7 research outputs found

Nomenclature and Contemporary Affirmation of the Unsupervised Learning in Text and Document Mining

Author: Annaluri Sreenivasa Rao
Prof. S. Ramakrishna
Publication venue: Global Journals Inc. (US)
Publication date: 21/02/2015

Document clustering is primarily a method for straightforward document search, analysis, and content review; it is a process of automatically classifying documents of similar type into relevant clusters within a clustering hierarchy. This paper reviews related work in the field of document clustering, from simple word- and phrase-based techniques to present-day complex techniques such as statistical analysis and machine learning, and illustrates their implications for future research.

Global Journal of Computer Science and Technology (GJCST)

Application of Self-Organizing Maps in Text Clustering: A Review

Author: Liu Ming
Liu Yuan-Chao
Wang Xiao-Long
Publication venue: IntechOpen
Publication date: 21/11/2012

IntechOpen

Clustering techniques for web mining

Author: Qiu Siyuan
Publication date: 01/01/2012

As high-dimensional data becomes increasingly prevalent, feature selection has been widely applied in data mining, machine learning, and other fields. The goal of feature selection is to remove unneeded features, because they can degrade the quality of the discovered patterns. As a result, the data mining process can run faster and more accurately. Various feature selection approaches for text categorization have been proposed in the literature. In this project, a Multitype Features Coselection for Web Document Clustering (MFCC) approach has been researched and implemented. MFCC is designed to better identify the most discriminative features and remove noisy ones. In addition to implementing MFCC, we have also carried out the data processing that transforms raw web documents into the format used by the MFCC Java program. Afterwards, several simulations were conducted to test the accuracy and efficiency of MFCC. Bachelor of Engineerin

DR-NTU (Digital Repository of NTU)