1,087 research outputs found
A survey of data mining techniques for social media analysis
Social network has gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook LinkedIn and Google+ through the internet and the web 2.0 technologies has become more affordable. People are becoming more interested in and relying on social network for information, news and opinion of other users on diverse subject matters. The heavy reliance on social network sites causes them to generate massive data characterised by three computational issues namely; size, noise and dynamism. These issues often make social network data very complex to analyse manually, resulting in the pertinent use of computational means of analysing them. Data mining provides a wide range of techniques for detecting useful knowledge from massive datasets like trends, patterns and rules [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis, and data interpretation processes in the course of data analysis. This survey discusses different data mining techniques used in mining diverse aspects of the social network over decades going from the historical techniques to the up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in the Table.1 including the tools employed as well as names of their authors
Feature selection, optimization and clustering strategies of text documents
Clustering is one of the most researched areas of data mining applications in the contemporary literature. The need for efficient clustering is observed across wide sectors including consumer segmentation, categorization, shared filtering, document management, and indexing. The research of clustering task is to be performed prior to its adaptation in the text environment. Conventional approaches typically emphasized on the quantitative information where the selected features are numbers. Efforts also have been put forward for achieving efficient clustering in the context of categorical information where the selected features can assume nominal values. This manuscript presents an in-depth analysis of challenges of clustering in the text environment. Further, this paper also details prominent models proposed for clustering along with the pros and cons of each model. In addition, it also focuses on various latest developments in the clustering task in the social network and associated environments
- …