Search CORE

366 research outputs found

Exploring Pattern Mining Algorithms for Hashtag Retrieval Problem

Author: Belhadi Asma
Cano Alberto
Djenouri Youcef
Lin Jerry Chun-Wei
Zhang Chongsheng
Publication venue: VCU Scholars Compass
Publication date: 01/01/2020
Field of study

Hashtag is an iconic feature to retrieve the hot topics of discussion on Twitter or other social networks. This paper incorporates the pattern mining approaches to improve the accuracy of retrieving the relevant information and speeding up the search performance. A novel algorithm called PM-HR (Pattern Mining for Hashtag Retrieval) is designed to first transform the set of tweets into a transactional database by considering two different strategies (trivial and temporal). After that, the set of the relevant patterns is discovered, and then used as a knowledge-based system for finding the relevant tweets based on users\u27 queries under the similarity search process. Extensive results are carried out on large and different tweet collections, and the proposed PM-HR outperforms the baseline hashtag retrieval approaches in terms of runtime, and it is very competitive in terms of accuracy

SINTEF Open

VCU Scholars Compass

NORA - Norwegian Open Research Archives

Exploring Pattern Mining Algorithms for Hashtag Retrieval Problem

Author: Alberto Cano
Asma Belhadi
Chongsheng Zhang
Djenouri Youcef
Jerry Chun-Wei Lin
Publication venue: IEEE
Publication date: 01/01/2020
Field of study

Hashtag is an iconic feature to retrieve the hot topics of discussion on Twitter or other social networks. This paper incorporates the pattern mining approaches to improve the accuracy of retrieving the relevant information and speeding up the search performance. A novel algorithm called PM-HR (Pattern Mining for Hashtag Retrieval) is designed to first transform the set of tweets into a transactional database by considering two different strategies (trivial and temporal). After that, the set of the relevant patterns is discovered, and then used as a knowledge-based system for finding the relevant tweets based on users' queries under the similarity search process. Extensive results are carried out on large and different tweet collections, and the proposed PM-HR outperforms the baseline hashtag retrieval approaches in terms of runtime, and it is very competitive in terms of accuracy.publishedVersio

SINTEF Open

A genetic algorithm coupled with tree-based pruning for mining closed association rules

Author: Acharya Udupi Dinesh
Subba Reddy Nandanvana Veerappareddy
Suresh Ponmudiyan Poovan Jashma
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/06/2023
Field of study

Due to the voluminous amount of itemsets that are generated, the association rules extracted from these itemsets contain redundancy, and designing an effective approach to address this issue is of paramount importance. Although multiple algorithms were proposed in recent years for mining closed association rules most of them underperform in terms of run time or memory. Another issue that remains challenging is the nature of the dataset. While some of the existing algorithms perform well on dense datasets others perform well on sparse datasets. This paper aims to handle these drawbacks by using a genetic algorithm for mining closed association rules. Recent studies have shown that genetic algorithms perform better than conventional algorithms due to their bitwise operations of crossover and mutation. Bitwise operations are predominantly faster than conventional approaches and bits consume lesser memory thereby improving the overall performance of the algorithm. To address the redundancy in the mined association rules a tree-based pruning algorithm has been designed here. This works on the principle of minimal antecedent and maximal consequent. Experiments have shown that the proposed approach works well on both dense and sparse datasets while surpassing existing techniques with regard to run time and memory

ZENODO

Institute of Advanced Engineering and Science

A Survey on Index Support for Item Set Mining

Author: Dr.P K Singhal
Senthil Prakash.T
Publication venue: Global Journals Inc. (US)
Publication date: 30/06/2011
Field of study

It is very difficult to handle the huge amount of information stored in modern databases. To manage with these databases association rule mining is currently used, which is a costly process that involves a significant amount of time and memory. Therefore, it is necessary to develop an approach to overcome these difficulties. A suitable data structures and algorithms must be developed to effectively perform the item set mining. An index includes all necessary characteristics potentially needed during the mining task; the extraction can be executed with the help of the index, without accessing the database. A database index is a data structure that enhances the speed of information retrieval operations on a database table at very low cost and increased storage space. The use index permits user interaction, in which the user can specify different attributes for item set extraction. Therefore, the extraction can be completed with the use index and without accessing the original database. Index also supports for reusing concept to mine item sets with the use of any support threshold. This paper also focuses on the survey of index support for item set mining which are proposed by various authors

Global Journal of Computer Science and Technology (GJCST)

Mining Frequent Item sets in Data Streams

Author: Dass Rajanish
Publication venue
Publication date
Field of study

Research Papers in Economics

An efficient parallel method for mining frequent closed sequential patterns

Author: Huynh Bao
Snášel Václav
Vo Bay
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

Mining frequent closed sequential pattern (FCSPs) has attracted a great deal of research attention, because it is an important task in sequences mining. In recently, many studies have focused on mining frequent closed sequential patterns because, such patterns have proved to be more efficient and compact than frequent sequential patterns. Information can be fully extracted from frequent closed sequential patterns. In this paper, we propose an efficient parallel approach called parallel dynamic bit vector frequent closed sequential patterns (pDBV-FCSP) using multi-core processor architecture for mining FCSPs from large databases. The pDBV-FCSP divides the search space to reduce the required storage space and performs closure checking of prefix sequences early to reduce execution time for mining frequent closed sequential patterns. This approach overcomes the problems of parallel mining such as overhead of communication, synchronization, and data replication. It also solves the load balance issues of the workload between the processors with a dynamic mechanism that re-distributes the work, when some processes are out of work to minimize the idle CPU time.Web of Science5174021739

Crossref

DSpace at VSB Technical University of Ostrava

Pinpoint Clustering of Web Pages and Mining Implicit Crossover Concepts

Author: Makoto Haraguchi
Yoshiaki Okubo
Publication venue: 'IntechOpen'
Publication date: 01/03/2010
Field of study

IntechOpen

Crossref