454 research outputs found

    Adaptive firefly algorithm for hierarchical text clustering

    Search engines use text clustering to improve recall and precision in information retrieval. Because search engines operate on Internet content that is constantly updated, they need a clustering algorithm that groups items automatically without prior knowledge of the collection. Existing clustering methods have difficulty determining the optimal number of clusters and producing compact clusters. This research proposes an adaptive hierarchical text clustering algorithm based on the Firefly Algorithm. The proposed Adaptive Firefly Algorithm (AFA) consists of three components: document clustering, cluster refining, and cluster merging. The first component introduces the Weight-based Firefly Algorithm (WFA), which automatically identifies initial centers and their clusters for any given text collection. To refine the obtained clusters, a second algorithm, termed Weight-based Firefly Algorithm with Relocate (WFAR), is proposed; it allows a pre-assigned document to be relocated into a newly created cluster. The third component, Weight-based Firefly Algorithm with Relocate and Merging (WFARM), aims to reduce the number of produced clusters by merging non-pure clusters into pure ones. Experiments were conducted to compare the proposed algorithms against seven existing methods. The percentage of success in obtaining the optimal number of clusters by AFA is 100%, with purity and F-measure of 83% higher than the benchmarked methods. For the entropy measure, AFA produced the lowest value (0.78) among the compared methods. The results indicate that the Adaptive Firefly Algorithm can produce compact clusters. This research contributes to the text mining domain, as hierarchical text clustering facilitates document indexing and information retrieval.
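    The abstract above reports purity and entropy for the resulting clusters. As a rough illustration of what those two measures capture, here is a minimal sketch using their common textbook definitions. The function name `purity_and_entropy`, the base-2 logarithm, and the size weighting are assumptions for illustration; the paper may define or normalize them differently.

```python
import math
from collections import Counter


def purity_and_entropy(cluster_ids, class_labels):
    """Return (purity, entropy) of a clustering against known class labels.

    Hypothetical helper: textbook definitions, not taken from the paper.
    """
    n = len(class_labels)
    clusters = {}
    for cid, label in zip(cluster_ids, class_labels):
        clusters.setdefault(cid, []).append(label)

    purity = 0.0
    entropy = 0.0
    for members in clusters.values():
        counts = Counter(members)
        size = len(members)
        # Purity: share of all documents that sit in their cluster's majority class.
        purity += max(counts.values()) / n
        # Entropy: Shannon entropy (base 2) of the class mix, weighted by cluster size.
        h = -sum((c / size) * math.log2(c / size) for c in counts.values())
        entropy += (size / n) * h
    return purity, entropy
```

    Higher purity and lower entropy both indicate clusters dominated by a single class, which matches the direction of the comparison in the abstract.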

    A Comprehensive Review of Recent Variants and Modifications of Firefly Algorithm

    Swarm intelligence (SI) is an emerging field of biologically inspired artificial intelligence based on the behavioral models of social insects such as ants, bees, wasps, and termites. It deals with natural and artificial systems composed of many individuals that coordinate using decentralized control and self-organization. Most SI algorithms were developed to address stationary optimization problems, and hence they can converge on the (near-)optimum solution efficiently. However, many real-world problems have a dynamic environment that changes over time. In the last two decades, there has been growing interest in addressing Dynamic Optimization Problems with SI algorithms because of their adaptation capabilities. This paper presents a broad review of two SI algorithms: 1) the Firefly Algorithm (FA) and 2) the Flower Pollination Algorithm (FPA). FA is inspired by the bioluminescence of fireflies. FPA is inspired by the pollination behavior of flowering plants. The article aims to give a detailed analysis of the variants of FA and FPA developed to date through parameter adaptation, modification, and hybridization. It also addresses the applications of these algorithms in various fields. In addition, the literature shows that, in most cases, FA and FPA outperformed other metaheuristic algorithms.
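    For readers unfamiliar with FA, the sketch below shows one iteration of the commonly cited basic update, in which each firefly moves toward every brighter one with an attractiveness that decays with squared distance, plus a small random perturbation. It is an illustration only and is not taken from the reviewed paper; the function name `firefly_step` and the default values of `beta0`, `gamma`, and `alpha` are assumptions.

```python
import math
import random


def firefly_step(positions, brightness, beta0=1.0, gamma=1.0, alpha=0.1, rng=None):
    """One iteration of a basic firefly update (illustrative sketch).

    positions:  list of coordinate lists, one per firefly
    brightness: list of objective values, higher means brighter
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    new_positions = [list(p) for p in positions]
    for i in range(len(positions)):
        for j, xj in enumerate(positions):
            if brightness[j] > brightness[i]:
                xi = new_positions[i]
                # Squared Euclidean distance between firefly i and the brighter firefly j.
                r2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
                # Attractiveness decays exponentially with squared distance.
                beta = beta0 * math.exp(-gamma * r2)
                new_positions[i] = [
                    a + beta * (b - a) + alpha * (rng.random() - 0.5)
                    for a, b in zip(xi, xj)
                ]
    return new_positions
```

    Many of the variants surveyed in reviews of this kind adapt the parameters that appear here (the step size `alpha`, the absorption coefficient `gamma`) or hybridize the move with another search method.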

    Document clustering based on firefly algorithm

    Document clustering is widely used in Information Retrieval; however, existing clustering techniques suffer from the local optima problem when determining the number of clusters k. Various efforts have been made to address this drawback, including the use of swarm-based algorithms such as Particle Swarm Optimization (PSO) and Ant Colony Optimization. This study explores the adaptation of another swarm algorithm, the Firefly Algorithm (FA), to text clustering. We present two variants of FA: Weight-based Firefly Algorithm (WFA) and Weight-based Firefly Algorithm II (WFAII). The difference between the two is that WFAII includes a more restrictive condition for determining the members of a cluster. The proposed FA methods are evaluated using the 20Newsgroups dataset. Experimental results on the clustering quality of the two FA variants are presented and compared against those produced by PSO, K-means, and the hybrid of FA and K-means (FA-Kmeans). The results demonstrate that WFAII outperformed WFA, PSO, K-means, and FA-Kmeans. This indicates that better clustering can be obtained once the exploitation of a search solution is improved.

    Comparative Analysis of Privacy Preservation Mechanism: Assessing Trustworthy Cloud Services with a Hybrid Framework and Swarm Intelligence

    Cloud computing has emerged as a prominent field in modern computational technology, offering diverse services and resources. However, it has also raised pressing concerns regarding data privacy and the trustworthiness of cloud service providers. Previous works have grappled with these challenges, but many have fallen short in providing comprehensive solutions. In this context, this research proposes a novel framework designed to address the issues of maintaining data privacy and fostering trust in cloud computing services. The primary objective of this work is to develop a robust and integrated solution that safeguards sensitive data and enhances trust in cloud service providers. The proposed architecture encompasses a series of key components, including data collection and preprocessing with k-anonymity, trust generation using the Firefly Algorithm, Ant Colony Optimization for task scheduling and resource allocation, hybrid framework integration, and privacy-preserving computation. The scientific contribution of this work lies in the integration of multiple optimization techniques, such as the Firefly Algorithm and Ant Colony Optimization, to select reliable cloud service providers while considering trust factors and task/resource allocation. Furthermore, the proposed framework ensures data privacy through k-anonymity compliance, dynamic resource allocation, and privacy-preserving computation techniques such as differential privacy and homomorphic encryption. The outcomes of this research provide a comprehensive solution to the complex challenges of data privacy and trust in cloud computing services. By combining these techniques into a hybrid framework, this work contributes to the advancement of secure and effective cloud-based operations, offering a substantial step forward in addressing the critical issues faced by organizations and individuals in an increasingly interconnected digital landscape
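    The framework above lists k-anonymity compliance among its components. As a minimal sketch of what such a compliance check involves, the function below tests whether every combination of quasi-identifier values occurs in at least k records. The function name `is_k_anonymous` and the record layout (a list of dicts) are hypothetical and are not taken from the paper.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if each quasi-identifier combination appears in >= k records.

    Hypothetical helper: records is a list of dicts, quasi_identifiers a list of keys.
    """
    # Count how many records share each combination of quasi-identifier values.
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())
```

    A dataset that fails the check would typically be generalized or suppressed further before release; that step is outside the scope of this sketch.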

    Improved Firefly Algorithm with Variable Neighborhood Search for Data Clustering

    Among the metaheuristic algorithms, population-based algorithms are explorative search algorithms that are superior to local search algorithms in exploring the search space to find globally optimal solutions. However, their primary downside is low exploitative capability, which prevents the expansion of the search space neighborhood toward more optimal solutions. The firefly algorithm (FA) is a population-based algorithm that has been widely used in clustering problems. However, FA is limited by premature convergence when no neighborhood search strategies are employed to improve the quality of clustering solutions in the neighborhood region and to explore the global regions of the search space. On this basis, this work aims to improve FA by using variable neighborhood search (VNS) as a local search method (FA-VNS), so that VNS contributes a trade-off between exploration and exploitation. The proposed FA-VNS allows fireflies to enhance the clustering solutions while maintaining their diversity during the search process, using the perturbation operators of VNS. To evaluate the performance of the algorithm, eight benchmark datasets are used together with four well-known clustering algorithms. The comparison according to internal and external evaluation metrics indicates that the proposed FA-VNS can produce more compact clustering solutions than the well-known clustering algorithms.

    Churn Identification and Prediction from a Large-Scale Telecommunication Dataset Using NLP

    Identifying customer churn is a major issue for large telecom businesses. Managing the data of current customers, and acquiring and managing new ones, generates a substantial volume of data every day. It is therefore crucial to identify the causes of customer churn so that appropriate steps can be taken to lower it. Numerous researchers have discussed combining static and dynamic approaches to reduce churn in big data sets, but these systems still have many issues when it comes to actually identifying churn. In this paper, we propose two methods: the first is churn identification, and the second uses Natural Language Processing (NLP) methods and machine learning techniques to make predictions based on a vast telecommunication data set. The NLP process involves data pre-processing, normalization, feature extraction, and feature selection. For feature extraction, we employ techniques such as TF-IDF, Stanford NLP, and occurrence correlation methods. A machine learning classification algorithm is used throughout for training and testing. Finally, the system employs a variety of cross-validation techniques to train and evaluate machine learning algorithms. The experimental analysis shows the system's efficacy and accuracy.
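    TF-IDF is one of the feature extraction techniques named above. The sketch below computes a plain TF-IDF weighting over whitespace-tokenized text, assuming relative term frequency and an unsmoothed natural-log inverse document frequency. The function name `tf_idf` and these weighting choices are assumptions; the paper's exact variant is not stated in the abstract.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document (illustrative TF-IDF variant)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {
                # Relative term frequency times unsmoothed inverse document frequency.
                term: (count / len(tokens)) * math.log(n_docs / df[term])
                for term, count in tf.items()
            }
        )
    return weights
```

    With this variant, a term that occurs in every document receives weight zero, so only terms that distinguish one customer record from another contribute to the feature vector.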

    Advances in Meta-Heuristic Optimization Algorithms in Big Data Text Clustering

    This paper presents a comprehensive survey of meta-heuristic optimization algorithms in text clustering applications and highlights their main procedures. These Artificial Intelligence (AI) algorithms are recognized as promising swarm intelligence methods because of their success in solving machine learning problems, especially text clustering problems. The paper reviews the relevant literature on meta-heuristic-based text clustering applications, including many variants, such as basic, modified, hybridized, and multi-objective methods. The main procedures of text clustering are also presented, along with critical discussions. The review reports the advantages and disadvantages of these methods and recommends potential future research paths. The main keywords considered in this paper are text, clustering, meta-heuristic, optimization, and algorithm.

    A Review of the Family of Artificial Fish Swarm Algorithms: Recent Advances and Applications

    The Artificial Fish Swarm Algorithm (AFSA) is inspired by the ecological behaviors of fish schooling in nature, viz., the preying, swarming, following, and random behaviors. Owing to a number of salient properties, which include flexibility, fast convergence, and insensitivity to the initial parameter settings, the family of AFSA has emerged as an effective Swarm Intelligence (SI) methodology that has been widely applied to solve real-world optimization problems. Since its introduction in 2002, many improved and hybrid AFSA models have been developed to tackle continuous, binary, and combinatorial optimization problems. This paper aims to present a concise review of the family of AFSA, encompassing the original AFSA and its improvements, continuous, binary, discrete, and hybrid models, as well as the associated applications. A comprehensive survey on the AFSA from its introduction to 2012 can be found in [1]. As such, we focus on a total of 123 articles published in high-quality journals since 2013. We also discuss possible AFSA enhancements and highlight future research directions for the family of AFSA-based models. Comment: 37 pages, 3 figures