
    Adaptive firefly algorithm for hierarchical text clustering

    Text clustering is used by search engines to increase recall and precision in information retrieval. Because search engines operate on Internet content that is constantly being updated, a clustering algorithm is needed that groups items automatically, without prior knowledge of the collection. Existing clustering methods have difficulty determining the optimal number of clusters and producing compact clusters. In this research, an adaptive hierarchical text clustering algorithm based on the Firefly Algorithm is proposed. The proposed Adaptive Firefly Algorithm (AFA) consists of three components: document clustering, cluster refining, and cluster merging. The first component introduces the Weight-based Firefly Algorithm (WFA), which automatically identifies initial centers and their clusters for any given text collection. To refine the obtained clusters, a second algorithm, termed the Weight-based Firefly Algorithm with Relocate (WFAR), is proposed; it allows a pre-assigned document to be relocated into a newly created cluster. The third component, the Weight-based Firefly Algorithm with Relocate and Merging (WFARM), reduces the number of produced clusters by merging non-pure clusters into pure ones. Experiments were conducted to compare the proposed algorithms against seven existing methods. AFA obtained the optimal number of clusters in 100% of cases, with purity and F-measure of 83%, higher than the benchmarked methods. For the entropy measure, AFA produced the lowest value (0.78) compared to the existing methods. These results indicate that the Adaptive Firefly Algorithm can produce compact clusters. This research contributes to the text mining domain, as hierarchical text clustering facilitates document indexing and information retrieval.
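    As a rough illustration of the firefly-style search underlying this family of methods, the sketch below positions k cluster centroids over dense TF-IDF document vectors using the standard attractiveness update beta = beta0 * exp(-gamma * r^2). It is a minimal, generic sketch, not the paper's method: the function and parameter names are hypothetical, the number of clusters is fixed for brevity, and the weight-based center selection, relocation, and merging steps (WFA/WFAR/WFARM) are not reproduced.

```python
# Illustrative sketch only: generic firefly-style search for k cluster centroids
# over a dense TF-IDF matrix. Names and parameters are hypothetical assumptions.
import numpy as np

def firefly_cluster(docs_tfidf, k, n_fireflies=20, n_iter=50,
                    beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    rng = np.random.default_rng(seed)
    n_docs, _ = docs_tfidf.shape
    # Each firefly encodes k candidate centroids, seeded from random documents.
    fireflies = np.stack([docs_tfidf[rng.choice(n_docs, k, replace=False)]
                          for _ in range(n_fireflies)]).astype(float)

    def brightness(centroids):
        # Brighter firefly = lower total distance of documents to nearest centroid.
        d = np.linalg.norm(docs_tfidf[:, None, :] - centroids[None, :, :], axis=2)
        return -d.min(axis=1).sum()

    light = np.array([brightness(f) for f in fireflies])
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] > light[i]:          # move i toward the brighter j
                    r = np.linalg.norm(fireflies[i] - fireflies[j])
                    beta = beta0 * np.exp(-gamma * r ** 2)
                    fireflies[i] += (beta * (fireflies[j] - fireflies[i])
                                     + alpha * (rng.random(fireflies[i].shape) - 0.5))
                    light[i] = brightness(fireflies[i])
    best = fireflies[light.argmax()]
    labels = np.linalg.norm(docs_tfidf[:, None, :] - best[None, :, :],
                            axis=2).argmin(axis=1)
    return best, labels
```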

    Applying Clustering Techniques in Hybrid Network in the Presence of 2D and 3D Obstacles

    Clustering spatial data is a well-known problem that has been extensively studied. In the real world, there are many physical obstacles, such as rivers, lakes, highways, and mountains, whose presence may substantially affect the clustering result. Although many methods have been proposed in previous work, very few have considered physical obstacles and interlinking bridges. Taking these constraints into account during the clustering process is costly, yet modeling them is paramount for good performance. Owing to saturation in existing telephone networks and the ever-increasing demand for wired and wireless services, telecommunication engineers are looking at technologies that can deliver sites and satisfy demand and level-of-service constraints in areas with and without obstacles. In this paper, we study the problem of clustering in the presence of obstacles to solve the network planning problem. We modified the NetPlan algorithm and developed the COD-NETPLAN (Clustering with Obstructed Distance -- Network Planning) algorithm to handle 2D and 3D obstacles, and studied the problem of determining the location of the multi-service access node in an area with many mountains and rivers. We used a reachability matrix to detect 2D obstacles, and line segment intersection together with geographical information system techniques for 3D obstacles. Experimental results and the subsequent analysis indicate that the COD-NETPLAN algorithm is both efficient and effective.
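    The obstructed-distance idea can be illustrated with a small sketch: treat the distance between a demand point and a candidate site as infinite whenever the straight path between them crosses an obstacle edge, detected with a 2D line segment intersection test. All names here are hypothetical, and the paper's reachability matrix, 3D obstacle handling, and full COD-NETPLAN procedure are not reproduced.

```python
# Illustrative sketch only: obstructed distance via 2D segment intersection.
# Names are assumptions; collinear/touching edge cases are ignored for brevity.
import math

def _ccw(a, b, c):
    # Signed area test: >0 if a->b->c turns counter-clockwise.
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_intersect(p1, p2, q1, q2):
    # Proper intersection test between segments p1-p2 and q1-q2.
    d1, d2 = _ccw(q1, q2, p1), _ccw(q1, q2, p2)
    d3, d4 = _ccw(p1, p2, q1), _ccw(p1, p2, q2)
    return (d1 * d2 < 0) and (d3 * d4 < 0)

def obstructed_distance(point, center, obstacle_edges):
    """Euclidean distance if the line of sight is clear, else infinity,
    so blocked candidate sites are never chosen as the nearest center."""
    for e1, e2 in obstacle_edges:
        if segments_intersect(point, center, e1, e2):
            return math.inf
    return math.dist(point, center)

# Example: a river modelled as a single edge blocking the direct path.
river = [((0.0, -5.0), (0.0, 5.0))]
print(obstructed_distance((-2.0, 0.0), (3.0, 1.0), river))   # inf (path blocked)
print(obstructed_distance((-2.0, 0.0), (-1.0, 4.0), river))  # ~4.12 (path clear)
```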

    Scaling Ant Colony Optimization with Hierarchical Reinforcement Learning Partitioning

    This paper merges hierarchical reinforcement learning (HRL) with ant colony optimization (ACO) to produce an HRL ACO algorithm capable of generating solutions for large domains. The paper describes two specific implementations of the new algorithm: the first a modification of Dietterich’s MAXQ-Q HRL algorithm, the second a hierarchical ant colony system algorithm. These implementations generate faster results, with little to no significant change in the quality of solutions for the tested problem domains. The application of ACO to the MAXQ-Q algorithm replaces the reinforcement learning component, Q-learning, with the modified ant colony optimization method Ant-Q. The resulting algorithm, MAXQ-AntQ, converges to solutions not significantly different from MAXQ-Q in 88% of the time. The paper then transfers HRL techniques to the ACO domain and the traveling salesman problem (TSP). To apply HRL to ACO, a hierarchy must be created for the TSP: a data clustering algorithm creates the subtasks, and an ACO algorithm solves the individual and complete problems. Two clustering algorithms are tested, k-means and G-means. The results demonstrate that the algorithm with data clustering produces solutions 20 times faster, with a 5-10% decrease in solution quality due to the effects of clustering.
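    A minimal sketch of the clustering-then-ACO idea follows: cities are partitioned with k-means (standing in for the paper's k-means/G-means step), each cluster's sub-tour is built with a basic ant colony routine, and sub-tours are concatenated in the order given by a tour over the cluster centroids. It assumes NumPy and scikit-learn are available; the parameters, the stitching of sub-tours, and the use of plain ACO in place of the paper's Ant-Q within the MAXQ hierarchy are all simplifications, not the authors' algorithm.

```python
# Illustrative sketch only: k-means partitioning of a TSP instance, with a basic
# ant colony routine per sub-problem. Names and parameters are assumptions.
import numpy as np
from sklearn.cluster import KMeans

def aco_tour(cities, n_ants=10, n_iter=50, alpha=1.0, beta=2.0, rho=0.5, seed=0):
    """Return a tour (list of local city indices) for a small set of cities."""
    rng = np.random.default_rng(seed)
    n = len(cities)
    if n <= 2:
        return list(range(n))
    dist = np.linalg.norm(cities[:, None] - cities[None, :], axis=2) + np.eye(n)
    tau = np.ones((n, n))          # pheromone levels
    eta = 1.0 / dist               # heuristic desirability (inverse distance)
    best_tour, best_len = None, np.inf
    for _ in range(n_iter):
        for _ant in range(n_ants):
            start = int(rng.integers(n))
            tour, unvisited = [start], set(range(n)) - {start}
            while unvisited:
                cur = tour[-1]
                cand = np.array(sorted(unvisited))
                w = (tau[cur, cand] ** alpha) * (eta[cur, cand] ** beta)
                nxt = int(rng.choice(cand, p=w / w.sum()))
                tour.append(nxt)
                unvisited.remove(nxt)
            length = sum(dist[tour[i], tour[(i + 1) % n]] for i in range(n))
            if length < best_len:
                best_tour, best_len = tour, length
        tau *= (1 - rho)                         # evaporation
        for i in range(n):                       # reinforce the best tour found
            tau[best_tour[i], best_tour[(i + 1) % n]] += 1.0 / best_len
    return best_tour

def hierarchical_tsp(cities, k=4, seed=0):
    """Cluster cities, order clusters by a centroid tour, concatenate sub-tours."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(cities)
    centroids = np.array([cities[labels == c].mean(axis=0) for c in range(k)])
    route = []
    for c in aco_tour(centroids, seed=seed):
        idx = np.where(labels == c)[0]
        route.extend(idx[i] for i in aco_tour(cities[idx], seed=seed))
    return route
```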

    Scaling Ant Colony Optimization with Hierarchical Reinforcement Learning Partitioning

    This research merges the hierarchical reinforcement learning (HRL) domain and the ant colony optimization (ACO) domain. The merger produces an HRL ACO algorithm capable of generating solutions for both domains. The research also provides two specific implementations of the new algorithm: the first a modification of Dietterich's MAXQ-Q HRL algorithm, the second a hierarchical ACO algorithm. These implementations generate faster results, with little to no significant change in the quality of solutions for the tested problem domains. The application of ACO to the MAXQ-Q algorithm replaces the reinforcement learning methods, Q-learning and SARSA, with the modified ant colony optimization method Ant-Q. The resulting algorithm, MAXQ-AntQ, converges to solutions not significantly different from MAXQ-Q in 88% of the time. The research then transfers HRL techniques to the ACO domain and the traveling salesman problem (TSP). To apply HRL to ACO, a hierarchy must be created for the TSP: a data clustering algorithm creates the subtasks, and an ACO algorithm solves the individual and complete problems. Two clustering algorithms are tested, k-means and G-means. The results demonstrate that the algorithm with data clustering produces solutions 85-95% faster, but with a 5-10% decrease in solution quality.