774 research outputs found

    A SURVEY ON ANT COLONY OPTIMIZATION ALGORITHM

    A novel Ant Colony Optimization (ACO) algorithm is proposed for the hierarchical multi-label classification problem of protein function prediction. This kind of problem arises mainly in bioinformatics, given the large increase in the number of uncharacterized proteins available for analysis and the importance of determining their functions in order to improve current biological knowledge. Because a protein can perform more than one function and many protein functional-definition schemes are organized in a hierarchical structure, the classification problem in this case is an instance of a hierarchical multi-label problem. In this kind of classification, each example might have multiple class labels, and class labels are arranged in a hierarchy, either a tree or a directed acyclic graph (DAG). This is a more difficult problem than conventional flat classification, because the classification algorithm has to take into account hierarchical relationships between class labels and be able to predict multiple class labels for the same example. The proposed ACO algorithm discovers an ordered list of hierarchical multi-label classification rules.
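
The hierarchical constraint described in this abstract can be illustrated with a small sketch: when a label is predicted, every ancestor of that label in the hierarchy is implied as well. The toy hierarchy `PARENTS` and the helper `with_ancestors` are hypothetical names chosen for illustration; they are not taken from the surveyed algorithm.

```python
from typing import Dict, List, Set

# Hypothetical toy hierarchy: each class label maps to its parent labels.
# A label with two parents makes this a DAG rather than a tree.
PARENTS: Dict[str, List[str]] = {
    "metabolism": [],
    "transport": [],
    "amino-acid metabolism": ["metabolism"],
    "amino-acid transport": ["transport", "amino-acid metabolism"],
}


def with_ancestors(labels: Set[str], parents: Dict[str, List[str]]) -> Set[str]:
    """Return the given labels plus every ancestor reachable through `parents`."""
    closed: Set[str] = set()
    stack = list(labels)
    while stack:
        label = stack.pop()
        if label in closed:
            continue  # already visited through another path in the DAG
        closed.add(label)
        stack.extend(parents[label])
    return closed


predicted = with_ancestors({"amino-acid transport"}, PARENTS)
print(sorted(predicted))
```

A flat classifier would treat the four labels as unrelated; here a single prediction at the bottom of the hierarchy yields a consistent multi-label set.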

    Attribute Selection Algorithm with Clustering based Optimization Approach based on Mean and Similarity Distance

    With hundreds or thousands of attributes in high-dimensional data, the computational workload is challenging. Attributes that have no meaningful influence on class predictions increase the computing load throughout the classification process. This article's goal is to use attribute selection to reduce the size of high-dimensional data and so lessen the computational load. The approach considers selected attribute subsets that cover all attributes. As a result, there are two stages to the process: filtering out superfluous information, and settling on a single attribute to stand in for a group of similar but otherwise meaningless characteristics. Numerous studies on attribute selection, including backward and forward selection, have been undertaken. The experiments and the accuracy of the classification results support attribute selection based on k-means and PSO clustering. Related attributes are likely to be present in the same cluster, while irrelevant attributes are not identified in any cluster. Datasets for Credit Approval, Ionosphere, Annealing, Madelon, Isolet, and Multiple Attributes are employed alongside two other high-dimensional datasets. Both databases include the class label for each data point. Our test demonstrates that attribute selection using k-means clustering can provide a subset of characteristics, and that doing so produces classification outcomes with an accuracy above 80%.
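
As a rough illustration of the second stage (one attribute standing in for a group of similar attributes), the sketch below clusters attribute columns with plain k-means and keeps the column nearest to each cluster mean. It is a simplified stand-in, not the article's method: the PSO component is omitted, squared Euclidean distance is assumed as the similarity measure, and the function names and toy data are hypothetical.

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def kmeans(points: List[List[float]], k: int, iterations: int = 20) -> List[int]:
    """Plain k-means; returns a cluster index for every point."""
    centers = [list(p) for p in points[:k]]  # deterministic init: first k points
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = mean_vector(members)
    return assignment


def select_representatives(rows: List[List[float]], k: int) -> List[int]:
    """Cluster attribute columns; keep the column nearest each cluster mean."""
    columns = [list(col) for col in zip(*rows)]
    assignment = kmeans(columns, k)
    chosen = []
    for c in range(k):
        members = [i for i, a in enumerate(assignment) if a == c]
        if not members:
            continue
        center = mean_vector([columns[i] for i in members])
        chosen.append(min(members, key=lambda i: sq_dist(columns[i], center)))
    return sorted(chosen)


# Toy data: attributes 0 and 1 are near-duplicates, as are attributes 2 and 3.
rows = [
    [1.0, 1.1, 9.0, 9.2],
    [2.0, 2.1, 7.0, 7.1],
    [3.0, 2.9, 8.0, 7.9],
    [4.0, 4.2, 6.0, 6.1],
]
selected = select_representatives(rows, k=2)
print(selected)
```

In practice the attributes would need to be scaled to comparable ranges before clustering, since Euclidean distance is sensitive to units.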

    User centered neuro-fuzzy energy management through semantic-based optimization

    This paper presents a cloud-based building energy management system, underpinned by semantic middleware, that integrates an enhanced sensor network with advanced analytics, accessible through an intuitive Web-based user interface. The proposed solution is described in terms of its three key layers: 1) user interface; 2) intelligence; and 3) interoperability. The system’s intelligence is derived from simulation-based optimized rules, historical sensor data mining, and a fuzzy reasoner. The solution enables interoperability through a semantic knowledge base, which also contributes intelligence through reasoning and inference abilities that are enhanced through intelligent rules. Finally, building energy performance monitoring is delivered alongside optimized rule suggestions and a negotiation process in a 3-D Web-based interface using WebGL. The solution has been validated in a real pilot building to illustrate the strength of the approach, where it has shown over 25% energy savings. The relevance of this paper in the field is discussed, and it is argued that the proposed solution is mature enough for testing across further buildings.

    Adaptive firefly algorithm for hierarchical text clustering

    Text clustering is used by search engines to increase recall and precision in information retrieval. As search engines operate on Internet content that is constantly being updated, there is a need for a clustering algorithm that groups items automatically without prior knowledge of the collection. Existing clustering methods have problems in determining the optimal number of clusters and in producing compact clusters. In this research, an adaptive hierarchical text clustering algorithm is proposed based on the Firefly Algorithm. The proposed Adaptive Firefly Algorithm (AFA) consists of three components: document clustering, cluster refining, and cluster merging. The first component introduces the Weight-based Firefly Algorithm (WFA), which automatically identifies initial centers and their clusters for any given text collection. To refine the obtained clusters, a second algorithm, termed Weight-based Firefly Algorithm with Relocate (WFAR), is proposed. This approach allows a pre-assigned document to be relocated into a newly created cluster. The third component, Weight-based Firefly Algorithm with Relocate and Merging (WFARM), aims to reduce the number of produced clusters by merging non-pure clusters into the pure ones. Experiments were conducted to compare the proposed algorithms against seven existing methods. The percentage of success in obtaining the optimal number of clusters by AFA is 100% with purity and f-measure of 83% higher than the benchmarked methods. As for the entropy measure, the AFA produced the lowest value (0.78) when compared to existing methods. The results indicate that the Adaptive Firefly Algorithm can produce compact clusters. This research contributes to the text mining domain, as hierarchical text clustering facilitates the indexing of documents and information retrieval processes.
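
The purity and entropy measures mentioned in this abstract can be computed as in the sketch below. This uses one common textbook definition (size-weighted entropy with base-2 logarithms); the abstract does not state which normalization the authors used, so the values are not comparable to the reported 0.78. The toy clusters are hypothetical.

```python
import math
from collections import Counter
from typing import List


def purity(clusters: List[List[str]]) -> float:
    """Fraction of documents that carry the majority class of their cluster."""
    total = sum(len(c) for c in clusters)
    majority = sum(max(Counter(c).values()) for c in clusters if c)
    return majority / total


def entropy(clusters: List[List[str]]) -> float:
    """Size-weighted average of per-cluster class entropy (base 2)."""
    total = sum(len(c) for c in clusters)
    result = 0.0
    for c in clusters:
        if not c:
            continue
        h = -sum(
            (n / len(c)) * math.log2(n / len(c)) for n in Counter(c).values()
        )
        result += (len(c) / total) * h
    return result


# Each inner list holds the true class of every document in one cluster.
clusters = [
    ["sport", "sport", "sport"],
    ["politics", "politics", "sport"],
]
print(purity(clusters), entropy(clusters))
```

Higher purity and lower entropy both indicate clusters that are dominated by a single class.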

    QAPgrid: A Two Level QAP-Based Approach for Large-Scale Data Analysis and Visualization

    Background: The visualization of large volumes of data is a computationally challenging task that often promises rewarding new insights. There is great potential in the application of new algorithms and models from combinatorial optimisation. Datasets often contain “hidden regularities”, and a combined identification and visualization method should reveal these structures and present them in a way that helps analysis. While several methodologies exist, including those that use non-linear optimization algorithms, severe limitations exist even when working with only a few hundred objects. Methodology/Principal Findings: We present a new data visualization approach (QAPgrid) that reveals patterns of similarities and differences in large datasets of objects for which a similarity measure can be computed. Objects are assigned to positions on an underlying square grid in a two-dimensional space. We use the Quadratic Assignment Problem (QAP) as a mathematical model to provide an objective function for the assignment of objects to positions on the grid. We employ a Memetic Algorithm (a powerful metaheuristic) to tackle the large instances of this NP-hard combinatorial optimization problem, and we show its performance on the visualization of real data sets. Conclusions/Significance: Overall, the results show that the QAPgrid algorithm is able to produce a layout that represents the relationships between objects in the data set. Furthermore, it also represents the relationships between the clusters that are fed into the algorithm. We apply QAPgrid to the 84 Indo-European languages instance, producing a near-optimal layout. Next, we produce a layout of 470 world universities with an observed high degree of correlation with the score used by the Academic Ranking of World Universities compiled by Shanghai Jiao Tong University, without the need for an ad hoc weighting of attributes. Finally, our Gene Ontology-based study on Saccharomyces cerevisiae fully demonstrates the scalability and precision of our method as a novel alternative tool for functional genomics.
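
To make the QAP objective concrete, the sketch below scores an assignment of objects to cells of a square grid as the sum, over all object pairs, of pairwise similarity times grid distance. Several choices are assumptions made for illustration: the Manhattan distance between cells, the toy similarity matrix, and the exhaustive search, which replaces the Memetic Algorithm named in the abstract and is only feasible for a handful of objects.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def grid_positions(side: int) -> List[Tuple[int, int]]:
    """All (row, column) cells of a side x side grid, in row-major order."""
    return [(r, c) for r in range(side) for c in range(side)]


def qap_cost(
    similarity: Sequence[Sequence[float]],
    assignment: Sequence[int],
    positions: Sequence[Tuple[int, int]],
) -> float:
    """Sum of similarity[i][j] * Manhattan distance between assigned cells."""
    cost = 0.0
    n = len(assignment)
    for i in range(n):
        for j in range(n):
            r1, c1 = positions[assignment[i]]
            r2, c2 = positions[assignment[j]]
            cost += similarity[i][j] * (abs(r1 - r2) + abs(c1 - c2))
    return cost


def best_layout(similarity: Sequence[Sequence[float]], side: int) -> Tuple[int, ...]:
    """Exhaustive search over assignments; only viable for tiny instances."""
    positions = grid_positions(side)
    n = len(similarity)
    return min(
        permutations(range(len(positions)), n),
        key=lambda a: qap_cost(similarity, a, positions),
    )


# Toy instance: objects 0 and 1 are similar, as are objects 2 and 3.
similarity = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
layout = best_layout(similarity, side=2)
print(layout, qap_cost(similarity, layout, grid_positions(2)))
```

Minimizing this cost pulls similar objects into neighbouring cells, which is the property the layout relies on.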

    The Dynamic Load Balancing Method On Game Theory For Distributed Systems

    The load balancing model is aimed at the public cloud, which has several nodes with computing resources scattered across different geographic locations. The model therefore divides the public cloud into several cloud partitions; when the environment is very large and complex, these divisions simplify the load balancing. The cloud has a main controller that chooses suitable partitions for arriving jobs, while the balancer for each cloud partition chooses the best load balancing strategy. Static schemes do not use system information and are less complex, while dynamic schemes bring additional costs for the system but can change as the system status changes. The model uses the main controller and the balancers to gather and analyse the information.

    A PARTIAL REPLICATION LOAD BALANCING TECHNIQUE FOR DISTRIBUTED DATA AS A SERVICE ON THE CLOUD

    Data as a service (DaaS) is an important model on the Cloud, as DaaS provides clients with different types of large files and data sets in fields like finance, science, health, geography, astronomy, and many others. This includes all types of files with sizes varying from a few kilobytes to hundreds of terabytes. DaaS can be implemented and provided using multiple data centers located at different locations and usually connected via the Internet. When data is provided using multiple data centers, it is referred to as distributed DaaS. DaaS providers must ensure that their services are fast, reliable, and efficient. However, these requirements must be met while considering the associated cost, which will be carried by the DaaS provider and most likely by the users as well. One traditional approach to supporting a large number of clients is to replicate the services on different servers. However, this requires full replication of all stored data sets, which requires a huge amount of storage. The huge storage consumption will result in increased costs. Therefore, the aim of this research is to provide a fast, efficient distributed DaaS for the clients, while reducing the storage consumption on the Cloud servers used by the DaaS providers. The method I utilize in this research for fast distributed DaaS is the collaborative dual-direction download of file or dataset partitions from multiple servers to the client, which will enhance the speed of the download process significantly. Moreover, I partially replicate the file partitions among Cloud servers using the previous download experiences I obtain for each partition. As a result, I generate partial sections of the data sets that will collectively be smaller than the total size needed if full replicas are stored on each server. My method is self-managed and operates only when more storage is needed. I evaluated my approach against other existing approaches and demonstrated that it provides an important enhancement to current approaches in both download performance and storage consumption. I also developed and analyzed the mathematical model supporting my approach and validated its accuracy.
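
The dual-direction idea can be pictured with a small simulation. The abstract gives no protocol details, so the sketch assumes one plausible reading: two servers hold the same partition, one sends blocks from the first block upward and the other from the last block downward, and the transfer ends when they meet. The function name and the per-tick speeds are hypothetical.

```python
from typing import List, Tuple


def dual_direction_download(
    num_blocks: int, speed_a: int, speed_b: int
) -> Tuple[List[int], List[int]]:
    """Simulate two servers sending blocks from opposite ends until they meet.

    speed_a and speed_b are blocks delivered per tick (assumed, for illustration).
    Returns the block indices delivered by server A and by server B.
    """
    if speed_a <= 0 and speed_b <= 0:
        raise ValueError("at least one server must deliver blocks")
    low, high = 0, num_blocks - 1  # next block for A (ascending) and B (descending)
    from_a: List[int] = []
    from_b: List[int] = []
    while low <= high:
        for _ in range(speed_a):
            if low > high:
                break
            from_a.append(low)
            low += 1
        for _ in range(speed_b):
            if low > high:
                break
            from_b.append(high)
            high -= 1
    return from_a, from_b


from_a, from_b = dual_direction_download(num_blocks=10, speed_a=3, speed_b=1)
print(from_a, from_b)
```

Under this reading, the faster server ends up covering more of the file, and the blocks a server never has to send indicate what it would not need to store, which is the intuition behind replicating only part of each partition per server.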

    Heuristic-Based Ant Colony Optimization Algorithm For Protein Functional Module Detection In Protein Interaction Network

    Ant colony optimization (ACO) is a metaheuristic algorithm that has been successfully applied to several types of optimization problems such as scheduling and routing, and more recently to the protein functional module detection (PFMD) problem in protein-protein interaction (PPI) networks. ACO has been applied successfully to small PPI data sets, but it is not suitable for large and noisy PPI data, which lead to premature convergence and stagnation in the search process. To cope with these limitations, we propose two new enhancements of ACO to solve the PFMD problem.