
    Googling DNA sequences on the World Wide Web

    Background: New web-based technologies provide an excellent opportunity for sharing and accessing information and for using the web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. Results: We developed an alignment-independent, character-based algorithm that divides a sequence library (DNA barcodes) and the query sequence into words. The actual search is conducted by conventional search tools such as the freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre- and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows high accuracy as well as rapid results. Conclusion: Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.
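
The word-based search described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the word length, the use of overlapping words, the toy sequences, and the in-memory index (standing in for a general-purpose search tool such as Google Desktop Search) are all assumptions made for illustration.

```python
from collections import Counter

WORD_LEN = 8  # hypothetical word length; the abstract does not state one


def to_words(seq, k=WORD_LEN):
    """Split a DNA sequence into overlapping words of length k (an assumption)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(library, k=WORD_LEN):
    """Map each word to the set of library entry names that contain it."""
    index = {}
    for name, seq in library.items():
        for word in set(to_words(seq, k)):
            index.setdefault(word, set()).add(name)
    return index


def search(query, index, k=WORD_LEN):
    """Rank library entries by how many distinct query words they share."""
    hits = Counter()
    for word in set(to_words(query, k)):
        for name in index.get(word, ()):
            hits[name] += 1
    return hits.most_common()


# Toy library with made-up sequences, for illustration only.
library = {
    "species_a": "ACGTACGTTTGACCAGTACGATCG",
    "species_b": "TTGACCAGGGCATCGATTACGCAT",
}
index = build_index(library)
ranked = search("ACGTACGTTTGACC", index)
```

No alignment is computed here: an entry ranks highly simply because it shares many exact words with the query, which is the kind of lookup a text search engine already performs.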

    A Comparative Study of Artificial Neural Network and Genetic Algorithm in Search Engine Optimization

    Search engine optimization applies search principles in search engines to assign a higher ranking to the most suitable webpage. Nowadays, information searching is done ubiquitously on the World Wide Web with the help of search engines. However, the process needs to be efficient and produce accurate results at the same time. In this research, the objectives are to implement and evaluate the Artificial Neural Network and Genetic Algorithm. The accuracy of both algorithms is compared on keyword ranking, Search Engine Result Page (SERP) visibility and time retrieval for document-based and e-commerce websites. To achieve this, the problem and data are first defined. Next, two datasets are imported from Kaggle and transformed into a more useful format. Then, the Artificial Neural Network and Genetic Algorithm are implemented on these datasets in Python using Jupyter Notebook tools. Subsequently, the accuracy of keyword ranking, SERP visibility and time retrieval for these datasets is observed based on the output and graphs displayed. Lastly, an analysis of the results is performed. In conclusion, the Genetic Algorithm demonstrates higher accuracy than the Artificial Neural Network algorithm in keyword ranking and SERP visibility. For time retrieval, however, the opposite holds. The Genetic Algorithm shows 9.0%, 9.0% and 3.0% on the e-commerce dataset for keyword ranking and 4.0%, 51.0% and 1.0% on the document-based dataset for SERP visibility. The Artificial Neural Network algorithm shows 8.0%, 7.0% and 7.0% on the e-commerce dataset and 3.0%, 50.0% and 4.0% on the document-based dataset for time retrieval. Therefore, the results validate the ability of the Genetic Algorithm as one of the most applied algorithms in the search engine optimization field.

    An Approach for Design Search Engine Architecture for Document Summarization

    Query-focused multi-document summarization is an emerging area of research. A lot of work has already been done on the subject and a lot more is under way. This document outlines our effort in this field. The work proposes an approach to automatic multi-document text summarization in response to a query given by a user. To cope with the explosion of information on the World Wide Web, it proposes a new method of query-focused multi-document summarization using a genetic algorithm. A search engine is used to extract relevant documents, and the genetic algorithm is used to extract the sentences that form a summary, based on a fitness function formed by three factors: a query-focused feature, an importance feature, and a non-redundancy feature. Experimental results show that the proposed summarization method can improve summary quality and that the genetic algorithm is efficient. We have developed a very powerful search engine. It also has great potential for growth: it can easily be applied not only to systems with a few documents but also to very large systems with a large number of documents.
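
A fitness function built from the three factors named in the abstract above might look like the following sketch. The abstract does not define the factors, so the Jaccard-overlap scoring, the equal weights, and all function names here are assumptions chosen only to make the idea concrete.

```python
from itertools import combinations


def tokens(text):
    """Lower-cased set of words for one sentence (whitespace tokenisation)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean(values):
    """Arithmetic mean; 0.0 for an empty sequence."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def fitness(summary, query, corpus, weights=(1.0, 1.0, 1.0)):
    """Score a candidate summary (a list of sentences). Higher is better.

    The three terms are assumed stand-ins for the abstract's query-focused,
    importance, and non-redundancy features.
    """
    w_query, w_importance, w_nonredundancy = weights
    if not summary:
        return 0.0
    sents = [tokens(s) for s in summary]
    q = tokens(query)
    vocab = set().union(*(tokens(s) for s in corpus))
    query_score = mean(jaccard(t, q) for t in sents)        # overlap with the query
    importance = mean(jaccard(t, vocab) for t in sents)     # coverage of corpus vocabulary
    redundancy = mean(jaccard(a, b) for a, b in combinations(sents, 2))
    return (w_query * query_score
            + w_importance * importance
            + w_nonredundancy * (1.0 - redundancy))


corpus = [
    "genetic algorithms select sentences",
    "search engines retrieve documents",
    "the weather is nice today",
    "the weather is nice today",
]
good_score = fitness(corpus[:2], "genetic algorithms", corpus)
bad_score = fitness(corpus[2:], "genetic algorithms", corpus)
```

In a genetic algorithm, each individual would encode a choice of sentences, and a function of this kind would rank individuals for selection; the off-topic, repetitive summary scores lower than the on-topic one.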

    Bio-Inspired Hybrid Algorithm for Web Services Clustering

    Web services clustering is the task of extracting and selecting the features from a collection of Web services and forming groups of closely related services. The implementation of novel and efficient algorithms for Web services clustering is relevant for the organization of service repositories on the Web. Having well-organized collections of Web services promotes the efficiency of Web service discovery, search, selection, substitution, and invocation. In recent years, methods inspired by nature using biological analogies have been adapted for clustering problems, among them genetic algorithms, evolutionary strategies, and algorithms that imitate the behavior of some animal species. Computation inspired by nature aims at imitating the steps that nature has developed and adapting them to find a solution to a given problem. In this chapter, we investigate how biologically inspired clustering methods can be applied to clustering Web services and present a hybrid approach for Web services clustering using the Artificial Bee Colony (ABC) algorithm, K-means, and Consensus. This hybrid algorithm was implemented, and a series of experiments were conducted using three collections of Web services. Results of the experiments show that the solution approach is adequate and efficient for clustering very large collections of Web services.

    Genetic Algorithm for Web Data Mining

    The use of various search engines could influence the number of search results on the World Wide Web. Therefore, this study attempted to discover any association between the word types or the information types used to search the World Wide Web with the available search engines. Doing so could assist the process of mining data on the World Wide Web. This study used a prototype program based on a genetic algorithm to manipulate the initial set of data. Three sets of inputs were used to generate new populations based on individual fitness. New strains of individuals from a new population were used to test the results obtained from the World Wide Web. The eight search engines used for this study were tested with two groups of words. All eight words were used as search keywords in all eight search engines, and the number of web pages returned by each search engine was collected. The total web pages based on the selected new individuals were calculated and tabulated. In order to find any association between the search words and the search engine combinations, the individuals were ranked from the most web pages to the least for each of the eight words. Results obtained through the creation of new populations by the prototype program showed that the average fitness of each population improved as new populations and new strains of individuals were created through this evolution process. The test on results obtained from the Internet showed that certain classes of words could be associated with certain combinations of search engines.

    A service oriented architecture for engineering design

    Decision making in engineering design can be effectively addressed by using genetic algorithms to solve multi-objective problems. These multi-objective genetic algorithms (MOGAs) are well suited to implementation in a Service Oriented Architecture. Often the evaluation process of the MOGA is compute-intensive due to the use of a complex computer model to represent the real-world system. The emerging paradigm of Grid Computing offers a potential solution to the compute-intensive nature of this objective function evaluation, by allowing access to large amounts of compute resources in a distributed manner. This paper presents a grid-enabled framework for multi-objective optimisation using genetic algorithms (MOGA-G) to aid decision making in engineering design
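
The two ideas in the abstract above, distributing expensive objective evaluations and keeping the non-dominated designs, can be sketched as follows. This is not the MOGA-G framework: the thread pool is only a local stand-in for grid resources, and the two-objective function `evaluate` is a hypothetical toy model.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate(x):
    """Hypothetical model with two objectives, both to be minimised."""
    return (x ** 2, (x - 2) ** 2)


def dominates(a, b):
    """True if score a is no worse than b everywhere and strictly better somewhere."""
    return (all(p <= q for p, q in zip(a, b))
            and any(p < q for p, q in zip(a, b)))


def pareto_front(designs, scores):
    """Return the designs whose scores are not dominated by any other design."""
    front = []
    for i, design in enumerate(designs):
        dominated = any(dominates(scores[j], scores[i])
                        for j in range(len(designs)) if j != i)
        if not dominated:
            front.append(design)
    return front


designs = [-1, 0, 1, 2, 3]

# Evaluate candidates concurrently; on a grid, each call would instead be
# dispatched to a remote compute resource.
with ThreadPoolExecutor(max_workers=4) as pool:
    scores = list(pool.map(evaluate, designs))

front = pareto_front(designs, scores)
```

A multi-objective genetic algorithm would repeat this evaluate-and-rank step every generation, which is why moving the evaluation onto distributed resources pays off when each model run is expensive.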

    Novel optimization schemes for service composition in the cloud using learning automata-based matrix factorization

    A thesis submitted to the University of Bedfordshire, in partial fulfilment of the requirements for the degree of Doctor of Philosophy. Service Oriented Computing (SOC) provides a framework for the realization of loosely coupled service-oriented applications (SOA). Web services are central to the concept of SOC. They possess several benefits which are useful to SOA, e.g. encapsulation, loose coupling and reusability. Using web services, an application can embed its functionalities within the business process of other applications. This is made possible through web service composition. Web services are composed to provide more complex functions for a service consumer in the form of a value-added composite service. Currently, research into how web services can be composed to yield a QoS (Quality of Service) optimal composite service has gathered significant attention. However, the number of services has risen, thereby increasing the number of possible service combinations and also amplifying the impact of the network on composite service performance. QoS-based service composition in the cloud addresses two important sub-problems: prediction of network performance between web service nodes in the cloud, and QoS-based web service composition. We model the former as a prediction problem, while the latter is modelled as an NP-hard optimization problem due to its complex, constrained and multi-objective nature. This thesis contributed to the prediction problem by presenting a novel learning automata-based non-negative matrix factorization algorithm (LANMF) for estimating the end-to-end network latency of a composition in the cloud. LANMF encodes each web service node as an automaton, which allows it to estimate its network coordinate in such a way that prediction error is minimized. Experiments indicate that LANMF is more accurate than current approaches.
    The thesis also contributed to the QoS-based service composition problem by proposing four evolutionary algorithms: a network-aware genetic algorithm (INSGA), a K-mean based genetic algorithm (KNSGA), a multi-population particle swarm optimization algorithm (NMPSO), and a non-dominated sort fruit fly algorithm (NFOA). The algorithms adopt different evolutionary strategies coupled with the LANMF method to search for low-latency and QoS-optimal solutions. They also employ a unique constraint handling method used to penalize solutions that violate user-specified QoS constraints. Experiments demonstrate the efficiency and scalability of the algorithms in a large-scale environment. The algorithms also outperform other evolutionary algorithms in terms of optimality and scalability. In addition, the thesis contributed to QoS-based web service composition in a dynamic environment. This is motivated by the ineffectiveness of the four proposed algorithms in a dynamically changing QoS environment such as a real-world scenario. Hence, we propose a new cellular automata-based genetic algorithm (CellGA) to address the issue. Experimental results show the effectiveness of CellGA in solving QoS-based service composition in a dynamic QoS environment.
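
Penalizing solutions that violate user-specified QoS constraints, as mentioned in the abstract above, can be illustrated with a generic static-penalty sketch. The thesis's own constraint handling method is not described in the abstract, so the sequential workflow, the latency-plus-penalty fitness, the penalty weight, and the service records below are all assumptions.

```python
def aggregate(composition):
    """Sum latency and cost over the chosen services (sequential workflow assumed)."""
    latency = sum(service["latency"] for service in composition)
    cost = sum(service["cost"] for service in composition)
    return latency, cost


def violation(composition, max_latency, max_cost):
    """Total relative amount by which the QoS limits are exceeded (0.0 if feasible)."""
    latency, cost = aggregate(composition)
    return (max(0.0, latency - max_latency) / max_latency
            + max(0.0, cost - max_cost) / max_cost)


def penalised_fitness(composition, max_latency, max_cost, penalty=100.0):
    """Latency to minimise, plus a penalty proportional to constraint violation."""
    latency, _ = aggregate(composition)
    return latency + penalty * violation(composition, max_latency, max_cost)


# Made-up candidate compositions, for illustration only.
feasible = [{"latency": 60.0, "cost": 2.0}, {"latency": 40.0, "cost": 3.0}]
infeasible = [{"latency": 50.0, "cost": 5.0}, {"latency": 30.0, "cost": 4.0}]

fit_feasible = penalised_fitness(feasible, max_latency=150.0, max_cost=6.0)
fit_infeasible = penalised_fitness(infeasible, max_latency=150.0, max_cost=6.0)
```

Here the infeasible composition has the lower raw latency, but exceeding the cost limit pushes its penalised fitness above that of the feasible one, so an evolutionary search minimising this value would prefer the feasible candidate.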