
    A cloud-based enhanced differential evolution algorithm for parameter estimation problems in computational systems biology

    This is a post-peer-review, pre-copyedit version of an article published in Cluster Computing. The final authenticated version is available online at: https://doi.org/10.1007/s10586-017-0860-1
    [Abstract] Metaheuristics are gaining increasing recognition in many research areas, computational systems biology among them. Recent advances in metaheuristics can be helpful in locating the vicinity of the global solution in reasonable computation times, with Differential Evolution (DE) being one of the most popular methods. However, for most realistic applications, DE still requires excessive computation times. With the advent of Cloud Computing, effortless access to a large number of distributed resources has become more feasible, and new distributed frameworks, like Spark, have been developed to deal with large-scale computations on commodity clusters and cloud resources. In this paper we propose a parallel implementation of an enhanced DE using Spark. The proposal drastically reduces the execution time by including a selected local search and exploiting the available distributed resources. The performance of the proposal has been thoroughly assessed using challenging parameter estimation problems from the domain of computational systems biology. Two different platforms have been used for the evaluation: a local cluster and the Microsoft Azure public cloud. Additionally, it has also been compared with other parallel approaches: another cloud-based solution (a MapReduce implementation) and a traditional HPC solution (an MPI implementation).
    Funding: Ministerio de Economía y Competitividad (DPI2014-55276-C5-2-R; TIN2013-42148-P; TIN2016-75845-P); Xunta de Galicia (R2016/045; GRC2013/05)
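
    The abstract above describes an enhanced DE that couples the classic DE loop with a selected local search. The following is a minimal sequential sketch of that idea, assuming SciPy is available and using a toy quadratic objective and illustrative parameter values; it is not the paper's Spark implementation.

import numpy as np
from scipy.optimize import minimize

def objective(x):
    return float(np.sum(x ** 2))  # placeholder for a parameter-estimation cost

def enhanced_de(dim=10, pop_size=40, generations=200, F=0.8, CR=0.9,
                local_every=50, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-5.0, 5.0, size=(pop_size, dim))
    fit = np.array([objective(x) for x in pop])
    for g in range(generations):
        for i in range(pop_size):
            # DE/rand/1 mutation using three distinct individuals other than i
            idxs = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(idxs, 3, replace=False)]
            mutant = a + F * (b - c)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True   # guarantee at least one mutant gene
            trial = np.where(cross, mutant, pop[i])
            f_trial = objective(trial)
            if f_trial < fit[i]:              # greedy one-to-one selection
                pop[i], fit[i] = trial, f_trial
        if (g + 1) % local_every == 0:        # periodic local search on the best point
            best = int(np.argmin(fit))
            res = minimize(objective, pop[best], method="Nelder-Mead",
                           options={"maxiter": 100})
            if res.fun < fit[best]:
                pop[best], fit[best] = res.x, res.fun
    best = int(np.argmin(fit))
    return pop[best], fit[best]

if __name__ == "__main__":
    x_best, f_best = enhanced_de()
    print("best cost:", f_best)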

    Multimethod optimization in the cloud: A case‐study in systems biology modelling

    [Abstract] Optimization problems appear in many different applications in science and engineering. A large number of different algorithms have been proposed for solving them; however, there is no unique general optimization method that performs efficiently across a diverse set of problems. Thus, a multimethod optimization, in which different algorithms cooperate to outperform the results obtained by any of them in isolation, is a very appealing alternative. Besides, as real-life optimization problems are becoming more and more challenging, the use of HPC techniques to implement these algorithms represents an effective strategy to speed up the time-to-solution. In addition, a parallel multimethod approach can benefit from the effortless access to a large number of distributed resources facilitated by cloud computing. In this paper, we propose a self-adaptive cooperative parallel multimethod for global optimization. This proposal aims to perform a thorough exploration of the solution space by means of multiple concurrent executions of a broad range of search strategies. For its evaluation, we consider an extremely challenging case study from the field of computational systems biology. We also assess the performance of the proposal on a public cloud, demonstrating both the potential of the multimethod approach and the opportunity that the cloud provides for these problems.
    Funding: Gobierno de España (DPI2014-55276-C5-2-R; DPI2017-82896-C2-2-R; TIN2016-75845-P); Xunta de Galicia (R2016/045; ED431C 2017/0)
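
    As a rough illustration of the multimethod idea described above, the sketch below runs two deliberately simple search strategies on the same toy objective and periodically exchanges their best solutions through an elite pool. The real proposal runs many strategies concurrently and self-adapts; every strategy, parameter, and the objective here are illustrative stand-ins.

import numpy as np

def objective(x):
    return float(np.sum(x ** 2))  # toy objective used by both strategies

def random_search_step(x, rng, step=0.5):
    # strategy 1: accept a Gaussian perturbation only if it improves the cost
    cand = x + rng.normal(0.0, step, size=x.shape)
    return cand if objective(cand) < objective(x) else x

def coordinate_descent_step(x, rng, step=0.2):
    # strategy 2: try a fixed move along one randomly chosen coordinate
    i = rng.integers(x.size)
    for delta in (step, -step):
        cand = x.copy()
        cand[i] += delta
        if objective(cand) < objective(x):
            return cand
    return x

def multimethod(dim=5, iters=500, exchange_every=25, seed=1):
    rng = np.random.default_rng(seed)
    strategies = [random_search_step, coordinate_descent_step]
    states = [rng.uniform(-5, 5, dim) for _ in strategies]
    for t in range(iters):
        states = [step(x, rng) for step, x in zip(strategies, states)]
        if (t + 1) % exchange_every == 0:
            best = min(states, key=objective)      # cooperation: share the elite
            states = [best.copy() for _ in states]
    return min(states, key=objective)

if __name__ == "__main__":
    x = multimethod()
    print("best cost:", objective(x))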

    Parallel ant colony optimization for the training of cell signaling networks

    [Abstract]: Acquiring a functional comprehension of the deregulation of cell signaling networks in disease allows progress in the development of new therapies and drugs. Computational models are becoming increasingly popular as a systematic tool to analyze the functioning of complex biochemical networks, such as those involved in cell signaling. CellNOpt is a framework to build predictive logic-based models of signaling pathways by training a prior knowledge network to biochemical data obtained from perturbation experiments. This training can be formulated as an optimization problem that can be solved using metaheuristics. However, the genetic algorithm used so far in CellNOpt presents limitations in terms of execution time and quality of solutions when applied to large instances. Thus, in order to overcome those issues, in this paper we propose the use of a method based on ant colony optimization, adapted to the problem at hand and parallelized using a hybrid approach. The performance of this novel method is illustrated with several challenging benchmark problems in the study of new therapies for liver cancer.
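
    The training task sketched in this abstract is essentially a binary optimization (keep or drop each candidate interaction). The snippet below shows a schematic binary ant colony optimization loop on a toy objective; it is not the CellNOpt formulation or the parallel hybrid evaluated in the paper, and all parameters are illustrative.

import numpy as np

def make_objective(n_bits, rng):
    # toy stand-in for the fit-to-data cost: mismatches against a hidden target
    target = rng.integers(0, 2, n_bits)
    return lambda bits: int(np.sum(bits != target))

def binary_aco(n_bits=40, n_ants=20, iters=100, rho=0.1, seed=2):
    rng = np.random.default_rng(seed)
    objective = make_objective(n_bits, rng)
    tau = np.full(n_bits, 0.5)                 # pheromone = probability that bit = 1
    best_bits, best_cost = None, np.inf
    for _ in range(iters):
        ants = (rng.random((n_ants, n_bits)) < tau).astype(int)   # sample solutions
        costs = np.array([objective(a) for a in ants])
        i = int(np.argmin(costs))
        if costs[i] < best_cost:
            best_bits, best_cost = ants[i].copy(), costs[i]
        # evaporation plus reinforcement toward the iteration-best ant
        tau = (1 - rho) * tau + rho * ants[i]
        tau = np.clip(tau, 0.05, 0.95)         # keep some exploration alive
    return best_bits, best_cost

if __name__ == "__main__":
    bits, cost = binary_aco()
    print("best cost:", cost)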

    A parallel metaheuristic for large mixed-integer dynamic optimization problems, with applications in computational biology

    [Abstract] Background: We consider a general class of global optimization problems dealing with nonlinear dynamic models. Although this class is relevant to many areas of science and engineering, here we are interested in applying this framework to the reverse engineering problem in computational systems biology, which yields very large mixed-integer dynamic optimization (MIDO) problems. In particular, we consider the framework of logic-based ordinary differential equations (ODEs).
    Methods: We present saCeSS2, a parallel method for the solution of this class of problems. This method is based on a parallel cooperative scatter search metaheuristic, with new mechanisms of self-adaptation and specific extensions to handle large mixed-integer problems. We have paid special attention to the avoidance of convergence stagnation using adaptive cooperation strategies tailored to this class of problems.
    Results: We illustrate its performance with a set of three very challenging case studies from the domain of dynamic modelling of cell signaling. The simplest case study considers a synthetic signaling pathway and has 84 continuous and 34 binary decision variables. A second case study considers the dynamic modeling of signaling in liver cancer using high-throughput data, and has 135 continuous and 109 binary decision variables. The third case study is an extremely difficult problem related to breast cancer, involving 690 continuous and 138 binary decision variables. We report computational results obtained on different infrastructures, including a local cluster, a large supercomputer and a public cloud platform. Interestingly, the results show how the cooperation of individual parallel searches modifies the systemic properties of the sequential algorithm, achieving superlinear speedups compared to an individual search (e.g. speedups of 15 with 10 cores) and significantly improving (by more than 60%) the performance with respect to a non-cooperative parallel scheme. The scalability of the method is also good (tests were performed using up to 300 cores).
    Conclusions: These results demonstrate that saCeSS2 can be used to successfully reverse engineer large dynamic models of complex biological pathways. Further, these results open up new possibilities for other MIDO-based large-scale applications in the life sciences, such as metabolic engineering, synthetic biology, and drug scheduling.
    Funding: Ministerio de Economía y Competitividad (DPI2014-55276-C5-2-R; TIN2016-75845-P); Galicia. Consellería de Cultura, Educación e Ordenación Universitaria (R2016/045; GRC2013/05)
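
    To make the notion of a mixed-integer candidate concrete, the sketch below encodes a solution as a continuous vector of kinetic parameters plus a binary vector of structural decisions (sized like the 84/34 synthetic case study mentioned above) and improves it with a simple greedy perturbation loop. The objective and move operators are toy stand-ins, not the saCeSS2 scatter-search machinery.

import numpy as np

def objective(cont, bins):
    # placeholder cost: distance from the origin plus a penalty per active edge
    return float(np.sum(cont ** 2)) + 0.1 * int(np.sum(bins))

def perturb(cont, bins, rng, sigma=0.3, flip_p=0.05):
    # continuous part: Gaussian jitter; binary part: rare bit flips
    new_cont = cont + rng.normal(0.0, sigma, size=cont.shape)
    flips = rng.random(bins.shape) < flip_p
    new_bins = np.where(flips, 1 - bins, bins)
    return new_cont, new_bins

def mixed_integer_search(n_cont=84, n_bin=34, iters=2000, seed=3):
    rng = np.random.default_rng(seed)
    cont = rng.uniform(-2, 2, n_cont)
    bins = rng.integers(0, 2, n_bin)
    best = objective(cont, bins)
    for _ in range(iters):
        c2, b2 = perturb(cont, bins, rng)
        f2 = objective(c2, b2)
        if f2 < best:                      # greedy accept, standing in for the
            cont, bins, best = c2, b2, f2  # scatter-search combination step
    return best

if __name__ == "__main__":
    print("best cost:", mixed_integer_search())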

    Implementing Parallel Differential Evolution on Spark

    [Abstract] Metaheuristics are gaining increased attention as an efficient way of solving hard global optimization problems. Differential Evolution (DE) is one of the most popular algorithms in that class. However, its application to realistic problems results in excessive computation times. Therefore, several parallel DE schemes have been proposed, most of them focused on traditional parallel programming interfaces and infrastructures. However, with the emergence of Cloud Computing, new programming models, like Spark, have appeared to suit large-scale data processing on clouds. In this paper we investigate the applicability of Spark to develop parallel DE schemes to be executed in a distributed environment. Both the master-slave and the island-based DE schemes usually found in the literature have been implemented using Spark. The speedup and efficiency of all the implementations were evaluated on the Amazon Web Services (AWS) public cloud, concluding that the island-based solution is the best suited to the distributed nature of Spark. It achieves a good speedup versus the serial implementation and shows decent scalability as the number of nodes grows.
    Funding: Ministerio de Economía y Competitividad (DPI2014-55276-C5-2-R); Xunta de Galicia (GRC2013/055; R2014/04)
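
    A minimal sketch of the island-based scheme on Spark: each island is an independent DE subpopulation evolved inside a map task, and the driver performs a simple migration between epochs. It assumes a working PySpark installation with NumPy available on the workers and uses a toy objective; it is an illustration of the idea, not the code evaluated in the paper.

import numpy as np
from pyspark import SparkContext

def objective(x):
    return float(np.sum(x ** 2))  # toy stand-in for a real estimation cost

def evolve_island(island, generations=50, F=0.8, CR=0.9):
    """Run an independent DE on one subpopulation (executed inside a Spark task)."""
    rng = np.random.default_rng()
    pop = island.copy()
    fit = np.array([objective(x) for x in pop])
    n, dim = pop.shape
    for _ in range(generations):
        for i in range(n):
            idxs = [j for j in range(n) if j != i]
            a, b, c = pop[rng.choice(idxs, 3, replace=False)]
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, a + F * (b - c), pop[i])
            f = objective(trial)
            if f < fit[i]:
                pop[i], fit[i] = trial, f
    return pop[np.argsort(fit)]               # best individuals first

if __name__ == "__main__":
    sc = SparkContext.getOrCreate()
    rng = np.random.default_rng(0)
    islands = [rng.uniform(-5, 5, (20, 10)) for _ in range(4)]
    for epoch in range(5):                     # epochs separated by migration
        islands = sc.parallelize(islands, len(islands)).map(evolve_island).collect()
        best = min((isl[0] for isl in islands), key=objective)
        for isl in islands:                    # simple migration: the global best
            isl[-1] = best                     # replaces each island's worst member
    print("best cost:", objective(best))
    sc.stop()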

    Hybrid parallel multimethod hyperheuristic for mixed-integer dynamic optimization problems in computational systems biology

    [Abstract] This paper describes and assesses a parallel multimethod hyperheuristic for the solution of complex global optimization problems. In a multimethod hyperheuristic, different metaheuristics cooperate to outperform the results obtained by any of them in isolation. The results obtained show that the cooperation of individual parallel searches modifies the systemic properties of the hyperheuristic, achieving significant performance improvements versus the sequential and the non-cooperative parallel solutions. Here we present and evaluate a hybrid parallel scheme of the multimethod, using both message-passing (MPI) and shared-memory (OpenMP) models. The hybrid parallelization achieves a better trade-off between performance and computational resources through a compromise between diversity (number of islands) and intensity (number of threads per island). For the performance evaluation, we considered the general problem of reverse engineering nonlinear dynamic models in systems biology, which yields very large mixed-integer dynamic optimization problems. In particular, three very challenging problems from the domain of dynamic modeling of cell signaling were used as case studies. In addition, experiments have been carried out on a local cluster, a large supercomputer and a public cloud to show the suitability of the proposed solution on different execution platforms.
    Funding: Gobierno de España (DPI2017-82896-C2-2-R; TIN2016-75845-P); Xunta de Galicia (R2016/045; ED431C 2017/0)
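
    The sketch below mirrors the hybrid structure in Python: MPI ranks play the role of cooperating islands (coarse grain, diversity) while a thread pool inside each rank parallelizes objective evaluations (fine grain, intensity). The paper's scheme combines MPI with OpenMP; this mpi4py-and-threads analogue, its toy move operator, and its objective are illustrative assumptions only.

from concurrent.futures import ThreadPoolExecutor

import numpy as np
from mpi4py import MPI

def objective(x):
    return float(np.sum(x ** 2))  # toy stand-in for the real cost function

def evolve_island(rank, threads=4, pop_size=32, dim=10, generations=100):
    """One island per MPI rank; a thread pool parallelizes the evaluations."""
    rng = np.random.default_rng(rank)
    pop = rng.uniform(-5, 5, (pop_size, dim))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for _ in range(generations):
            trials = pop + rng.normal(0.0, 0.3, pop.shape)    # toy move operator
            fit = np.array(list(pool.map(objective, pop)))    # fine-grained parallelism
            fit_t = np.array(list(pool.map(objective, trials)))
            better = fit_t < fit
            pop[better] = trials[better]
    return pop[int(np.argmin([objective(x) for x in pop]))]

if __name__ == "__main__":
    # Run with e.g.:  mpiexec -n 4 python hybrid_sketch.py
    comm = MPI.COMM_WORLD
    local_best = evolve_island(comm.Get_rank())
    champions = comm.gather(local_best, root=0)   # coarse-grained cooperation
    if comm.Get_rank() == 0:
        best = min(champions, key=objective)
        print("best cost across islands:", objective(best))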

    A Comprehensive Survey on Particle Swarm Optimization Algorithm and Its Applications

    Particle swarm optimization (PSO) is a heuristic global optimization method, proposed originally by Kennedy and Eberhart in 1995. It is now one of the most commonly used optimization techniques. This survey presents a comprehensive investigation of PSO. On the one hand, we review advances in PSO, including its modifications (quantum-behaved PSO, bare-bones PSO, chaotic PSO, and fuzzy PSO), population topologies (fully connected, von Neumann, ring, star, random, etc.), hybridizations (with genetic algorithms, simulated annealing, Tabu search, artificial immune systems, ant colony algorithms, artificial bee colony, differential evolution, harmony search, and biogeography-based optimization), extensions (to multiobjective, constrained, discrete, and binary optimization), theoretical analysis (parameter selection and tuning, and convergence analysis), and parallel implementations (on multicore, multiprocessor, GPU, and cloud computing platforms). On the other hand, we survey applications of PSO in the following fields: electrical and electronic engineering, automation control systems, communication theory, operations research, mechanical engineering, fuel and energy, medicine, chemistry, and biology. It is hoped that this survey will be beneficial for researchers studying PSO algorithms.
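
    For reference, the canonical global-best PSO update that most of the surveyed variants build on can be written in a few lines. The sketch below uses an inertia weight and cognitive/social coefficients with illustrative values and a toy quadratic objective.

import numpy as np

def objective(x):
    return float(np.sum(x ** 2))  # toy objective to minimize

def pso(dim=10, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=4):
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5, 5, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_f = np.array([objective(x) for x in pos])
    gbest = pbest[np.argmin(pbest_f)].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # velocity update: inertia + cognitive pull + social pull
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        f = np.array([objective(x) for x in pos])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = pos[improved], f[improved]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest, objective(gbest)

if __name__ == "__main__":
    x, fx = pso()
    print("best cost:", fx)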

    Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy

    [Abstract] Background: The development of large-scale kinetic models is one of the current key issues in computational systems biology and bioinformatics. Here we consider the problem of parameter estimation in nonlinear dynamic models. Global optimization methods can be used to solve this type of problem, but the associated computational cost is very large. Moreover, many of these methods need the tuning of a number of adjustable search parameters, requiring a number of initial exploratory runs and therefore further increasing the computation times. Here we present a novel parallel method, self-adaptive cooperative enhanced scatter search (saCeSS), to accelerate the solution of this class of problems. The method is based on the scatter search optimization metaheuristic and incorporates several key new mechanisms: (i) asynchronous cooperation between parallel processes, (ii) coarse- and fine-grained parallelism, and (iii) self-tuning strategies.
    Results: The performance and robustness of saCeSS is illustrated by solving a set of challenging parameter estimation problems, including medium and large-scale kinetic models of the bacterium E. coli, the baker's yeast S. cerevisiae, the vinegar fly D. melanogaster, Chinese Hamster Ovary cells, and a generic signal transduction network. The results consistently show that saCeSS is a robust and efficient method, allowing very significant reductions of computation times with respect to several previous state-of-the-art methods (from days to minutes, in several cases) even when only a small number of processors is used.
    Conclusions: The new parallel cooperative method presented here allows the solution of medium and large-scale parameter estimation problems in reasonable computation times and with small hardware requirements. Further, the method includes self-tuning mechanisms which facilitate its use by non-experts. We believe that this new method can play a key role in the development of large-scale and even whole-cell dynamic models.
    Funding: Ministerio de Economía y Competitividad (DPI2011-28112-C04-03; DPI2011-28112-C04-04; DPI2014-55276-C5-2-R; TIN2013-42148-P; TIN2016-75845-P); Galicia. Consellería de Cultura, Educación e Ordenación Universitaria (R2014/041; R2016/045; GRC2013/05)
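
    The sequential core that saCeSS parallelizes is a scatter search: maintain a small reference set, combine its members pairwise, and keep the best of parents and offspring. The sketch below shows that skeleton on a toy objective; the combination operator, the refset update rule, and all parameters are simplified stand-ins, and the cooperation and self-tuning layers are omitted.

from itertools import combinations

import numpy as np

def objective(x):
    return float(np.sum(x ** 2))  # toy stand-in for a parameter-estimation cost

def scatter_search(dim=10, refset_size=10, iters=100, seed=5):
    rng = np.random.default_rng(seed)
    # diverse initial reference set
    refset = list(rng.uniform(-5, 5, (refset_size, dim)))
    for _ in range(iters):
        children = []
        for a, b in combinations(refset, 2):
            lam = rng.uniform(-0.5, 1.5)           # linear combination of two parents
            children.append(a + lam * (b - a))
        pool = refset + children
        pool.sort(key=objective)                   # greedy reference-set update
        refset = pool[:refset_size]
    return refset[0], objective(refset[0])

if __name__ == "__main__":
    x, fx = scatter_search()
    print("best cost:", fx)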

    MOLNs: A cloud platform for interactive, reproducible and scalable spatial stochastic computational experiments in systems biology using PyURDME

    Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools, a complex software stack, and large, scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments.
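
    The cost that motivates a platform like MOLNs comes from repeating many independent stochastic simulation runs. As a self-contained illustration (deliberately not using the PyURDME API), the sketch below runs a Gillespie SSA for a simple birth-death process and an ensemble of replicas, the kind of embarrassingly parallel workload a cloud appliance can distribute.

import numpy as np

def ssa_birth_death(k_birth=10.0, k_death=0.1, x0=0, t_end=50.0, seed=None):
    """One Gillespie trajectory of 0 -> X (rate k_birth) and X -> 0 (rate k_death*X)."""
    rng = np.random.default_rng(seed)
    t, x = 0.0, x0
    while t < t_end:
        a_birth, a_death = k_birth, k_death * x
        a_total = a_birth + a_death
        t += rng.exponential(1.0 / a_total)        # time to the next reaction
        if rng.random() < a_birth / a_total:       # choose which reaction fires
            x += 1
        else:
            x -= 1
    return x

if __name__ == "__main__":
    # an ensemble of independent replicas, trivially parallelizable
    finals = [ssa_birth_death(seed=s) for s in range(200)]
    print("mean copy number:", np.mean(finals), "(theory: k_birth/k_death = 100)")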