10,875 research outputs found

Enhanced parallel Differential Evolution algorithm for problems in computational systems biology

Author: Banga Julio R.
Doallo Ramón
González Patricia
Penas David R.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

[Abstract] Many key problems in computational systems biology and bioinformatics can be formulated and solved using a global optimization framework. The complexity of the underlying mathematical models requires the use of efficient solvers in order to obtain satisfactory results in reasonable computation times. Metaheuristics are gaining recognition in this context, with Differential Evolution (DE) as one of the most popular methods. However, for most realistic applications, like those considering parameter estimation in dynamic models, DE still requires excessive computation times. Here we consider this latter class of problems and present several enhancements to DE based on the introduction of additional algorithmic steps and the exploitation of parallelism. In particular, we propose an asynchronous parallel implementation of DE which has been extended with improved heuristics to exploit the specific structure of parameter estimation problems in computational systems biology. The proposed method is evaluated with different types of benchmark problems: (i) black-box global optimization problems and (ii) calibration of non-linear dynamic models of biological systems, obtaining excellent results both in terms of quality of the solution and regarding speedup and scalability.

Funding: Ministerio de Economía y Competitividad (DPI2011-28112-C04-03); Consejo Superior de Investigaciones Científicas (PIE-201170E018); Ministerio de Ciencia e Innovación (TIN2013-42148-P); Galicia. Consellería de Cultura, Educación e Ordenación Universitaria (GRC2013/05)

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Digital.CSIC
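
For readers unfamiliar with the Differential Evolution method named in the abstract above, the following is a minimal serial sketch of the textbook DE/rand/1/bin variant. It is not the authors' enhanced asynchronous parallel method; the function name, the default parameters (`pop_size`, `f`, `cr`, `generations`), and the sphere test function are all assumptions chosen for illustration.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200, seed=0):
    """Minimise `objective` inside the box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the box constraints.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]
    for _ in range(generations):
        next_pop, next_cost = list(pop), list(cost)
        for i in range(pop_size):
            # Three distinct donors, all different from the target index i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # forces at least one mutated component
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if j == j_rand or rng.random() < cr:
                    # Mutation: base vector plus scaled difference vector.
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)  # clip back into the box
                else:
                    v = pop[i][j]  # binomial crossover keeps the target value
                trial.append(v)
            trial_cost = objective(trial)
            # Greedy selection: the trial replaces the target if it is not worse.
            if trial_cost <= cost[i]:
                next_pop[i], next_cost[i] = trial, trial_cost
        pop, cost = next_pop, next_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def sphere(x):
    """Toy objective (assumed for illustration): minimum 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_cost = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

In parameter estimation, `objective` would typically be the mismatch between model simulations and measured data, which is where the cost of each evaluation (and hence the motivation for parallelism) comes from.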

Towards cloud-based parallel metaheuristics: A case study in computational biology with Differential Evolution and Spark

Author: Banga Julio R.
Doallo Ramón
González Patricia
Pardo Xoán C.
Teijeiro Diego
Publication venue: 'SAGE Publications'
Publication date: 28/11/2016

[Abstract] Many key problems in science and engineering can be formulated and solved using global optimization techniques. In the particular case of computational biology, the development of dynamic (kinetic) models is one of the current key issues. In this context, the problem of parameter estimation (model calibration) remains a very challenging task. The complexity of the underlying models requires the use of efficient solvers to achieve adequate results in reasonable computation times. Metaheuristics have been the focus of great consideration as an efficient way of solving hard global optimization problems. Even so, in most realistic applications, metaheuristics require a very large computation time to obtain an acceptable result. Therefore, several parallel schemes have been proposed, most of them focused on traditional parallel programming interfaces and infrastructures. However, with the emergence of cloud computing, new programming models have been proposed to deal with large-scale data processing on clouds. In this paper we explore the applicability of these new models for global optimization problems using as a case study a set of challenging parameter estimation problems in systems biology. We have developed, using Spark, an island-based parallel version of Differential Evolution. Differential Evolution is a simple population-based metaheuristic that, at the same time, is very popular for being very efficient in real function global optimization. Several experiments were conducted both on a cluster and on the Microsoft Azure public cloud to evaluate the speedup and efficiency of the proposal, concluding that the Spark implementation achieves not only competitive speedup against the serial implementation, but also good scalability when the number of nodes grows. The results can be useful for those interested in using parallel metaheuristics for global optimization problems benefiting from the potential of new cloud programming models.

Funding: Ministerio de Economía y Competitividad and FEDER, through the Project SYNBIOFACTORY (DPI2014-55276-C5-2-R); Ministerio de Economía y Competitividad and FEDER (TIN2013-42148-P); Ministerio de Economía y Competitividad and FEDER (TIN2016-75845-P); Xunta de Galicia (R2014/04)

Repositorio da Universidade da Coruña

Imaging of a fluid injection process using geophysical data - A didactic example

Author: Commer M
Finsterle S
Kowalsky MB
Pride SR
Vasco DW
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

In many subsurface industrial applications, fluids are injected into or withdrawn from a geologic formation. It is of practical interest to quantify precisely where, when, and by how much the injected fluid alters the state of the subsurface. Routine geophysical monitoring of such processes attempts to image the way that geophysical properties, such as seismic velocities or electrical conductivity, change through time and space and to then make qualitative inferences as to where the injected fluid has migrated. The more rigorous formulation of the time-lapse geophysical inverse problem forecasts how the subsurface evolves during the course of a fluid-injection application. Using time-lapse geophysical signals as the data to be matched, the model unknowns to be estimated are the multiphysics forward-modeling parameters controlling the fluid-injection process. Once the geophysical signature of the flow process is properly reproduced, subsequent simulations can predict the fluid migration and alteration in the subsurface. The dynamic nature of fluid-injection processes renders imaging problems more complex than conventional geophysical imaging for static targets. This work intends to clarify the related hydrogeophysical parameter estimation concepts

eScholarship - University of California

A Bayesian Consistent Dual Ensemble Kalman Filter for State-Parameter Estimation in Subsurface Hydrology

Author: Ait-El-Fquih Boujemaa
Gharamti Mohamad El
Hoteit Ibrahim
Publication date: 04/11/2015

Ensemble Kalman filtering (EnKF) is an efficient approach to addressing uncertainties in subsurface groundwater models. The EnKF sequentially integrates field data into simulation models to obtain a better characterization of the model's state and parameters. These are generally estimated following joint and dual filtering strategies, in which, at each assimilation cycle, a forecast step by the model is followed by an update step with incoming observations. The Joint-EnKF directly updates the augmented state-parameter vector while the Dual-EnKF employs two separate filters, first estimating the parameters and then estimating the state based on the updated parameters. In this paper, we reverse the order of the forecast-update steps following the one-step-ahead (OSA) smoothing formulation of the Bayesian filtering problem, based on which we propose a new dual EnKF scheme, the Dual-EnKF$_{\rm OSA}$. Compared to the Dual-EnKF, this introduces a new update step to the state in a fully consistent Bayesian framework, which is shown to enhance the performance of the dual filtering approach without any significant increase in the computational cost. Numerical experiments are conducted with a two-dimensional synthetic groundwater aquifer model to assess the performance and robustness of the proposed Dual-EnKF$_{\rm OSA}$, and to evaluate its results against those of the Joint- and Dual-EnKFs. The proposed scheme is able to successfully recover both the hydraulic head and the aquifer conductivity, further providing reliable estimates of their uncertainties. Compared with the standard Joint- and Dual-EnKFs, the proposed scheme is found to be more robust to different assimilation settings, such as the spatial and temporal distribution of the observations, and the level of noise in the data. Based on our experimental setups, it yields up to 25% more accurate state and parameter estimates

arXiv.org e-Print Archive

Directory of Open Access Journals
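
The joint update described in the abstract above (one filter acting on the augmented state-parameter vector) can be sketched for a toy scalar problem as follows. This illustrates only a standard stochastic (perturbed-observation) Joint-EnKF analysis step, not the proposed Dual-EnKF$_{\rm OSA}$; the function name, the toy forward model `state = forcing * param`, and all numerical values are assumptions made for the example.

```python
import random


def joint_enkf_update(states, params, y_obs, obs_var, rng):
    """One stochastic EnKF analysis step on the augmented [state, param] vector.

    Assumes the scalar state is observed directly, so the forecast
    observation of each member is simply its state.
    """
    n = len(states)
    mean_x = sum(states) / n
    mean_p = sum(params) / n
    # Ensemble (co)variances between each augmented component and the observation.
    var_x = sum((x - mean_x) ** 2 for x in states) / (n - 1)
    cov_px = sum((p - mean_p) * (x - mean_x)
                 for p, x in zip(params, states)) / (n - 1)
    denom = var_x + obs_var
    gain_x, gain_p = var_x / denom, cov_px / denom  # Kalman gain, per component
    new_states, new_params = [], []
    for x, p in zip(states, params):
        # Perturbed observation: each member sees the data plus its own noise draw.
        innovation = y_obs + rng.gauss(0.0, obs_var ** 0.5) - x
        new_states.append(x + gain_x * innovation)
        new_params.append(p + gain_p * innovation)
    return new_states, new_params


# Toy setup (hypothetical): state = forcing * param, with the true param unknown.
rng = random.Random(1)
true_param, forcing, obs_var = 3.0, 2.0, 0.01
prior_params = [rng.gauss(1.0, 1.0) for _ in range(200)]
forecast_states = [forcing * p for p in prior_params]  # forecast step
y_obs = forcing * true_param                           # noise-free datum, for simplicity

post_states, post_params = joint_enkf_update(
    forecast_states, prior_params, y_obs, obs_var, rng)
prior_mean = sum(prior_params) / len(prior_params)
post_mean = sum(post_params) / len(post_params)
```

The parameter is never observed directly; it is corrected only through its ensemble covariance with the observed state, which is the mechanism both joint and dual strategies rely on.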

Using the Cloud for Parameter Estimation Problems: Comparing Spark vs MPI with a Case-Study

Author: Banga Julio R.
Doallo Ramón
González Patricia
Pardo Xoán C.
Penas David R.
Teijeiro Diego
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/07/2017

Date of Conference: 14-17 May 2017. Conference Location: Madrid

[Abstract] Systems biology is an emerging approach focused on generating new knowledge about complex biological systems by combining experimental data with mathematical modeling and advanced computational techniques. Many problems in this field are extremely challenging and require substantial supercomputing resources to be solved. This is the case of parameter estimation in large-scale nonlinear dynamic systems biology models. Recently, Cloud Computing has emerged as a new paradigm for on-demand delivery of computing resources. However, the scientific computing community has been quite hesitant in using the Cloud, simply because traditional programming models do not fit well with the new paradigm, and the earliest cloud programming models do not allow most scientific computations to be run efficiently in the Cloud. In this paper we explore and compare two distributed computing models: the MPI (message-passing interface) model, which is high-performance oriented, and the Spark model, which is throughput oriented but outperforms other cloud programming solutions by adding improved support for iterative algorithms through in-memory computing. The performance of a very well known metaheuristic, the Differential Evolution algorithm, has been thoroughly assessed using a challenging parameter estimation problem from the domain of computational systems biology. The experiments have been carried out both in a local cluster and in the Microsoft Azure public cloud, allowing performance and cost evaluation for both infrastructures.

Funding: Gobierno de España (DPI2014-55276-C5-2-R); Fondos Feder (TIN2016-75845-P); Xunta de Galicia (R2016/045); Xunta de Galicia (GRC2013/05)

Repositorio da Universidade da Coruña

Crossref

Nanometer VLSI placement and optimization for multi-objective design closure

Author: Luo Tao, Ph. D.
Publication date: 01/12/2007

In a VLSI physical synthesis flow, placement directly defines the interconnection, which affects many other design objectives, such as timing, power consumption, congestion, and thermal issues. With the scaling of technology, the relative interconnect delay increases dramatically. As a result, placement has become a bottleneck in deep sub-micron physical synthesis. In this dissertation, I propose several optimization algorithms, ranging from global placement, placement migration, and timing-driven placement to incremental power optimization, for multi-objective VLSI design closure. The first work is DPlace, a new global placement algorithm that scales well to modern large-scale circuit placement problems. DPlace simulates the natural diffusion process to spread cells smoothly over the placement region, and uses both analytical and discrete techniques to improve the wire length. However, global placement is never sufficient for multi-objective design closure; a variety of design objectives have to be improved incrementally, such as timing, routing congestion, signal integrity, and heat distribution. Placement migration is a critical step to address the cell overlaps appearing during incremental optimizations. To achieve high placement stability, I propose a computational geometry based placement migration flow to cope with placement changes, and a new stability metric to measure the “similarity” between two placements accurately. Our placement migration algorithm has a clear advantage over conventional legalization algorithms in that the neighborhood characteristics of the original placement are preserved. For timing closure in high performance designs, I present a linear programming based incremental timing-driven placement to improve the timing on critical paths directly. I further present an efficient timing-driven placement algorithm (Pyramids). Two formulations of Pyramids are proposed, which are suitable for different optimization stages in a physical synthesis flow. Both approaches find the optimal location for timing of a cell in constant time, through computational geometry based approaches. For fast convergence of design closure, placement should be integrated with other optimization techniques. I propose to combine placement, gate sizing and Vt swapping techniques to reduce the total power consumption, especially the leakage power, which is becoming increasingly critical for nanometer VLSI design closure.

Electrical and Computer Engineerin

Texas ScholarWorks

A cloud-based enhanced differential evolution algorithm for parameter estimation problems in computational systems biology

Author: David R. Penas
Diego Teijeiro
Julio R. Banga
Patricia González
Ramón Doallo
Xoán C. Pardo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

This is a post-peer-review, pre-copyedit version of an article published in Cluster Computing. The final authenticated version is available online at: https://doi.org/10.1007/s10586-017-0860-1

[Abstract] Metaheuristics are gaining increasing recognition in many research areas, computational systems biology among them. Recent advances in metaheuristics can be helpful in locating the vicinity of the global solution in reasonable computation times, with Differential Evolution (DE) being one of the most popular methods. However, for most realistic applications, DE still requires excessive computation times. With the advent of Cloud Computing, effortless access to a large number of distributed resources has become more feasible, and new distributed frameworks, like Spark, have been developed to deal with large scale computations on commodity clusters and cloud resources. In this paper we propose a parallel implementation of an enhanced DE using Spark. The proposal drastically reduces the execution time by including a selected local search and exploiting the available distributed resources. The performance of the proposal has been thoroughly assessed using challenging parameter estimation problems from the domain of computational systems biology. Two different platforms have been used for the evaluation: a local cluster and the Microsoft Azure public cloud. Additionally, it has also been compared with other parallel approaches: another cloud-based solution (a MapReduce implementation) and a traditional HPC solution (an MPI implementation).

Funding: Ministerio de Economía y Competitividad (DPI2014-55276-C5-2-R); Ministerio de Economía y Competitividad (TIN2013-42148-P); Ministerio de Economía y Competitividad (TIN2016-75845-P); Xunta de Galicia (R2016/045); Xunta de Galicia (GRC2013/05)

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Digital.CSIC

Seismic Ray Impedance Inversion

Author: Lu Xiaolin
Publication venue: Earth Science and Engineering, Imperial College London
Publication date: 01/11/2010

This thesis investigates a prestack seismic inversion scheme implemented in the ray parameter domain. Conventionally, most prestack seismic inversion methods are performed in the incidence angle domain. However, inversion using the concept of ray impedance, as it honours ray path variation following the elastic parameter variation according to Snell’s law, shows the capacity to discriminate different lithologies if compared to conventional elastic impedance inversion. The procedure starts with data transformation into the ray-parameter domain and then implements the ray impedance inversion along constant ray-parameter profiles. With different constant-ray-parameter profiles, mixed-phase wavelets are initially estimated based on the high-order statistics of the data and further refined after a proper well-to-seismic tie. With the estimated wavelets ready, a Cauchy inversion method is used to invert for seismic reflectivity sequences, aiming at recovering seismic reflectivity sequences for blocky impedance inversion. The impedance inversion from reflectivity sequences adopts a standard generalised linear inversion scheme, whose results are utilised to identify rock properties and facilitate quantitative interpretation. It has also been demonstrated that we can further invert elastic parameters from ray impedance values, without eliminating an extra density term or introducing a Gardner’s relation to absorb this term. Ray impedance inversion is extended to P-S converted waves by introducing the definition of converted-wave ray impedance. This quantity shows some advantages in connecting prestack converted wave data with well logs, if compared with the shear-wave elastic impedance derived from the Aki and Richards approximation to the Zoeppritz equations. An analysis of P-P and P-S wave data under the framework of ray impedance is conducted through a real multicomponent dataset, which can reduce the uncertainty in lithology identification.

Inversion is the key method in generating those examples throughout the entire thesis as we believe it can render robust solutions to geophysical problems. Apart from the reflectivity sequence, ray impedance and elastic parameter inversion mentioned above, inversion methods are also adopted in transforming the prestack data from the offset domain to the ray-parameter domain, mixed-phase wavelet estimation, as well as the registration of P-P and P-S waves for the joint analysis. The ray impedance inversion methods are successfully applied to different types of datasets. For each individual step towards achieving the ray impedance inversion, the advantages, disadvantages and limitations of the algorithms adopted are detailed. As a conclusion, the ray impedance related analyses demonstrated in this thesis are highly competent compared with the classical elastic impedance methods and the author would like to recommend it for a wider application

Spiral - Imperial College Digital Repository

Architectures and GPU-Based Parallelization for Online Bayesian Computational Statistics and Dynamic Modeling

Author: Duan Lujie
Publication venue: 'University of Saskatchewan Library'
Publication date: 27/09/2021

Recent work demonstrates that coupling Bayesian computational statistics methods with dynamic models can facilitate the analysis of complex systems associated with diverse time series, including those involving social and behavioural dynamics. Particle Markov Chain Monte Carlo (PMCMC) methods constitute a particularly powerful class of Bayesian methods combining aspects of batch Markov Chain Monte Carlo (MCMC) and the sequential Monte Carlo method of Particle Filtering (PF). PMCMC can flexibly combine theory-capturing dynamic models with diverse empirical data. Online machine learning is a subcategory of machine learning algorithms characterized by sequential, incremental execution as new data arrives, which can give updated results and predictions with growing sequences of available incoming data. While many machine learning and statistical methods are adapted to online algorithms, PMCMC is one example of the many methods whose compatibility with and adaption to online learning remains unclear. In this thesis, I proposed a data-streaming solution supporting PF and PMCMC methods with dynamic epidemiological models and demonstrated several successful applications. By constructing an automated, easy-to-use streaming system, analytic applications and simulation models gain access to arriving real-time data to shorten the time gap between data and resulting model-supported insight. The well-defined architecture design emerging from the thesis would substantially expand traditional simulation models' potential by allowing such models to be offered as continually updated services. Contingent on sufficiently fast execution time, simulation models within this framework can consume the incoming empirical data in real-time and generate informative predictions on an ongoing basis as new data points arrive. 
In a second line of work, I investigated the platform's flexibility and capability by extending this system to support the use of a powerful class of PMCMC algorithms with dynamic models while ameliorating such algorithms' traditionally stiff performance limitations. Specifically, this work designed and implemented a GPU-enabled parallel version of a PMCMC method with dynamic simulation models. The resulting codebase has readily enabled researchers to adapt their models to state-of-the-art statistical inference methods, and ensures that the computation-heavy PMCMC method can perform significant sampling between the successive arrival of each new data point. Investigating this method's impact with several realistic PMCMC application examples showed that GPU-based acceleration allows for up to 160x speedup compared to a corresponding CPU-based version not exploiting parallelism. The GPU-accelerated PMCMC and the streaming processing system can complement each other, jointly providing researchers with a powerful toolset to greatly accelerate learning and to secure additional insight from the high-velocity data increasingly prevalent within social and behavioural spheres. The design philosophy applied supported a platform with broad generalizability and potential for ready future extensions. The thesis discusses common barriers and difficulties in designing and implementing such systems and offers solutions to solve or mitigate them

University of Saskatchewan Research Archive