31 research outputs found
P systems simulations on massively parallel architectures
Membrane Computing is an emergent research area studying
the behaviour of living cells to define bio-inspired computing
devices, also called P systems. Such devices provide
polynomial time solutions to NP-complete problems by
trading space for time. The efficient simulation of P systems
poses challenges in three different aspects: the intrinsic
massive parallelism of P systems, an exponential computational
workspace, and a non-intensive floating point nature.
In this paper, we analyze the simulation of a family of recognizer
P systems with active membranes that solves the Satisfiability
(SAT) problem in linear time on three different architectures:
a shared memory system, a distributed memory
system, and a set of Graphics Processing Units (GPUs). For
an efficient handling of the exponential workspace created by
the P systems computation, we enable different data policies
on those architectures to increase memory bandwidth
and exploit data locality through tiling. Parallelism inherent
to the target P system is also managed on each architecture
to demonstrate that GPUs offer a valid alternative for
high-performance computing at a considerably lower cost:
Considering the largest problem size we were able to run
on the three parallel platforms involving four processors,
execution times were 20049.70 ms. using OpenMP on the
shared memory multiprocessor, 4954.03 ms. using MPI on
the distributed memory multiprocessor and 565.56 ms. using
CUDA in our four GPUs, which results in speed factors of
35.44x and 8.75x, respectively.
Fundación Séneca 00001/CS/2007; Ministerio de Ciencia e Innovación TIN2009–13192; European Community CSD2006-00046; Junta de Andalucía P06-TIC-02109; Junta de Andalucía P08–TIC-0420
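The speed factors quoted in the abstract above are ratios of the reported execution times. A minimal sketch of that arithmetic follows; the dictionary and function names are illustrative and do not come from the paper. The computed ratios come out at roughly 35.45 and 8.76, which agrees with the reported 35.44x and 8.75x to within about 0.01.

```python
# Execution times in milliseconds, copied from the abstract above.
times_ms = {"OpenMP": 20049.70, "MPI": 4954.03, "CUDA": 565.56}


def speedup(baseline_ms: float, accelerated_ms: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    return baseline_ms / accelerated_ms


if __name__ == "__main__":
    # Compare the four-GPU CUDA run against each CPU-based platform.
    for platform in ("OpenMP", "MPI"):
        factor = speedup(times_ms[platform], times_ms["CUDA"])
        print(f"CUDA vs {platform}: {factor:.2f}x")
```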
The GPU on the simulation of cellular computing models
Membrane Computing is a discipline aiming to
abstract formal computing models, called membrane systems
or P systems, from the structure and functioning of the living
cells as well as from the cooperation of cells in tissues,
organs, and other higher order structures. This framework
provides polynomial time solutions to NP-complete problems
by trading space for time, and its efficient simulation
poses challenges in three different aspects: the intrinsic
massive parallelism of P systems, an exponential computational
workspace, and a non-intensive floating point nature.
In this paper, we analyze the simulation of a family of recognizer
P systems with active membranes that solves the
Satisfiability problem in linear time on different instances of
Graphics Processing Units (GPUs). For an efficient handling
of the exponential workspace created by the P systems
computation, we enable different data policies to increase
memory bandwidth and exploit data locality through tiling
and dynamic queues. Parallelism inherent to the target P
system is also managed to demonstrate that GPUs offer a
valid alternative for high-performance computing at a considerably
lower cost. Furthermore, scalability is demonstrated
on the way to the largest problem size we were able to
run, and considering the new hardware generation from
Nvidia, Fermi, for a total speed-up exceeding four orders of
magnitude when running our simulations on the Tesla S2050
server.
Agencia Regional de Ciencia y Tecnología - Murcia 00001/CS/2007; Ministerio de Ciencia e Innovación TIN2009–13192; Ministerio de Ciencia e Innovación TIN2009-14475-C04; European Commission Consolider Ingenio-2010 CSD2006-0004
Accelerating fibre orientation estimation from diffusion weighted magnetic resonance imaging using GPUs
With the performance of central processing units (CPUs) having effectively reached a limit, parallel processing offers an alternative for applications with high computational demands. Modern graphics processing units (GPUs) are massively parallel processors that can execute simultaneously thousands of light-weight processes. In this study, we propose and implement a parallel GPU-based design of a popular method that is used for the analysis of brain magnetic resonance imaging (MRI). More specifically, we are concerned with a model-based approach for extracting tissue structural information from diffusion-weighted (DW) MRI data. DW-MRI offers, through tractography approaches, the only way to study brain structural connectivity, non-invasively and in-vivo. We parallelise the Bayesian inference framework for the ball & stick model, as it is implemented in the tractography toolbox of the popular FSL software package (University of Oxford). For our implementation, we utilise the Compute Unified Device Architecture (CUDA) programming model. We show that the parameter estimation, performed through Markov Chain Monte Carlo (MCMC), is accelerated by at least two orders of magnitude, when comparing a single GPU with the respective sequential single-core CPU version. We also illustrate similar speed-up factors (up to 120x) when comparing a multi-GPU with a multi-CPU implementation
Implementing P Systems Parallelism by Means of GPUs
Software development for Membrane Computing is growing
and yielding new applications. Nowadays, the efficiency of P systems simulators
has become a critical point when working with instances of large
size. The newest generation of GPUs (Graphics Processing Units) provides
a massively parallel framework for general purpose computations.
We present GPUs as an alternative to obtain better performance
in the simulation of P systems and we illustrate it by giving a solution
to the N-Queens problem as an example.
Ministerio de Educación y Ciencia TIN2006-13425; Junta de Andalucía P08–TIC-0420
Simulating a P system based efficient solution to SAT by using GPUs
P systems are inherently parallel and non-deterministic theoretical computing devices defined inside the field of Membrane Computing. Many P system simulators have been presented in this area, but they are inefficient since they cannot handle the parallelism of these devices. Nowadays, we are witnessing the consolidation of the GPUs as a parallel framework to compute general purpose applications. In this paper, we analyse GPUs as an alternative parallel architecture to improve the performance in the simulation of P systems, and we illustrate it by using the case study of a family of P systems that provides an efficient and uniform solution to the SAT problem. Firstly, we develop a simulator that fully simulates the computation of the P system, demonstrating that GPUs are well suited to simulate them. Then, we adapt this simulator to the GPU architecture idiosyncrasies, improving the performance of the previous simulator.
Ministerio de Ciencia e Innovación TIN2009–13192; Junta de Andalucía P08–TIC-0420
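To make the idea of an efficient P system solution to SAT more concrete, here is a heavily simplified sketch of the general strategy used by P systems with active membranes: membrane division creates one membrane per truth assignment (2^n membranes after n steps), and every membrane then checks the formula. This is not the rule set from the paper. The names `divide`, `satisfies` and `solve_sat` are hypothetical, clauses use DIMACS-style signed integers as an assumption, and the check runs sequentially here where the P system would run it in parallel.

```python
from typing import List, Sequence, Tuple

Assignment = Tuple[bool, ...]
Clause = Sequence[int]  # assumed encoding: 3 means x3, -3 means "not x3"


def divide(membranes: List[Assignment]) -> List[Assignment]:
    """One division step: every membrane splits in two, fixing the next variable."""
    return [m + (value,) for m in membranes for value in (True, False)]


def satisfies(assignment: Assignment, clauses: Sequence[Clause]) -> bool:
    """Check that every clause contains at least one literal made true."""
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def solve_sat(n_vars: int, clauses: Sequence[Clause]) -> bool:
    """Generate 2**n_vars membranes by division, then check each one."""
    membranes: List[Assignment] = [()]
    for _ in range(n_vars):  # a linear number of division steps
        membranes = divide(membranes)
    # In the P system all membranes check at once; this loop is sequential.
    return any(satisfies(m, clauses) for m in membranes)


if __name__ == "__main__":
    # (x1 or x2) and (not x1)
    print(solve_sat(2, [[1, 2], [-1]]))
```

The number of division steps grows linearly with the number of variables, while the list of membranes doubles at each step. That exponential workspace is what a simulator has to store and process.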
Simulation of P systems with active membranes on CUDA
P systems or Membrane Systems provide a high-level computational modelling framework that
combines the structure and dynamic aspects of biological systems in a relevant and understandable way.
They are inherently parallel and non-deterministic computing devices. In this article, we discuss the
motivation, design principles and key aspects of the implementation of a simulator for the class of recognizer P
systems with active membranes running on a graphics processing unit (GPU). We compare our parallel simulator for GPUs to the
simulator developed for a single central processing unit (CPU), showing that GPUs are better suited than
CPUs to simulate P systems due to their highly parallel nature.
Ministerio de Educación y Ciencia TIN2006-13425; Junta de Andalucía P08–TIC-0420
A massively parallel framework using P systems and GPUs
Since the CUDA programming model appeared for
general purpose computations, developers can extract all
the power contained in GPUs (Graphics Processing Units) across
many computational domains. Among these domains, P systems
or membrane systems provide a high level computational modeling
framework that allows, in theory, to obtain polynomial
time solutions to NP-complete problems by trading space for
time, and also to model biological phenomena in the area of
computational systems biology. P systems are massively parallel
distributed devices and their computation can be divided into two
levels of parallelism: membranes, which can be expressed as blocks
in the CUDA programming model; and objects, which can be expressed
as threads in the CUDA programming model. In this paper, we
present our initial ideas for developing a simulator for the class of
recognizer P systems with active membranes by using the CUDA
programming model to exploit the massively parallel nature of
those systems to the maximum. Experimental results of a preliminary
version of our simulator on a Tesla C1060 GPU show a 60x
speed-up compared to the sequential code.
Ministerio de Educación y Ciencia TIN2006-13425; Junta de Andalucía P08–TIC-0420
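The two-level mapping described in this abstract (membranes to blocks, objects to threads) can be pictured with a small emulation in plain Python. This is only a sketch of the indexing idea, not the authors' simulator. `simulate_step` and `a_to_b` are hypothetical names, and the rule rewrites one object into exactly one object, which is simpler than real P system rules. The two loops stand in for the block and thread indices that a CUDA kernel would read from `blockIdx.x` and `threadIdx.x`.

```python
from typing import Callable, List

Rule = Callable[[int, int, str], str]


def simulate_step(membranes: List[List[str]], rule: Rule) -> List[List[str]]:
    """Emulate one kernel launch: one 'block' per membrane, one 'thread' per object."""
    next_membranes: List[List[str]] = []
    for block_idx, membrane in enumerate(membranes):  # plays the role of blockIdx.x
        next_membrane: List[str] = []
        for thread_idx, obj in enumerate(membrane):   # plays the role of threadIdx.x
            next_membrane.append(rule(block_idx, thread_idx, obj))
        next_membranes.append(next_membrane)
    return next_membranes


def a_to_b(block_idx: int, thread_idx: int, obj: str) -> str:
    """Hypothetical evolution rule: rewrite every object 'a' into 'b'."""
    return "b" if obj == "a" else obj


if __name__ == "__main__":
    print(simulate_step([["a", "c"], ["a"]], a_to_b))
```

On a GPU the iterations of both loops would run concurrently. The sequential loops here only show which unit of work each block and thread would own.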
Improving drug discovery using a neural networks based parallel scoring function
Virtual Screening (VS) methods can considerably aid clinical research, predicting how ligands interact with drug targets. Most VS methods suppose a unique binding site for the target, but it has been demonstrated that diverse ligands interact with unrelated parts of the target and many VS methods do not take into account this relevant fact. This problem is circumvented by a novel VS methodology named BINDSURF that scans the whole protein surface to find new hotspots, where ligands might potentially interact with, and which is implemented in massively parallel Graphics Processing Units, allowing fast processing of large ligand databases. BINDSURF can thus be used in drug discovery, drug design, drug repurposing and therefore helps considerably in clinical research. However, the accuracy of most VS methods is constrained by limitations in the scoring function that describes biomolecular interactions, and even nowadays these uncertainties are not completely understood. In order to solve this problem, we propose a novel approach where neural networks are trained with databases of known active (drugs) and inactive compounds, and later used to improve VS predictions.
This work has been jointly supported by the Fundación Séneca (Agencia Regional de Ciencia y Tecnología de la Región de Murcia) under grant 15290/PI/2010, by the Spanish MINECO and the European Commission FEDER funds under grants TIN2009-14475-C04 and TIN2012-31345, and by the Catholic University of Murcia (UCAM) under grant PMAFI/26/12. This work was partially supported by the computing facilities of Extremadura Research Centre for Advanced Technologies (CETA-CIEMAT), funded by the European Regional Development Fund (ERDF). CETA-CIEMAT belongs to CIEMAT and the Government of Spain.
Simulation of Recognizer P Systems by Using Manycore GPUs
Software development for cellular computing is growing up yielding new
applications. In this paper, we describe a simulator for the class of recognizer P systems
with active membranes, which exploits the massively parallel nature of the P systems
computations by using a massively parallel computer architecture, such as Compute
Unified Device Architecture (CUDA) from Nvidia, to obtain better performance in the
simulations. We illustrate it by giving a solution to the N-Queens problem as an example.
Ministerio de Educación y Ciencia TIN2006–13425; Junta de Andalucía P08–TIC0420
Analysis of P systems simulation on CUDA
GPUs (Graphics Processing Unit) have been consolidated
as a massively data-parallel coprocessor to
develop many general purpose computations, and enable
developers to utilize several levels of parallelism
to obtain better performance of their applications.
The massively parallel nature of certain computations
leads to use GPUs as an underlying architecture,
becoming a good alternative to other parallel
approaches. P systems or membrane systems
are theoretical devices inspired in the way that living
cells work, providing computational models and
a high level computational modeling framework for
biological systems. They are massively parallel distributed,
and non-deterministic systems. In this paper,
we evaluate the GPU as the underlying architecture
to simulate the class of recognizer P systems
with active membranes. We analyze the performance
of three simulators implemented on CPU, CPU-GPU
and GPU respectively. We compare them using a presented
P system as a benchmark, showing that the
GPU is better suited than the CPU to simulate those
P systems due to its massively parallel nature.
Ministerio de Educación y Ciencia TIN2006-13425; Junta de Andalucía P08–TIC0420