Hardware-accelerated parallel genetic algorithm for fitness functions with variable execution times
Genetic Algorithms (GAs) following a parallel master-slave architecture can be effectively used to reduce search time when fitness functions have a fixed execution time. This paper presents a parallel GA architecture along with two accelerated GA operators to enhance the performance of master-slave GAs, especially when considering fitness functions with variable execution times. We explore the performance of the proposed approach and analyse its effectiveness against the state of the art. The results show a significant improvement in search times and fitness function utilisation, thus potentially enabling the use of this approach as a faster search tool for timing-sensitive optimisation processes such as those found in dynamic real-time systems.
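To make the master-slave idea concrete, the following is a minimal software sketch, not the paper's hardware-accelerated architecture. It assumes a hypothetical `fitness` function whose run time depends on its input, and a master that collects scores in completion order so that a slow evaluation does not stall idle workers. The names `fitness`, `evaluate_population`, and the toy objective are illustrative assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def fitness(x):
    """Hypothetical fitness function whose run time depends on the input."""
    time.sleep(0.001 * (x % 5))   # simulated variable execution time
    return -(x - 7) ** 2          # toy objective with its maximum at x = 7

def evaluate_population(population, workers=4):
    """Master: hand individuals to workers and collect scores as they finish."""
    scores = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker picks up the next individual as soon as it is free.
        futures = {pool.submit(fitness, ind): ind for ind in population}
        for done in as_completed(futures):
            scores[futures[done]] = done.result()
    return scores

population = list(range(12))
scores = evaluate_population(population)
best = max(scores, key=scores.get)
```

With fixed-time fitness functions a static split of the population across workers is enough; the dynamic hand-out above is one simple way to keep workers busy when evaluation times vary.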
Paraiso : An Automated Tuning Framework for Explicit Solvers of Partial Differential Equations
We propose Paraiso, a domain-specific language embedded in the functional
programming language Haskell, for automated tuning of explicit solvers of
partial differential equations (PDEs) on GPUs as well as multicore CPUs. In
Paraiso, one can describe PDE solving algorithms succinctly using tensor
equations notation. Hydrodynamic properties, interpolation methods and other
building blocks are described in abstract, modular, re-usable and combinable
forms, which lets us generate versatile solvers from a small set of Paraiso
source code.
We demonstrate Paraiso by implementing a compressive hydrodynamics solver. A
single source file of fewer than 500 lines can be used to generate solvers of
arbitrary dimensions, for both multicore CPUs and GPUs. We demonstrate both
manual annotation-based tuning and evolutionary computing-based automated
tuning of the program.
Comment: 52 pages, 14 figures, accepted for publication in Computational
Science and Discover
Integration through genetic programming on heterogeneous systems.
Nowadays, numerous applications in various scientific fields require the integration of mathematical functions whose antiderivatives, due to some of the functions' characteristics, have no analytical expression. These definite integrals are usually solved by numerical integration methods, which provide an approximation of the numerical value of the integral over the integration range. With this type of solution, a higher precision of the approximation entails a longer computation time, so a trade-off between the two is necessary. In this work we present a genetic programming algorithm which provides mathematical expressions that approximate the antiderivative of analytically non-integrable functions. Heterogeneous devices, GPU and multicore CPU, have also been used in the development of the system to accelerate the parts suited to them. The advantage of obtaining these approximate antiderivatives is the reduction of the computation time necessary to calculate the definite integral of the functions of interest, reducing it to simply evaluating the expression at the beginning and the end of the integration range.
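The cost argument can be illustrated with a small sketch. It assumes the integrand exp(-x^2), which has no elementary antiderivative, and uses `math.erf` purely as a stand-in for an expression that a genetic programming run might return; the function names and the step count are illustrative assumptions, not taken from the paper.

```python
import math

def f(x):
    """Integrand with no elementary antiderivative."""
    return math.exp(-x * x)

def F(x):
    """Stand-in for an evolved antiderivative expression (assumption)."""
    return math.sqrt(math.pi) / 2.0 * math.erf(x)

def trapezoid(func, a, b, n):
    """Composite trapezoidal rule: cost grows with the number of steps n."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

a, b = 0.0, 1.0
by_antiderivative = F(b) - F(a)          # two evaluations, independent of precision
by_trapezoid = trapezoid(f, a, b, 1000)  # roughly a thousand evaluations
```

Once an antiderivative expression is available, every further definite integral of the same function costs two evaluations, whereas the numerical rule must be rerun for each range and tightened for each precision target.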
JPEG steganography with particle swarm optimization accelerated by AVX
Digital steganography aims at hiding secret messages in digital data transmitted over insecure channels. The JPEG format is prevalent in digital communication, and images are often used as cover objects in digital steganography. Optimization methods can improve the properties of images with embedded secret but introduce additional computational complexity to their processing. AVX instructions available in modern CPUs are, in this work, used to accelerate data parallel operations that are part of image steganography with advanced optimizations. Web of Science 328 art. no. e544
High performance graph analysis on parallel architectures
PhD Thesis. Over the last decade pharmacology has been developing computational
methods to enhance drug development and testing. A computational
method called network pharmacology uses graph analysis
tools to determine protein target sets that can lead to better-targeted
drugs for diseases such as cancer. One promising area of network-based
pharmacology is the detection of protein groups that can produce
better effects if they are targeted together by drugs. However, the
efficient prediction of such protein combinations is still a bottleneck
in the area of computational biology.
The computational burden of the algorithms used by such protein
prediction strategies to characterise the importance of such proteins
constitutes an additional challenge for the field of network pharmacology.
Computationally expensive graph algorithms such as the all pairs
shortest path (APSP) computation can affect the overall drug discovery
process, as the needed network analysis results cannot be delivered on
time. An ideal solution for these highly intensive computations could
be the use of super-computing. However, graph algorithms have data-driven
computation dictated by the structure of the graph, and this
can lead to low compute capacity utilisation with execution times
dominated by memory latency.
Therefore, this thesis seeks optimised solutions for the real-world
graph problems of critical node detection and effectiveness characterisation
that emerged from the collaboration with a pioneer company in the
field of network pharmacology as part of a Knowledge Transfer Partnership
(KTP) / Secondment (KTS). In particular, we examine how
genetic algorithms could benefit the prediction of protein complexes
whose removal could produce a more effective 'druggable' impact.
Furthermore, we investigate how the problem of all pairs shortest
path (APSP) computation can benefit from the use of emerging
parallel hardware architectures such as GPU- and FPGA-based desktop
accelerators.
In particular, we address the problem of critical node detection with
the development of a heuristic search method. It is based on a genetic
algorithm that computes optimised node combinations whose removal
causes greater impact than those found by common impact analysis strategies.
Furthermore, we design a general pattern for parallel network analysis
on multi-core architectures that considers the graph's embedded properties.
It is a divide and conquer approach that decomposes a graph
into smaller subgraphs based on its strongly connected components
and computes the all pairs shortest paths concurrently on GPU. Furthermore,
we use linear algebra to design an APSP approach based
on the BFS algorithm. We use algebraic expressions to transform the
problem of path computation to multiple independent matrix-vector
multiplications that are executed concurrently on FPGA. Finally, we
analyse how the optimised solutions of perturbation analysis and parallel
graph processing provided in this thesis will impact the drug
discovery process. This research was part of a Knowledge Transfer Partnership (KTP)
and Knowledge Transfer Secondment (KTS) between e-therapeutics
PLC and Newcastle University. It was supported as a collaborative
project by e-therapeutics PLC and Technology Strategy boar
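The algebraic view of BFS-based APSP mentioned in the abstract can be sketched in a few lines. This is a plain software illustration under stated assumptions (an unweighted directed graph stored as a dense 0/1 adjacency matrix), not the thesis's FPGA design; `bfs_distances` and `apsp` are hypothetical names. Each BFS level is one Boolean matrix-vector product, and the runs for different sources are independent of one another.

```python
def bfs_distances(adj, source):
    """Hop distances from `source`, one Boolean matrix-vector product per level."""
    n = len(adj)
    dist = [-1] * n               # -1 marks "not reached"
    dist[source] = 0
    frontier = [0] * n
    frontier[source] = 1
    level = 0
    while any(frontier):
        level += 1
        # nxt[j] = OR over i of (frontier[i] AND adj[i][j])
        nxt = [int(any(frontier[i] and adj[i][j] for i in range(n)))
               for j in range(n)]
        # Keep only vertices that have not been reached yet.
        frontier = [nxt[j] if dist[j] == -1 else 0 for j in range(n)]
        for j in range(n):
            if frontier[j]:
                dist[j] = level
    return dist

def apsp(adj):
    """All pairs shortest paths: one independent BFS per source vertex."""
    return [bfs_distances(adj, s) for s in range(len(adj))]

# Directed 4-cycle: 0 -> 1 -> 2 -> 3 -> 0
adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
distances = apsp(adj)
```

Because no source's computation depends on another's, the per-source products are natural candidates for concurrent execution on parallel hardware.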
Digital Ecosystems: Ecosystem-Oriented Architectures
We view Digital Ecosystems to be the digital counterparts of biological
ecosystems. Here, we are concerned with the creation of these Digital
Ecosystems, exploiting the self-organising properties of biological ecosystems
to evolve high-level software applications. Therefore, we created the Digital
Ecosystem, a novel optimisation technique inspired by biological ecosystems,
where the optimisation works at two levels: a first optimisation, migration of
agents which are distributed in a decentralised peer-to-peer network, operating
continuously in time; this process feeds a second optimisation based on
evolutionary computing that operates locally on single peers and is aimed at
finding solutions to satisfy locally relevant constraints. The Digital
Ecosystem was then measured experimentally through simulations, with measures
originating from theoretical ecology, evaluating its likeness to biological
ecosystems. This included its responsiveness to requests for applications from
the user base, as a measure of the ecological succession (ecosystem maturity).
Overall, we have advanced the understanding of Digital Ecosystems, creating
Ecosystem-Oriented Architectures where the word ecosystem is more than just a
metaphor. Comment: 39 pages, 26 figures, journa
High-Performance Parallel Implementation of Genetic Algorithm on FPGA
Genetic algorithms (GAs) are used to solve search and optimization problems in which an optimal solution can be found using an iterative process with probabilistic and non-deterministic transitions. However, depending on the problem’s nature, the time required to find a solution can be high on sequential machines due to the computational complexity of genetic algorithms. This work proposes a full-parallel implementation of a genetic algorithm on a field-programmable gate array (FPGA). Optimization of the system’s processing time is the main goal of this project. Results associated with the processing time and area occupancy (on FPGA) for various population sizes are analyzed. The accuracy of the GA response for the optimization of two-variable functions was also evaluated for the hardware implementation. However, the high-performance implementation proposed in this paper is able to work with more variables given some adjustments to the hardware architecture. The results showed that the GA full-parallel implementation achieved a throughput of about 16 million generations per second and speedups between 17 and 170,000 relative to several works proposed in the literature.
GPU Implementation of DPSO-RE Algorithm for Parameters Identification of Surface PMSM Considering VSI Nonlinearity
In this paper, an accurate parameter estimation model of surface permanent magnet synchronous machines (SPMSMs) is established by taking into account voltage-source-inverter (VSI) nonlinearity. A fast dynamic particle swarm optimization (DPSO) algorithm combined with a receptor editing (RE) strategy is proposed to explore the optimal values of parameter estimations. This combination provides an accelerated implementation on a graphics processing unit (GPU), and the proposed method is, therefore, referred to as G-DPSO-RE. In G-DPSO-RE, a dynamic labor division strategy is incorporated into the swarms according to the designed evolutionary factor during the evolution process. Two novel modifications of the movement equation are designed to update the velocity of particles. Moreover, a chaotic-logistic-based immune RE operator is developed to help the global best individual (gBest particle) explore a potentially better region. Furthermore, a GPU parallel acceleration technique is utilized to speed up the parameter estimation procedure. It has been demonstrated that the proposed method is effective for simultaneous estimation of the PMSM parameters and the disturbance voltage (Vdead) due to VSI nonlinearity from experimental data for currents and rotor speed measured with inexpensive equipment. The influence of the VSI nonlinearity on the accuracy of parameter estimation is analyzed.