High Performance Computing of Gene Regulatory Networks using a Message-Passing Model
Gene regulatory network reconstruction is a fundamental problem in
computational biology. We recently developed an algorithm, called PANDA
(Passing Attributes Between Networks for Data Assimilation), that integrates
multiple sources of 'omics data and estimates regulatory network models. This
approach was initially implemented in the C++ programming language and has
since been applied to a number of biological systems. In our current research
we are beginning to expand the algorithm to incorporate larger and more diverse
data-sets, to reconstruct networks that contain increasing numbers of elements,
and to build not only single network models, but sets of networks. In order to
accomplish these "Big Data" applications, it has become critical that we
increase the computational efficiency of the PANDA implementation. In this
paper we show how to recast PANDA's similarity equations as matrix operations.
This allows us to implement a highly readable version of the algorithm using
the MATLAB/Octave programming language. We find that the resulting M-code is much
shorter (103 compared to 1128 lines) and more easily modifiable for potential
future applications. The new implementation also runs significantly faster,
with increasing efficiency as the network models increase in size. Tests
comparing the C-code and M-code versions of PANDA demonstrate that this
speed-up is on the order of 20-80 times for networks of similar
dimensions to those we find in current biological applications.
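As a rough illustration of what recasting a pairwise similarity as matrix operations can look like, here is a minimal sketch in Python. The abstract does not state PANDA's equations, so the Tanimoto-style similarity used here is an assumption, and the names `pairwise_loop`, `matmul`, and `pairwise_matrix` are hypothetical. The first function scores each pair separately; the second computes a single matrix product plus two vectors of squared norms and reuses them for every pair.

```python
import math

def pairwise_loop(A, B):
    """Score every (row of A, column of B) pair one at a time.

    Assumed similarity (not taken from the abstract):
        dot / sqrt(|a|^2 + |b|^2 - |dot|)
    """
    n, k, m = len(A), len(B), len(B[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            dot = sum(A[i][t] * B[t][j] for t in range(k))
            norm_a = sum(A[i][t] ** 2 for t in range(k))  # recomputed for every j
            norm_b = sum(B[t][j] ** 2 for t in range(k))  # recomputed for every i
            out[i][j] = dot / math.sqrt(norm_a + norm_b - abs(dot))
    return out

def matmul(A, B):
    """Plain matrix product; a numerical library would replace this in practice."""
    k, m = len(B), len(B[0])
    return [[sum(row[t] * B[t][j] for t in range(k)) for j in range(m)] for row in A]

def pairwise_matrix(A, B):
    """Same scores, written as one matrix product plus precomputed norms."""
    prod = matmul(A, B)                                   # all dot products at once
    row_sq = [sum(x * x for x in row) for row in A]       # one squared norm per row of A
    col_sq = [sum(B[t][j] ** 2 for t in range(len(B)))    # one squared norm per column of B
              for j in range(len(B[0]))]
    return [[prod[i][j] / math.sqrt(row_sq[i] + col_sq[j] - abs(prod[i][j]))
             for j in range(len(col_sq))]
            for i in range(len(row_sq))]

# Small made-up inputs, purely for illustration.
A = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 1.0]]
B = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]

loop_result = pairwise_loop(A, B)
matrix_result = pairwise_matrix(A, B)
```

Both functions return the same values on these inputs. In the matrix form the dominant cost is a single matrix product, which array-oriented environments such as MATLAB/Octave hand off to optimized linear-algebra routines.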
Mixing multi-core CPUs and GPUs for scientific simulation software
Recent technological and economic developments have led to widespread availability of
multi-core CPUs and specialist accelerator processors such as graphical processing units
(GPUs). The accelerated computational performance possible from these devices can be very
high for some application paradigms. Software languages and systems such as NVIDIA's
CUDA and the Khronos consortium's Open Computing Language (OpenCL) support a number of
individual parallel application programming paradigms. To scale up the performance of some
complex systems simulations, a hybrid of multi-core CPUs for coarse-grained parallelism and
very many core GPUs for data parallelism is necessary. We describe our use of hybrid
applications using threading approaches and multi-core CPUs to control independent GPU devices.
We present speed-up data and discuss multi-threading software issues for the application-level
programmer and offer some suggested areas for language development and integration
between coarse-grained and fine-grained multi-thread systems. We discuss results from three
common simulation algorithmic areas: partial differential equations; graph cluster
metric calculations and random number generation. We report on programming experiences
and selected performance for these algorithms on: single and multiple GPUs; multi-core CPUs;
a CellBE; and using OpenCL. We discuss programmer usability issues and the outlook and
trends in multi-core programming for scientific applications developers.
An Evaluation of the X10 Programming Language
As predicted by Moore's law, the number of transistors on a chip has doubled approximately every two years. For many years, the extra transistors benefited the whole computer industry because they were used to increase CPU clock speed, thus boosting performance. However, due to the heat wall and power constraints, clock speed cannot be increased without limit. Hardware vendors now have to take a different path, which is to use the transistors to increase the number of processor cores on each chip. This change in hardware structure presents unavoidable challenges to software structure: software written for a single thread will not benefit from newer chips and may even suffer from lower clock speeds. The two fundamental challenges are: (1) how to deal with the stagnation of single-core clock speed and cache memory, and (2) how to utilize the additional power provided by more cores on a chip. Most mainstream programming languages today, such as C and Java, have distributed computing support [1]. Meanwhile, some new programming languages have been designed from scratch to take advantage of more distributed hardware structures. The X10 Programming Language is one of them. The goal of this project is to evaluate X10 in terms of performance, programmability and tool support.