OpenCL Actors - Adding Data Parallelism to Actor-based Programming with CAF

Author: A Klöckner
D Charousset
G Agha
G Agha
J Nickolls
JD Owens
K Wu
L Dagum
S Srinivasan
S Wienke
T Desell
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

The actor model of computation has been designed for seamless support of concurrency and distribution. However, it remains unspecific about data-parallel program flows, while the available processing power of modern many-core hardware such as graphics processing units (GPUs) or coprocessors increases the relevance of data parallelism for general-purpose computation. In this work, we introduce OpenCL-enabled actors to the C++ Actor Framework (CAF). This offers a high-level interface for accessing any OpenCL device without leaving the actor paradigm. The new type of actor is integrated into the runtime environment of CAF and gives rise to transparent message passing in distributed systems on heterogeneous hardware. Following the actor logic in CAF, OpenCL kernels can be composed while encapsulated in C++ actors, and hence operate in a multi-stage fashion on data resident on the GPU. Developers are thus enabled to build complex data-parallel programs from primitives without leaving the actor paradigm or sacrificing performance. Our evaluations on commodity GPUs, an Nvidia TESLA, and an Intel PHI reveal the expected linear scaling behavior when offloading larger workloads. For sub-second duties, the efficiency of offloading was found to differ widely between devices. Moreover, our findings indicate a negligible overhead over programming with the native OpenCL API. Comment: 28 pages

arXiv.org e-Print Archive

Crossref

REPOSIT

Comparison of Parallelisation Approaches, Languages, and Compilers for Unstructured Mesh Algorithms on GPUs

Author: A Hart
D Dutykh
G Ruetsch
I Karlin
IZ Reguly
J Gong
J Nickolls
JE Stone
M Martineau
M Norman
MB Giles
S Wienke
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Efficiently exploiting GPUs is increasingly essential in scientific computing, as many current and upcoming supercomputers are built using them. To facilitate this, there are a number of programming approaches, such as CUDA, OpenACC and OpenMP 4, supporting different programming languages (mainly C/C++ and Fortran). There are also several compiler suites (clang, nvcc, PGI, XL), each supporting different combinations of languages. In this study, we take a detailed look at some of the currently available options, and carry out a comprehensive analysis and comparison using computational loops and applications from the domain of unstructured mesh computations. Beyond runtimes and performance metrics (GB/s), we explore factors that influence performance such as register counts, occupancy, usage of different memory types, instruction counts, and algorithmic differences. The results show that clang's CUDA compiler frequently outperforms NVIDIA's nvcc, that directive-based approaches have performance issues on complex kernels, and that OpenMP 4 support is maturing in clang and XL, where it is currently around 10% slower than CUDA.

arXiv.org e-Print Archive

Crossref

Warwick Research Archives Portal Repository

Repository of the Academy's Library

GPU-Based Data Processing for 2-D Microwave Imaging on MAST

Author: BITTNER R.
CASTRO R.
DAVIS W. M.
EIDIETIS N. W.
FREETHY S. J.
FREETHY S. J.
GARELLI N.
HUANG B. K.
LUJAN P.
MONTEIRO E.
NAVARRO C. A.
NAYLOR G. A.
NICKOLLS J.
OWENS J. D.
PELL O.
SALMON N. A.
SHEVCHENKO V. F.
SHEVCHENKO V. F.
THOMAS D. A.
THOUTI K.
URBAN J.
VAN CITTERT P. H.
VERMIJ E.
WYNTERS E.
XU C.
YANG L.
YUE X.
ZERNIKE F.
Publication venue: 'American Nuclear Society'
Publication date: 07/04/2016

The Synthetic Aperture Microwave Imaging (SAMI) diagnostic is a Mega Amp Spherical Tokamak (MAST) diagnostic based at Culham Centre for Fusion Energy. The acceleration of the SAMI diagnostic data-processing code by a graphics processing unit is presented, demonstrating acceleration of up to 60 times compared to the original IDL (Interactive Data Language) data-processing code. SAMI will now be capable of intershot processing, allowing pseudo-real-time control so that adjustments and optimizations can be made between shots. Additionally, for the first time, the analysis of many shots will be possible.

Durham Research Online

Crossref

White Rose Research Online

Accelerated large-scale multiple sequence alignment

Author: A Szalkowski
A Wilm
A Wirawan
AV Bhatt
C Grasso
C Notredame
D Mikhailov
DF Feng
E Eskin
G Tan
GM Amdahl
H Carroll
H Vandierendonck
I Letunic
J Cheetham
J Ebedes
J Nickolls
JD Thompson
JD Thompson
JD Thompson
K Katoh
KB Li
M Farrar
M Feldman
M Friedman
OpenMP
Quinn O Snell
RC Edgar
S Lloyd
S Washietl
Scott Lloyd
SR Eddy
T Lassmann
T Oliver
T Ramdas
T Wang
X Deng
X Lin
Y Li
Y Liu
Y Liu
Publication venue: BioMed Central
Publication date: 01/12/2011

Background: Multiple sequence alignment (MSA) is a fundamental analysis method used in bioinformatics and many comparative genomic applications. Prior MSA acceleration attempts with reconfigurable computing have only addressed the first stage of progressive alignment and consequently exhibit performance limitations according to Amdahl's Law. This work is the first known to accelerate the third stage of progressive alignment on reconfigurable hardware. Results: We reduce subgroups of aligned sequences into discrete profiles before they are pairwise aligned on the accelerator. Using an FPGA accelerator, an overall speedup of up to 150 has been demonstrated on a large data set when compared to a 2.4 GHz Core2 processor. Conclusions: Our parallel algorithm and architecture accelerate large-scale MSA with reconfigurable computing and allow researchers to solve the larger problems that confront biologists today. Program source is available from http://dna.cs.byu.edu/msa/.
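The Amdahl's Law limitation mentioned in this abstract can be illustrated with a minimal sketch. The function name and the runtime fractions below are hypothetical values chosen for illustration; the abstract does not report how the runtime is split across the alignment stages.

```python
def amdahl_speedup(accelerated_fraction: float, stage_speedup: float) -> float:
    """Overall speedup when only a fraction of the runtime is accelerated.

    accelerated_fraction: share of the original runtime spent in the
        accelerated stage (between 0 and 1); a hypothetical input here.
    stage_speedup: speedup achieved for that stage alone (greater than 0).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must lie in [0, 1]")
    if stage_speedup <= 0.0:
        raise ValueError("stage_speedup must be positive")
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / stage_speedup)


if __name__ == "__main__":
    # Illustrative fractions only: the larger the accelerated share,
    # the closer the overall speedup gets to the stage speedup.
    for fraction in (0.30, 0.90, 0.99):
        overall = amdahl_speedup(fraction, stage_speedup=150.0)
        print(f"fraction={fraction:.2f} overall speedup={overall:.2f}")
```

Under these assumptions, accelerating a stage that accounts for a small share of the runtime caps the overall gain at 1 / (1 - fraction), however fast the accelerator is.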

Crossref

Directory of Open Access Journals

PubMed Central

Vibration-induced extra torque during electrically-evoked contractions of the human calf muscles

Background: High-frequency trains of electrical stimulation applied over the lower limb muscles can generate forces higher than would be expected from a peripheral mechanism (i.e. by direct activation of motor axons). This phenomenon presumably originates within the central nervous system through synaptic input from Ia afferents to motoneurons and is consistent with the development of plateau potentials. The first objective of this work was to investigate whether vibration (sinusoidal or random) applied to the Achilles tendon is also able to generate large-magnitude extra torques in the triceps surae muscle group. The second objective was to verify whether the extra torques that were found were accompanied by increases in motoneuron excitability. Methods: Subjects (n = 6) were seated on a chair and the right foot was strapped to a pedal attached to a torque meter. The isometric ankle torque was measured in response to different patterns of coupled electrical (20-Hz, rectangular 1-ms pulses) and mechanical stimuli (either 100-Hz sinusoid or Gaussian white noise) applied to the triceps surae muscle group. In an additional investigation, Mmax and F-waves were elicited at different times before or after the vibratory stimulation. Results: The vibratory bursts could generate substantial self-sustained extra torques, either with or without the background 20-Hz electrical stimulation applied simultaneously with the vibration. The extra torque generation was accompanied by increased motoneuron excitability, since an increase in the peak-to-peak amplitude of soleus F-waves was observed. The delivery of electrical stimulation following the vibration was essential to maintain the extra torques and increased F-waves. Conclusions: These results show that vibratory stimuli applied with a background electrical stimulation generate considerable force levels (up to about 50% MVC) due to the spinal recruitment of motoneurons. The association of vibration and electrical stimulation could be beneficial for many therapeutic interventions and vibration-based exercise programs. The command for the vibration-induced extra torques presumably activates spinal motoneurons following the size principle, which is a desirable feature for stimulation paradigms.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Universidade de São Paulo

Low-Latency Elliptic Curve Scalar Multiplication

Author: A.K. Lenstra
D. Blythe
D.A. Patterson
D.J. Bernstein
E. Lindholm
H.L. Garner
H.M. Edwards
H.W. Lenstra Jr
J. Nickolls
J.H. Silverman
Joppe W. Bos
M. Garland
M. Segal
N. Koblitz
P.L. Montgomery
R.D. Merrill
R.L. Rivest
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref

Comparative evaluation of platforms for parallel Ant Colony Optimization

Author: A Delévacq
Antonio Llanes
AR Brodtkorb
B Yu
BP Flannery
BR Ke
E Alba
Ginés D. Guerrero
J Nickolls
JE Stone
JM Cecilia
José M. Cecilia
José M. García
KY Komarudin Wong
M Dorigo
M Dorigo
M Dorigo
M Pedemonte
Manuel Ujaldón
Martyn Amos
MP Garcia
RSS Chang
T Stutzle
Y Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

The rapidly growing field of nature-inspired computing concerns the development and application of algorithms and methods based on biological or physical principles. This approach is particularly compelling for practitioners in high-performance computing, as natural algorithms are often inherently parallel (for example, they may be based on a “swarm”-like model that uses a population of agents to optimize a function). Coupled with rising interest in nature-based algorithms is the growth in heterogeneous computing: systems that use more than one kind of processor. We are therefore interested in the performance characteristics of nature-inspired algorithms on a number of different platforms. To this end, we present a new OpenCL-based implementation of the Ant Colony Optimization algorithm, and use it as the basis of extensive experimental tests. We benchmark the algorithm against existing implementations, on a wide variety of hardware platforms, and offer extensive analysis. This work provides rigorous foundations for future investigations of Ant Colony Optimization on high-performance platforms.

Institutional Repository UCAM

Crossref

Northumbria Research Link

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

E-space: Manchester Metropolitan University's Research Repository

Chemogenomic Analysis of G-Protein Coupled Receptors and Their Ligands Deciphers Locks and Keys Governing Diverse Aspects of Signalling

Author: A Ghose
A Zurn
AJ Harmar
AL Hopkins
Antonius ter Laak
AO Chugunov
AV Smrcka
B Kobilka
B Kuhn
BK Kobilka
CJ van Koppen
CL Worth
D Massotte
D Verzijl
DE Gloriam
DM Perez
DM Rosenbaum
E Jacoby
E Urizar
F Horn
G Kleinau
G Kleinau
G Kleinau
G Vauquelin
Gerd Krause
GF Schertler
Gunnar Kleinau
H Kubinyi
H Unal
HE van der
ID Pogozheva
J Hu
J Huang
J Huynh
J Wess
J Wess
JA Ballesteros
JD Tyndall
JH Park
JR Raymond
JS Surgand
Jörg D. Wichard
K Balakin
K Hogan
K Kristiansen
K Wenzel-Seifert
K Ye
KP Hofmann
L Hu
L Jacob
L Lowell
L Oliveira
L Pardo
Leo Lee
LH Hall
M Conner
MA Cascieri
MA Hanson
MC Lagerstrom
MJ Keiser
MJ Smit
MP Bokoch
N Lehmann
N Van Eps
Nikolaus Heinrich
OM Becker
P Scheerer
P Scheerer
P Tarnow
PD Evans
PW Hildebrand
Q Jiang
R Fredriksson
R Seifert
R Steuer
R Todeschini
RB Westkaemper
RD Finn
Ronald Kühne
RP Bywater
RT Dorsam
S Ahuja
S Bhattacharya
S Moller
S Robb
S Schlyer
S Ye
SA Nickolls
SK Wong
SN Fatakia
SP Runyon
SR Eddy
T Klabunde
T Schoneberg
T Schoneberg
TE Angel
TK Bjarnadottir
TW Schwartz
V Cherezov
VP Jaakola
W Li
WM Oldham
Y Okuno
Y Sun
YK Yang
Publication venue: Public Library of Science
Publication date: 01/01/2011

Understanding the molecular mechanism of signalling in the important super-family of G-protein-coupled receptors (GPCRs) is causally related to questions of how and where these receptors can be activated or inhibited. In this context, it is of great interest to unravel the common molecular features of GPCRs as well as those related to an active or inactive state or to subtype-specific G-protein coupling. In our underlying chemogenomics study, we analyse for the first time the statistical link between the properties of G-protein-coupled receptors and GPCR ligands. The technique of mutual information (MI) is able to reveal statistical inter-dependence between variations in amino acid residues on the one hand and variations in ligand molecular descriptors on the other. Although this MI analysis uses novel information that differs from the results of known site-directed mutagenesis studies or published GPCR crystal structures, the method is capable of identifying the well-known common ligand binding region of GPCRs between the upper part of the seven transmembrane helices and the second extracellular loop (ECL2). The analysis shows amino acid positions that are sensitive to either stimulating (agonistic) or inhibitory (antagonistic) ligand effects or both. It appears that amino acid positions for antagonistic and agonistic effects are both concentrated around the extracellular region, but selective agonistic effects accumulate between transmembrane helices (TMHs) 2, 3, and ECL2, while selective residues for antagonistic effects are located at the top of helices 5 and 6. Above all, the MI analysis provides detailed indications about amino acids located in the transmembrane region of these receptors that determine G-protein signalling pathway preferences.
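The mutual information measure named in this abstract can be sketched for two discrete variables as follows. This is a plain plug-in estimate over hypothetical toy data (residue letters at one alignment position, paired with a binned ligand descriptor); the abstract does not state which estimator or descriptors the study used.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two
    equally long sequences of discrete observations."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and equally long")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    # Sum p(x, y) * log2( p(x, y) / (p(x) * p(y)) ) over observed pairs.
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


if __name__ == "__main__":
    # Hypothetical data: residue at one position for four receptors,
    # and a ligand descriptor binned into "hi" / "lo" for the same receptors.
    residues = ["A", "A", "G", "G"]
    dependent = ["hi", "hi", "lo", "lo"]
    independent = ["hi", "lo", "hi", "lo"]
    print(mutual_information(residues, dependent))
    print(mutual_information(residues, independent))
```

A position whose residue variation tracks the descriptor scores high, while a position that varies independently of it scores near zero, which is the kind of contrast such an analysis relies on.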

Crossref

Directory of Open Access Journals

PubMed Central

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)

CMIP: a software package capable of reconstructing genome-wide regulatory networks using gene expression data

Author: A Honkela
AA Margolin
AA Margolin
AC Haury
AL Barabasi
AN Brooks
B Usadel
D Angeli
D Braha
D Marbach
D Marbach
F Liu
Guangyong Zheng
I Cantone
J Nickolls
J Zhao
JJ Faith
L Chen
LE Brown
Luonan Chen
M Chevalier
M Grieb
M Zou
N Friedman
N Kramer
PE Meyer
R Bonneau
R Liu
R Ming
R Rabenseifner
S Kauffman
TS Gardner
W Ma
X Yu
X Zhang
X Zhang
X Zhang
X Zhang
Xin-Guang Zhu
Xiujun Zhang
Y Artzy-Randrup
Y Wang
Yaochen Xu
Zhi-Ping Liu
Zhuo Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref