
    Accelerated Event-by-Event Neutrino Oscillation Reweighting with Matter Effects on a GPU

    Oscillation probability calculations are becoming increasingly CPU intensive in modern neutrino oscillation analyses. The independence of the reweighting of individual events in a Monte Carlo sample lends itself to a parallel implementation on a Graphics Processing Unit. The library "Prob3++" was ported to the GPU using the CUDA C API, allowing for large-scale parallelized calculation of neutrino oscillation probabilities through matter of constant density and decreasing the execution time by a factor of 75 compared to performance on a single CPU. Comment (post-submission update): quantified the difference in event rates for binned and event-by-event reweighting with a typical binning scheme; improved the formatting of the references.
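    To make the per-event parallelism concrete, here is a minimal two-flavor sketch of constant-density matter oscillation reweighting in NumPy. It is not the full three-flavor calculation that Prob3++ performs, and the oscillation parameters, density, baseline and function names are illustrative assumptions; on a GPU each Monte Carlo event would map to one thread.
```python
# Minimal two-flavor sketch of event-by-event oscillation reweighting with a
# constant-density matter effect. Prob3++ performs the full three-flavor
# calculation; this only illustrates the per-event data parallelism
# (vectorized with NumPy here, one thread per event on a GPU).
import numpy as np

def osc_weight(E_gev, L_km, sin2_2theta=0.085, dm2_ev2=2.5e-3,
               rho_gcc=2.6, Ye=0.5, antineutrino=False):
    """Two-flavor appearance probability in matter of constant density
    (illustrative parameter values, not taken from the paper)."""
    # Matter term A = 2*sqrt(2)*G_F*N_e*E, expressed in eV^2
    A = 1.52e-4 * Ye * rho_gcc * E_gev
    if antineutrino:
        A = -A
    cos2t = np.sqrt(1.0 - sin2_2theta)
    denom = (cos2t - A / dm2_ev2) ** 2 + sin2_2theta
    sin2_2theta_m = sin2_2theta / denom        # effective mixing in matter
    dm2_m = dm2_ev2 * np.sqrt(denom)           # effective mass splitting
    # 1.267 converts dm2[eV^2] * L[km] / E[GeV] into the oscillation phase
    return sin2_2theta_m * np.sin(1.267 * dm2_m * L_km / E_gev) ** 2

# One weight per Monte Carlo event; every event is independent, which is
# what makes the reweighting embarrassingly parallel.
E = np.random.uniform(0.5, 5.0, size=1_000_000)   # event energies in GeV
weights = osc_weight(E, L_km=295.0)
```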

    Parallel perfusion imaging processing using GPGPU

    Background and purpose: The objective of brain perfusion quantification is to generate parametric maps of relevant hemodynamic quantities such as cerebral blood flow (CBF), cerebral blood volume (CBV) and mean transit time (MTT) that can be used in the diagnosis of acute stroke. These calculations involve deconvolution operations that can be very computationally expensive when using local Arterial Input Functions (AIF). As time is vitally important in the case of acute stroke, reducing the analysis time will reduce the number of brain cells damaged and increase the potential for recovery.
    Methods: GPUs originated as co-processors dedicated to graphics generation, but modern GPUs have evolved into more general processors capable of executing scientific computations. They provide a highly parallel computing environment due to their large number of computing cores and constitute an affordable high-performance computing method. In this paper, we present the implementation of a deconvolution algorithm for brain perfusion quantification on GPGPU (General-Purpose Graphics Processing Units) using the CUDA programming model. We present the serial and parallel implementations of such algorithms and evaluate the performance gains obtained with GPUs.
    Results: Our method achieved speedups of 5.56 and 3.75 for CT and MR images, respectively.
    Conclusions: It seems that using GPGPU is a desirable approach in perfusion imaging analysis, which does not harm the quality of the cerebral hemodynamic maps but delivers results faster than the traditional computation.
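    As an illustration of the per-voxel computation being parallelized, the sketch below implements a standard truncated-SVD deconvolution of a tissue concentration curve against an AIF. The function name, SVD threshold and unit handling are assumptions rather than the paper's exact algorithm; in the GPU version one such deconvolution would run per voxel.
```python
# Sketch of a standard truncated-SVD deconvolution for perfusion
# quantification; illustrative only, not the paper's implementation.
import numpy as np

def deconvolve_voxel(tissue_curve, aif, dt, svd_threshold=0.2):
    """Recover the scaled residue function k(t) ~ CBF * R(t) for one voxel
    by solving aif (*) k = tissue_curve with a regularized pseudo-inverse."""
    aif = np.asarray(aif, dtype=float)
    c = np.asarray(tissue_curve, dtype=float)
    n = len(aif)
    # Lower-triangular Toeplitz (discrete convolution) matrix from the AIF
    A = np.zeros((n, n))
    for i in range(n):
        A[i, :i + 1] = aif[i::-1]
    A *= dt
    # Truncated SVD: zero out small singular values to stabilize the inverse
    U, s, Vt = np.linalg.svd(A)
    s_inv = np.where(s > svd_threshold * s.max(), 1.0 / s, 0.0)
    k = Vt.T @ (s_inv * (U.T @ c))
    cbf = k.max()                                    # peak of residue function
    cbv = np.trapz(c, dx=dt) / np.trapz(aif, dx=dt)  # area ratio
    mtt = cbv / cbf if cbf > 0 else np.nan
    return cbf, cbv, mtt
```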

    An integrated approach to correction for off-resonance effects and subject movement in diffusion MR imaging

    In this paper we describe a method for retrospective estimation and correction of eddy current (EC)-induced distortions and subject movement in diffusion imaging. In addition, a susceptibility-induced field can be supplied and will be incorporated into the calculations in a way that accurately reflects that the two fields (susceptibility- and EC-induced) behave differently in the presence of subject movement. The method is based on registering the individual volumes to a model-free prediction of what each volume should look like, thereby enabling its use on high b-value data where the contrast is vastly different in different volumes. In addition, we show that the linear EC model commonly used is insufficient for the data used in the present paper (high spatial and angular resolution data acquired with Stejskal–Tanner gradients on a 3 T Siemens Verio, a 3 T Siemens Connectome Skyra or a 7 T Siemens Magnetom scanner) and that a higher-order model performs significantly better. The method is already in extensive practical use and is employed by four major projects (the WU-UMinn HCP, the MGH HCP, the UK Biobank and the Whitehall studies) to correct for distortions and subject movement.
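    The contrast between the linear and a higher-order EC model can be sketched as a polynomial fit of the off-resonance field to the voxel coordinates. The helper names below are hypothetical, and the sketch omits the registration to the model-free prediction that the actual method performs; it only shows why adding higher-order spatial terms can reduce the residual field.
```python
# Illustrative comparison of a linear vs. higher-order eddy-current field
# model: the off-resonance field is expressed as a polynomial in the voxel
# coordinates and fitted by least squares (assumed helper names).
import numpy as np
from itertools import combinations_with_replacement

def design_matrix(coords, order):
    """Polynomial basis in the voxel coordinates (x, y, z) up to `order`.
    order=1 reproduces the common linear EC model; order=2 adds the
    quadratic terms of a higher-order model."""
    coords = np.asarray(coords, dtype=float)
    cols = [np.ones(len(coords))]
    for deg in range(1, order + 1):
        for combo in combinations_with_replacement(range(3), deg):
            cols.append(np.prod(coords[:, list(combo)], axis=1))
    return np.column_stack(cols)

def fit_ec_field(coords, observed_field, order):
    """Least-squares fit of the observed off-resonance field to the chosen
    model; comparing residuals for order=1 and order=2 mirrors the model
    comparison described in the abstract."""
    X = design_matrix(coords, order)
    beta, *_ = np.linalg.lstsq(X, observed_field, rcond=None)
    return beta, X @ beta   # coefficients and fitted field
```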

    Accelerated large-scale multiple sequence alignment

    Background: Multiple sequence alignment (MSA) is a fundamental analysis method used in bioinformatics and many comparative genomic applications. Prior MSA acceleration attempts with reconfigurable computing have only addressed the first stage of progressive alignment and consequently exhibit performance limitations according to Amdahl's Law. This work is the first known to accelerate the third stage of progressive alignment on reconfigurable hardware.
    Results: We reduce subgroups of aligned sequences into discrete profiles before they are pairwise aligned on the accelerator. Using an FPGA accelerator, an overall speedup of up to 150 has been demonstrated on a large data set when compared to a 2.4 GHz Core2 processor.
    Conclusions: Our parallel algorithm and architecture accelerate large-scale MSA with reconfigurable computing and allow researchers to solve the larger problems that confront biologists today. Program source is available from http://dna.cs.byu.edu/msa/.
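    A rough sketch of the profile-reduction idea: each pre-aligned subgroup is collapsed to position-wise residue frequencies, and two such profiles are aligned with a standard dynamic program. The alphabet, scoring scheme and gap penalty below are illustrative assumptions, and the paper implements the pairwise step on an FPGA rather than in software.
```python
# Collapse an aligned subgroup to a frequency profile, then score two
# profiles with a Needleman-Wunsch-style dynamic program (toy example).
import numpy as np

ALPHABET = "ACGT-"  # toy DNA alphabet including the gap symbol

def to_profile(aligned_seqs):
    """Column-wise residue frequencies of an aligned subgroup."""
    length = len(aligned_seqs[0])
    prof = np.zeros((length, len(ALPHABET)))
    for seq in aligned_seqs:
        for i, ch in enumerate(seq):
            prof[i, ALPHABET.index(ch)] += 1
    return prof / len(aligned_seqs)

def profile_nw_score(p1, p2, match=1.0, mismatch=-1.0, gap=-2.0):
    """Global alignment score (no traceback) between two profiles, using
    the expected match/mismatch score between frequency columns."""
    S = np.full((len(ALPHABET), len(ALPHABET)), mismatch)
    np.fill_diagonal(S, match)
    n, m = len(p1), len(p2)
    dp = np.zeros((n + 1, m + 1))
    dp[:, 0] = gap * np.arange(n + 1)
    dp[0, :] = gap * np.arange(m + 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            col_score = p1[i - 1] @ S @ p2[j - 1]   # expected column score
            dp[i, j] = max(dp[i - 1, j - 1] + col_score,
                           dp[i - 1, j] + gap,
                           dp[i, j - 1] + gap)
    return dp[n, m]

score = profile_nw_score(to_profile(["ACG-T", "ACGAT"]),
                         to_profile(["A-GTT", "ACGTT"]))
```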

    Distributed Block Coordinate Descent for Minimizing Partially Separable Functions

    In this work we propose a distributed randomized block coordinate descent method for minimizing a convex function with a huge number of variables/coordinates. We analyze its complexity under the assumption that the smooth part of the objective function is partially block separable, and show that the degree of separability directly influences the complexity. This extends the results in [Richtarik, Takac: Parallel coordinate descent methods for big data optimization] to a distributed environment. We first show that partially block separable functions admit an expected separable overapproximation (ESO) with respect to a distributed sampling, compute the ESO parameters, and then specialize complexity results from recent literature that hold under the generic ESO assumption. We describe several approaches to distribution and synchronization of the computation across a cluster of multi-core computers and provide promising computational results. Comment: in Recent Developments in Numerical Analysis and Optimization, 201
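    The following single-process sketch shows the kind of randomized coordinate descent step being analyzed, applied to a partially separable least-squares objective. The sampling size, the crude stand-in for the ESO parameter and the problem data are assumptions; a real run would partition the coordinate blocks across a cluster as described in the abstract.
```python
# Randomized coordinate descent on f(x) = 0.5 * ||A x - b||^2, where the
# sparsity pattern of A controls the degree of (partial) separability.
# Single-process sketch; the ESO parameter is replaced by a crude stand-in.
import numpy as np

def randomized_cd(A, b, tau=8, iters=500, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    L = (A ** 2).sum(axis=0)          # coordinate-wise Lipschitz constants
    beta = float(tau)                 # assumed stand-in for the ESO parameter
    x = np.zeros(n)
    r = A @ x - b                     # residual, maintained incrementally
    for _ in range(iters):
        S = rng.choice(n, size=tau, replace=False)   # sampled coordinates
        for i in S:
            g = A[:, i] @ r                          # partial derivative
            step = -g / (beta * L[i])
            x[i] += step
            r += step * A[:, i]                      # keep residual in sync
    return x

A = np.random.default_rng(1).standard_normal((200, 50))
b = A @ np.ones(50)
x_hat = randomized_cd(A, b)
```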