Performance potential for simulating spin models on GPU
Graphics processing units (GPUs) are increasingly being used
for general computational purposes. This development is motivated by
their theoretical peak performance, which significantly exceeds that of broadly
available CPUs. For practical purposes, however, it is far from clear how much
of this theoretical performance can be realized in actual scientific
applications. As is discussed here for the case of studying classical spin
models of statistical mechanics by Monte Carlo simulations, only an explicit
tailoring of the involved algorithms to the specific architecture under
consideration makes it possible to harvest the computational power of GPU systems. A
number of examples are discussed, ranging from Metropolis simulations of ferromagnetic Ising
models, through continuous Heisenberg and disordered spin-glass systems, to
parallel-tempering simulations. Significant speed-ups, by factors
of up to 1000, are observed compared to serial CPU code as well as previous GPU
implementations.
Comment: 28 pages, 15 figures, 2 tables, version as publishe
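The kind of architecture-specific tailoring the abstract refers to can be illustrated with a checkerboard Metropolis update for the 2D ferromagnetic Ising model. This is a minimal serial sketch in Python, not the paper's GPU code: the function names, the coupling J = 1, the periodic boundaries and the even lattice size are all assumptions made for illustration. The point is only that sites of one checkerboard colour share no bonds, so each colour pass could in principle be executed by many threads at once.

```python
import math
import random


def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep of an n x n Ising lattice (assumed J = 1).

    Periodic boundaries are assumed, and n must be even so that the
    checkerboard colouring stays consistent across the boundary.
    Sites of one colour only interact with sites of the other colour,
    so each colour pass is free of data dependencies; here it is a
    plain serial loop standing in for a parallel kernel.
    """
    n = len(spins)
    for colour in (0, 1):
        for i in range(n):
            for j in range(n):
                if (i + j) % 2 != colour:
                    continue
                neighbours = (spins[(i + 1) % n][j] + spins[(i - 1) % n][j]
                              + spins[i][(j + 1) % n] + spins[i][(j - 1) % n])
                # Energy change if spin (i, j) were flipped.
                delta_e = 2 * spins[i][j] * neighbours
                if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -spins[i][j]


def magnetisation(spins):
    """Mean spin value per site."""
    n = len(spins)
    return sum(sum(row) for row in spins) / (n * n)


if __name__ == "__main__":
    rng = random.Random(0)
    lattice = [[rng.choice((-1, 1)) for _ in range(16)] for _ in range(16)]
    for _ in range(100):
        checkerboard_sweep(lattice, beta=0.6, rng=rng)
    print("magnetisation per site:", magnetisation(lattice))
```

On a GPU, each colour pass would typically map to one kernel launch with one thread per site; memory layout and random-number generation then become the main tuning concerns.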
The use of primitives in the calculation of radiative view factors
Compilations of radiative view factors (often in closed analytical form) are readily available in the open literature for commonly encountered geometries. For more complex three-dimensional (3D) scenarios, however, the effort required to solve the multi-dimensional integrals needed to estimate a view factor can be daunting, to say the least. In such cases, a combination of finite element methods (where the geometry in question is sub-divided into a large number of uniform, often triangular, elements) and Monte Carlo Ray Tracing (MC-RT) has been developed, although the software implementation is frequently suitable only for a limited set of geometrical scenarios. Driven initially by a need to calculate the radiative heat transfer occurring within an operational fibre-drawing furnace, this research set out to examine options whereby MC-RT could be used to cost-effectively calculate any generic 3D radiative view factor using current vectorisation technologies.
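To make the MC-RT idea concrete, the following sketch estimates the view factor between two identical, directly opposed parallel squares by firing cosine-weighted rays from one square and counting hits on the other, then compares the estimate with the standard closed-form result for that geometry. It is an illustrative assumption, not the primitive-based method of the paper: the function names, the chosen geometry and the serial (non-vectorised) loop are all hypothetical simplifications.

```python
import math
import random


def mc_view_factor_parallel_squares(side, gap, n_rays, rng):
    """Monte Carlo estimate of the view factor between two aligned squares.

    The emitter is the square [0, side]^2 at z = 0 and the receiver is the
    same square at z = gap. Rays leave the emitter diffusely, i.e. with a
    cosine-weighted direction distribution about the surface normal.
    """
    hits = 0
    for _ in range(n_rays):
        # Uniform emission point on the emitting square.
        x = rng.random() * side
        y = rng.random() * side
        # Cosine-weighted direction: sample a unit disc, project upwards.
        r = math.sqrt(rng.random())
        phi = 2.0 * math.pi * rng.random()
        dx, dy = r * math.cos(phi), r * math.sin(phi)
        dz = math.sqrt(max(0.0, 1.0 - r * r))
        if dz <= 0.0:
            continue  # grazing ray never reaches the receiver plane
        t = gap / dz
        hx, hy = x + t * dx, y + t * dy
        if 0.0 <= hx <= side and 0.0 <= hy <= side:
            hits += 1
    return hits / n_rays


def analytic_view_factor_parallel_squares(side, gap):
    """Closed-form view factor for the same geometry (X = side / gap)."""
    x = side / gap
    root = math.sqrt(1.0 + x * x)
    log_term = 0.5 * math.log((1.0 + x * x) ** 2 / (1.0 + 2.0 * x * x))
    bracket = log_term + 2.0 * x * root * math.atan(x / root) - 2.0 * x * math.atan(x)
    return 2.0 / (math.pi * x * x) * bracket


if __name__ == "__main__":
    rng = random.Random(1)
    print("Monte Carlo:", mc_view_factor_parallel_squares(1.0, 1.0, 200000, rng))
    print("Analytic:   ", analytic_view_factor_parallel_squares(1.0, 1.0))
```

The per-ray work is independent, which is what makes this kind of calculation a natural fit for vectorised or otherwise data-parallel execution.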
A Bayesian Heteroscedastic GLM with Application to fMRI Data with Motion Spikes
We propose a voxel-wise general linear model with autoregressive noise and
heteroscedastic noise innovations (GLMH) for analyzing functional magnetic
resonance imaging (fMRI) data. The model is analyzed from a Bayesian
perspective and has the benefit of automatically down-weighting time points
close to motion spikes in a data-driven manner. We develop a highly efficient
Markov Chain Monte Carlo (MCMC) algorithm that allows for Bayesian variable
selection among the regressors to model both the mean (i.e., the design matrix)
and variance. This makes it possible to include a broad range of explanatory
variables in both the mean and variance (e.g., time trends, activation stimuli,
head motion parameters and their temporal derivatives), and to compute the
posterior probability of inclusion from the MCMC output. Variable selection is
also applied to the lags in the autoregressive noise process, making it
possible to infer the lag order from the data simultaneously with all other
model parameters. We use both simulated data and real fMRI data from OpenfMRI
to illustrate the importance of proper modeling of heteroscedasticity in fMRI
data analysis. Our results show that the GLMH tends to detect more brain
activity, compared to its homoscedastic counterpart, by allowing the variance
to change over time depending on the degree of head motion.
Efficient reconfigurable architectures for 3D medical image compression
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.
Recently, the more widespread use of three-dimensional (3-D) imaging modalities,
such as magnetic resonance imaging (MRI), computed tomography (CT), positron
emission tomography (PET), and ultrasound (US), has generated a massive amount
of volumetric data. These have provided an impetus to the development of other
applications, in particular telemedicine and teleradiology. In these fields, medical
image compression is important since both efficient storage and transmission of data
through high-bandwidth digital communication lines are of crucial importance.
Despite their advantages, most 3-D medical imaging algorithms are computationally intensive, with matrix transformation as the most fundamental operation involved in the transform-based methods. Therefore, there is a real need for high-performance systems, whilst keeping architectures flexible to allow
for quick upgradeability with real-time applications. Moreover, in order to obtain
efficient solutions for large medical volume data, an efficient implementation of
these operations is of significant importance. Reconfigurable hardware, in the form of field programmable gate arrays (FPGAs), has been proposed as a viable system
building block in the construction of high-performance systems at an economical price.
Consequently, FPGAs seem an ideal candidate, offering inherent
advantages such as massive parallelism capabilities, multimillion gate counts, and
special low-power packages. The key achievements of the work presented in this thesis are summarised as follows. Two architectures for the 3-D Haar wavelet transform (HWT) have been proposed, based on transpose-based computation and partial reconfiguration, suitable for 3-D medical imaging applications. These applications require continuous hardware servicing, and as a result dynamic partial reconfiguration (DPR) has been introduced. A comparative study of non-partial and partial reconfiguration implementations has shown that DPR offers many advantages and leads to a compelling solution for implementing computationally intensive applications such as 3-D medical image compression. Using DPR, several large systems are mapped to small hardware resources, and the area, power consumption and maximum frequency are
optimised and improved. Moreover, an FPGA-based architecture of the finite Radon transform (FRAT) with three design strategies has been proposed: direct implementation of pseudo-code with a sequential or pipelined description, and a block random access memory (BRAM)-based method. An analysis with various medical imaging modalities has been carried out. The image de-noising implementation using FRAT shows
promising results in reducing Gaussian white noise in medical images. In terms of
hardware implementation, promising trade-offs on maximum frequency, throughput
and area are also achieved. Furthermore, a novel hardware implementation of a 3-D medical image compression system with context-based adaptive variable length coding (CAVLC)
has been proposed. An evaluation of the 3-D integer transform (IT) and the discrete
wavelet transform (DWT) with lifting scheme (LS) for transform blocks reveals that
the 3-D IT has better computational complexity than the 3-D DWT, whilst
the 3-D DWT with LS provides lossless compression, which is particularly useful for
medical image compression. Additionally, an architecture of CAVLC that is capable
of compressing high-definition (HD) images in real-time without any buffer between
the quantiser and the entropy coder is proposed. Through a judicious parallelisation, promising results have been obtained with limited resources. In summary, this research tackles the problem of massive 3-D medical volume data, which requires compression as well as hardware implementation to accelerate the
slowest operations in the system. The results obtained also reveal significant achievements in terms of architecture efficiency and application performance.
Ministry of Higher Education Malaysia (MOHE),
Universiti Tun Hussein Onn Malaysia (UTHM) and the British Counci
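As a software illustration of the separable 3-D Haar transform and the lossless lifting idea mentioned in the abstract, the sketch below applies a one-level integer Haar step (the S-transform, written as lifting-style integer operations) along each axis of a small volume and then inverts it exactly. It is an assumption-laden toy, not the thesis's FPGA architecture: the function names, the nested-list volume layout `vol[z][y][x]`, the single decomposition level and the requirement of even dimensions are all hypothetical choices.

```python
def haar_forward_1d(v):
    """One-level integer Haar step: floor averages, then differences."""
    half = len(v) // 2
    averages = [(v[2 * i] + v[2 * i + 1]) >> 1 for i in range(half)]
    details = [v[2 * i] - v[2 * i + 1] for i in range(half)]
    return averages + details


def haar_inverse_1d(c):
    """Exact inverse of haar_forward_1d (integer arithmetic only)."""
    half = len(c) // 2
    out = []
    for s, d in zip(c[:half], c[half:]):
        a = s + ((d + 1) >> 1)
        out.extend([a, a - d])
    return out


def apply_along_axis(vol, axis, fn):
    """Apply a 1-D function to every line of vol[z][y][x] along one axis."""
    nz, ny, nx = len(vol), len(vol[0]), len(vol[0][0])
    out = [[row[:] for row in plane] for plane in vol]
    if axis == 2:  # along x: rows are already contiguous
        for z in range(nz):
            for y in range(ny):
                out[z][y] = fn(vol[z][y])
    elif axis == 1:  # along y: gather a column, transform, scatter back
        for z in range(nz):
            for x in range(nx):
                line = fn([vol[z][y][x] for y in range(ny)])
                for y in range(ny):
                    out[z][y][x] = line[y]
    else:  # along z
        for y in range(ny):
            for x in range(nx):
                line = fn([vol[z][y][x] for z in range(nz)])
                for z in range(nz):
                    out[z][y][x] = line[z]
    return out


def haar3d_forward(vol):
    """Separable one-level 3-D transform: x, then y, then z."""
    for axis in (2, 1, 0):
        vol = apply_along_axis(vol, axis, haar_forward_1d)
    return vol


def haar3d_inverse(coeffs):
    """Undo haar3d_forward by inverting the axes in reverse order."""
    for axis in (0, 1, 2):
        coeffs = apply_along_axis(coeffs, axis, haar_inverse_1d)
    return coeffs


if __name__ == "__main__":
    volume = [[[(x + 2 * y + 3 * z) % 7 for x in range(4)]
               for y in range(4)] for z in range(4)]
    restored = haar3d_inverse(haar3d_forward(volume))
    print("lossless round trip:", restored == volume)
```

The gather and scatter steps for the y and z axes play the role that transposition plays in a hardware design: they turn every pass into the same 1-D operation on a contiguous line of samples.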