1,302 research outputs found
Accelerating Reconfigurable Financial Computing
This thesis proposes novel approaches to the design, optimisation, and management of reconfigurable
computer accelerators for financial computing. There are three contributions. First, we propose novel
reconfigurable designs for derivative pricing using both Monte-Carlo and quadrature methods. Such
designs involve exploring techniques such as control variate optimisation for Monte-Carlo, and multi-dimensional
analysis for quadrature methods. Significant speedups and energy savings are achieved
using our Field-Programmable Gate Array (FPGA) designs over both Central Processing Unit (CPU)
and Graphical Processing Unit (GPU) designs. Second, we propose a framework for distributing computing
tasks on multi-accelerator heterogeneous clusters. In this framework, different computational
devices including FPGAs, GPUs and CPUs work collaboratively on the same financial problem based
on a dynamic scheduling policy. The trade-off in speed and in energy consumption of different accelerator
allocations is investigated. Third, we propose a mixed precision methodology for optimising
Monte-Carlo designs, and a reduced precision methodology for optimising quadrature designs. These
methodologies enable us to optimise throughput of reconfigurable designs by using datapaths with
minimised precision, while maintaining the same accuracy of the results as in the original designs
Accelerating Exact Stochastic Simulation of Biochemical Systems
The ability to accurately and efficiently simulate computer models of biochemical systems is of growing importance to the molecular biology and pharmaceutical research communities. Exact stochastic simulation is a popular approach for simulating such systems because it properly represents genetic noise and it accurately represents systems with small populations of chemical species. Unfortunately, the computational demands of exact stochastic simulation often limit its applicability. To enable next-generation whole-cell and multi-cell stochastic modeling, advanced tools and techniques must be developed to increase simulation efficiency. This work assesses the applicability of a variety of hardware and software acceleration approaches for exact stochastic simulation including serial algorithm improvements, parallel computing, reconfigurable computing, and cluster computing. Through this analysis, improved simulation techniques for biological systems are explored and evaluated
Scalable Emulation of Sign-ProblemFree Hamiltonians with Room Temperature p-bits
The growing field of quantum computing is based on the concept of a q-bit
which is a delicate superposition of 0 and 1, requiring cryogenic temperatures
for its physical realization along with challenging coherent coupling
techniques for entangling them. By contrast, a probabilistic bit or a p-bit is
a robust classical entity that fluctuates between 0 and 1, and can be
implemented at room temperature using present-day technology. Here, we show
that a probabilistic coprocessor built out of room temperature p-bits can be
used to accelerate simulations of a special class of quantum many-body systems
that are sign-problemfree or stoquastic, leveraging the well-known
Suzuki-Trotter decomposition that maps a -dimensional quantum many body
Hamiltonian to a +1-dimensional classical Hamiltonian. This mapping allows
an efficient emulation of a quantum system by classical computers and is
commonly used in software to perform Quantum Monte Carlo (QMC) algorithms. By
contrast, we show that a compact, embedded MTJ-based coprocessor can serve as a
highly efficient hardware-accelerator for such QMC algorithms providing several
orders of magnitude improvement in speed compared to optimized CPU
implementations. Using realistic device-level SPICE simulations we demonstrate
that the correct quantum correlations can be obtained using a classical
p-circuit built with existing technology and operating at room temperature. The
proposed coprocessor can serve as a tool to study stoquastic quantum many-body
systems, overcoming challenges associated with physical quantum annealers.Comment: Fixed minor typos and expanded Appendi
Reconfigurable Hardware Acceleration of Exact Stochastic Simulation
This thesis explores the use of reconfigurable hardware in modeling chemical species reacting in a spatially homogeneous environment. The time evolution of biochemical models is often evaluated using a deterministic approach that uses differential equations to describe the chemical interactions of the model. However, such an approach treats species as continuous valued concentrations, is inaccurate for small species populations, and neglects the stochastic nature of biochemical systems. The Stochastic Simulation Algorithm (SSA) developed by Gillespie is able to properly account for these inherent noise fluctuations. This allows the SSA to accurately project the time evolution of a biochemical model. Unfortunately, the SSA can be computationally intensive and require a substantial amount of time to complete.
Therefore, it has been proposed that the SSA be implemented on a Field Programmable Gate Array (FPGA) to improve performance. Employing an FPGA allows parallelism to be exploited within the SSA providing a speedup over software implementations executing instructions sequentially. Recent work in this area has focused on implementing the SSA on an FPGA to simulate specific biochemical models. However, this requires re-constructing and re-synthesizing the design in order to simulate a new biochemical system. This work examines the use of a reconfigurable computing platform to allow an implementation of the SSA on an FPGA to simulate a variety of models. The designs presented herein demonstrate a speedup of roughly 1.5X
Hardware Implementation of Deep Network Accelerators Towards Healthcare and Biomedical Applications
With the advent of dedicated Deep Learning (DL) accelerators and neuromorphic
processors, new opportunities are emerging for applying deep and Spiking Neural
Network (SNN) algorithms to healthcare and biomedical applications at the edge.
This can facilitate the advancement of the medical Internet of Things (IoT)
systems and Point of Care (PoC) devices. In this paper, we provide a tutorial
describing how various technologies ranging from emerging memristive devices,
to established Field Programmable Gate Arrays (FPGAs), and mature Complementary
Metal Oxide Semiconductor (CMOS) technology can be used to develop efficient DL
accelerators to solve a wide variety of diagnostic, pattern recognition, and
signal processing problems in healthcare. Furthermore, we explore how spiking
neuromorphic processors can complement their DL counterparts for processing
biomedical signals. After providing the required background, we unify the
sparsely distributed research on neural network and neuromorphic hardware
implementations as applied to the healthcare domain. In addition, we benchmark
various hardware platforms by performing a biomedical electromyography (EMG)
signal processing task and drawing comparisons among them in terms of inference
delay and energy. Finally, we provide our analysis of the field and share a
perspective on the advantages, disadvantages, challenges, and opportunities
that different accelerators and neuromorphic processors introduce to healthcare
and biomedical domains. This paper can serve a large audience, ranging from
nanoelectronics researchers, to biomedical and healthcare practitioners in
grasping the fundamental interplay between hardware, algorithms, and clinical
adoption of these tools, as we shed light on the future of deep networks and
spiking neuromorphic processing systems as proponents for driving biomedical
circuits and systems forward.Comment: Submitted to IEEE Transactions on Biomedical Circuits and Systems (21
pages, 10 figures, 5 tables
- …