2,450 research outputs found
Hybrid Analog-Digital Co-Processing for Scientific Computation
In the past ten years, computer architecture research has moved toward more heterogeneity and less adherence to conventional abstractions. Scientists and engineers hold an unshakable belief that computing holds the keys to unlocking humanity's Grand Challenges. Acting on that belief, they have looked deeper into computer architecture to find specialized support for their applications. Likewise, computer architects have looked deeper into circuits and devices in search of untapped performance and efficiency. The lines between the layers of computer architecture---applications, algorithms, architectures, microarchitectures, circuits, and devices---have blurred. Against this backdrop, a menagerie of computer architectures is on the horizon: architectures that forgo basic assumptions about computer hardware and require new thinking about how such hardware supports problems and algorithms.
This thesis revisits hybrid analog-digital computing in support of diverse modern workloads. Hybrid computing had extensive applications in early computing history and has been revisited for small-scale applications in embedded systems. But architectural support for applying hybrid computing to modern workloads, at scale and with high-accuracy solutions, has been lacking.
I demonstrate solving a variety of scientific computing problems, including stochastic ODEs, partial differential equations, linear algebra, and nonlinear systems of equations, as case studies in hybrid computing. I solve these problems on a system of multiple prototype analog accelerator chips built by a team at Columbia University. On that team I made contributions toward programming the chips, building the digital interface, and validating the chips' functionality. The analog accelerator chip is intended for use in conjunction with a conventional digital host computer.
The appeal of an analog accelerator is its efficiency and performance, but it comes with limitations in accuracy and problem size that must be worked around.
The first problem is how to express problems in this unconventional computation model. Scientific computing phrases problems as differential equations and algebraic equations. Differential equations are a continuous view of the world, while algebraic equations are a discrete one. Prior work in analog computing focused mostly on differential equations; algebraic equations played only a minor role. The key to using the analog accelerator to support modern workloads on conventional computers is that these two viewpoints are interchangeable: the algebraic equations that underlie most workloads can be solved as differential equations, and differential equations are naturally solvable in the analog accelerator chip. A hybrid analog-digital computer architecture can therefore focus on solving linear and nonlinear algebra problems to support many workloads.
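To make the reduction concrete, here is a minimal digital sketch (mine, not from the thesis) of recasting a linear-algebra problem as a differential equation: the system Ax = b is the steady state of the ODE dx/dt = b - Ax, which converges to x = A^(-1)b whenever the eigenvalues of A have positive real part. An analog accelerator would integrate this ODE in continuous time; the sketch simulates the integration with SciPy.

import numpy as np
from scipy.integrate import solve_ivp

# Solve A x = b by integrating dx/dt = b - A x to steady state.
# The equilibrium is x* = A^(-1) b; it is stable here because A is
# symmetric positive definite, so all eigenvalues of A are positive.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

sol = solve_ivp(lambda t, x: b - A @ x, (0.0, 20.0), np.zeros(2),
                rtol=1e-10, atol=1e-12)

print(sol.y[:, -1])            # steady state of the ODE
print(np.linalg.solve(A, b))   # direct solution; the two agree

The digital integrator takes many discrete steps here; the point of the analog model is that the same dynamics evolve in continuous time with no stepping at all.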
The second problem is how to obtain accurate solutions using hybrid analog-digital computing. The analog computation model gives less accurate solutions because it gives up representing numbers as digital binary numbers, instead using the full range of analog voltage and current to represent real numbers. Prior work has established that encoding data in analog signals gives an energy-efficiency advantage as long as the analog data precision is limited. While the analog accelerator alone may be useful for energy-constrained applications where inputs and outputs are imprecise, we are more interested in using analog in conjunction with digital hardware for precise solutions. This thesis offers the novel insight that the way to do so is to solve nonlinear problems, where low-precision guesses are useful starting points for conventional digital algorithms.
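As a hedged illustration of this insight (an example of mine, not the thesis's algorithm): emulate the analog accelerator's output by quantizing a nearby solution to a few bits, then refine it with digital Newton iterations, whose quadratic convergence recovers machine precision in a handful of steps. The toy system, the bit width, and the quantize helper are all illustrative assumptions.

import numpy as np

# Toy nonlinear system f(x) = 0: intersection of a circle and a hyperbola.
def f(x):
    return np.array([x[0]**2 + x[1]**2 - 4.0,
                     x[0]*x[1] - 1.0])

def jacobian(x):
    return np.array([[2.0*x[0], 2.0*x[1]],
                     [x[1],     x[0]]])

def quantize(x, bits=8, scale=2.0):
    # Stand-in for a low-precision analog result: snap each component
    # to a fixed-point grid spanning [-scale, scale].
    step = 2.0*scale / 2**bits
    return np.round(x/step)*step

# Pretend the analog accelerator returned a coarse solution near the
# true root (about (1.9319, 0.5176)); quantization models its precision.
x = quantize(np.array([1.93, 0.52]))

# Digital Newton iterations refine the coarse guess; convergence is
# quadratic, so a few steps reach machine precision.
for _ in range(6):
    x = x - np.linalg.solve(jacobian(x), f(x))

print(x, f(x))  # residual is at machine-precision level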
The third problem is how to solve large problems using hybrid analog-digital computing. The analog computation model cannot handle large problems because it gives up step-by-step discrete-time operation, instead allowing variables to evolve smoothly in continuous time. To make that happen, the analog accelerator chains hardware for mathematical operations end-to-end; during computation, analog data flows through the hardware with no control-logic or memory-access overheads. The downside is that the required hardware grows with the problem size. While scientific computing researchers have long split large problems into smaller subproblems to fit the constraints of digital computers, this thesis is a first attempt to treat these divide-and-conquer algorithms as an essential tool for using the analog model of computation.
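As a sketch of the divide-and-conquer idea, the block-Jacobi iteration below splits a linear system into diagonal blocks, each small enough to stand in for a fixed-size accelerator solve, and iterates the block solves toward the global solution. The matrix sizes and the use of np.linalg.solve as the per-block "accelerator" are illustrative assumptions, not the thesis's actual partitioning.

import numpy as np

# Block-Jacobi iteration: split A x = b into small diagonal blocks, each
# standing in for a subproblem that fits a fixed-size accelerator, and
# repeat the block solves until the global solution converges.
rng = np.random.default_rng(0)
n, nb = 8, 2                                       # 8 unknowns, blocks of size 2
A = rng.standard_normal((n, n)) + 4*n*np.eye(n)    # strongly diagonally dominant
b = rng.standard_normal(n)

x = np.zeros(n)
for _ in range(50):
    x_new = x.copy()
    for i in range(0, n, nb):
        s = slice(i, i + nb)
        # Residual of block i against the current global iterate,
        # excluding the block's own unknowns.
        r = b[s] - A[s, :] @ x + A[s, s] @ x[s]
        x_new[s] = np.linalg.solve(A[s, s], r)     # small subproblem solve
    x = x_new

print(np.linalg.norm(A @ x - b))    # tiny residual: the iteration converged

Diagonal dominance guarantees convergence here; in practice the splitting and the coupling scheme between subproblems are where the algorithmic work lies.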
As we enter the post-Moore's law era of computing, unconventional architectures will offer specialized models of computation that uniquely support specific problem types. Two prominent examples are deep neural networks and quantum computers. Recent trends in computer science research suggest these unconventional architectures will soon see broad adoption. In this thesis I show that another specialized, unconventional architecture, the analog accelerator, can solve problems in scientific computing. Computer architecture researchers will discover other important models of computation in the future. This thesis is an example of the process of discovering, implementing, and evaluating how an unconventional architecture supports specialized workloads.
Automated Translation and Accelerated Solving of Differential Equations on Multiple GPU Platforms
We demonstrate a high-performance, vendor-agnostic method for massively parallel solving of ensembles of ordinary differential equations (ODEs) and stochastic differential equations (SDEs) on GPUs. The method is integrated with a widely used differential equation solver library in a high-level language (Julia's DifferentialEquations.jl) and enables GPU acceleration without requiring code changes by the user. Our approach achieves state-of-the-art performance compared to hand-optimized CUDA-C++ kernels, while performing faster than the vectorized-map (vmap) approach implemented in JAX and PyTorch. Performance evaluation on NVIDIA, AMD, Intel, and Apple GPUs demonstrates performance portability and vendor-agnosticism. We show composability with MPI to enable distributed multi-GPU workflows. The implemented solvers are fully featured, supporting event handling, automatic differentiation, and incorporation of datasets via the GPU's texture memory, allowing scientists to take advantage of GPU acceleration on all major current architectures without changing their model code and without loss of performance.
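For readers unfamiliar with the ensemble pattern this paper accelerates, the sketch below shows its CPU analogue in plain NumPy: one ODE integrated simultaneously for many parameter values, with the right-hand side vectorized over the ensemble axis, much as the GPU version maps one thread (or one vmap lane) per trajectory. The logistic equation, step size, and fixed-step RK4 integrator are illustrative choices of mine, not the paper's solvers.

import numpy as np

# Ensemble pattern: integrate the logistic equation du/dt = p*u*(1 - u)
# for many parameter values at once by vectorizing the right-hand side
# over the ensemble axis (the CPU analogue of one GPU thread per trajectory).
params = np.linspace(0.5, 2.0, 10_000)   # one ODE instance per parameter
u = np.full_like(params, 0.1)            # shared initial condition
dt, t_end = 0.01, 5.0

def rhs(u, p):
    return p*u*(1.0 - u)

# Classic fixed-step RK4, applied to the whole ensemble simultaneously.
for _ in range(int(t_end/dt)):
    k1 = rhs(u, params)
    k2 = rhs(u + 0.5*dt*k1, params)
    k3 = rhs(u + 0.5*dt*k2, params)
    k4 = rhs(u + dt*k3, params)
    u += (dt/6.0)*(k1 + 2*k2 + 2*k3 + k4)

print(u.min(), u.max())   # every trajectory moves toward the fixed point u = 1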
Computational relativistic quantum dynamics and its application to relativistic tunneling and Kapitza-Dirac scattering
Computational methods are indispensable for studying the quantum dynamics of relativistic light-matter interactions in parameter regimes where analytical methods become inapplicable. We present numerical methods for solving the time-dependent Dirac equation and the time-dependent Klein-Gordon equation, and their implementation on high-performance graphics cards. These methods allow us to study tunneling from hydrogen-like highly charged ions in strong laser fields and Kapitza-Dirac scattering in the relativistic regime.
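One standard scheme for such problems, which may differ in detail from the authors' methods, is Fourier split-operator time stepping. The sketch below integrates the free 1+1D Dirac equation in NumPy, exponentiating the kinetic term exactly per momentum mode; the grid, the units (hbar = 1), and the wave-packet parameters are illustrative assumptions.

import numpy as np

# Fourier split-operator stepping for the 1+1D Dirac equation
# i d/dt psi = (c*p*sigma_x + m*c^2*sigma_z + V(x)) psi, hbar = 1.
# Per momentum mode, exp(-i*dt*H_k) has the closed form
# cos(E*dt)*I - i*(sin(E*dt)/E)*H_k with E = sqrt((c*k)^2 + (m*c^2)^2).
c, m = 1.0, 1.0
n, L, dt, steps = 1024, 40.0, 0.005, 400
x = np.linspace(-L/2, L/2, n, endpoint=False)
k = 2*np.pi*np.fft.fftfreq(n, d=L/n)
V = np.zeros_like(x)          # free propagation; insert a barrier to study tunneling

E = np.sqrt((c*k)**2 + (m*c**2)**2)
cosE = np.cos(E*dt)
sincE = np.sin(E*dt)/E        # E >= m*c^2 > 0, so no division by zero

psi = np.zeros((2, n), dtype=complex)
psi[0] = np.exp(-x**2 + 2.0j*x)   # Gaussian wave packet in the upper spinor

for _ in range(steps):
    psi *= np.exp(-0.5j*dt*V)                 # half-step in the potential
    phi = np.fft.fft(psi, axis=1)             # to momentum space
    a, b = phi[0].copy(), phi[1].copy()
    # Apply exp(-i*dt*(c*k*sigma_x + m*c^2*sigma_z)) mode by mode.
    phi[0] = (cosE - 1j*sincE*m*c**2)*a - 1j*sincE*c*k*b
    phi[1] = -1j*sincE*c*k*a + (cosE + 1j*sincE*m*c**2)*b
    psi = np.fft.ifft(phi, axis=1)
    psi *= np.exp(-0.5j*dt*V)                 # second potential half-step

print(np.sum(np.abs(psi)**2)*(L/n))           # norm is conserved by unitarity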
PyFR: An Open Source Framework for Solving Advection-Diffusion Type Problems on Streaming Architectures using the Flux Reconstruction Approach
High-order numerical methods for unstructured grids combine the superior accuracy of high-order spectral or finite difference methods with the geometric flexibility of low-order finite volume or finite element schemes. The Flux Reconstruction (FR) approach unifies various high-order schemes for unstructured grids within a single framework. Additionally, the FR approach exhibits a significant degree of element locality and is thus able to run efficiently on modern streaming architectures, such as Graphics Processing Units (GPUs). These properties mean FR offers a promising route to performing affordable, and hence industrially relevant, scale-resolving simulations of hitherto intractable unsteady flows in the vicinity of real-world engineering geometries. In this paper we present PyFR, an open-source Python-based framework for solving advection-diffusion type problems on streaming architectures using the FR approach. The framework is designed to solve a range of governing systems on mixed unstructured grids containing various element types. It is also designed to target a range of hardware platforms via use of an in-built domain-specific language based on the Mako templating engine. The current release of PyFR is able to solve the compressible Euler and Navier-Stokes equations on grids of quadrilateral and triangular elements in two dimensions, and hexahedral elements in three dimensions, targeting clusters of CPUs and NVIDIA GPUs. Results are presented for various benchmark flow problems, single-node performance is discussed, and scalability of the code is demonstrated on up to 104 NVIDIA M2090 GPUs. The software is freely available under a 3-Clause New Style BSD license (see www.pyfr.org).
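To illustrate the Mako-based DSL idea mentioned above, here is a minimal sketch (not PyFR's actual kernel language or backend API) in which one templated kernel renders to either CUDA or OpenMP C source:

from mako.template import Template

# One templated kernel rendered into backend-specific source. This
# illustrates the templating idea only; PyFR's real kernel language
# and backend interfaces are more elaborate.
tpl = Template("""\\
% if backend == 'cuda':
__global__ void axpy(int n, double a, const double* x, double* y)
{
    int i = blockIdx.x*blockDim.x + threadIdx.x;
    if (i < n) y[i] += a*x[i];
}
% else:
void axpy(int n, double a, const double* x, double* y)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        y[i] += a*x[i];
}
% endif
""")

print(tpl.render(backend='cuda'))  # CUDA flavour
print(tpl.render(backend='c'))     # plain C / OpenMP flavour

Writing the numerics once and generating the hardware-specific scaffolding is what lets a single code base target CPUs and GPUs from multiple vendors.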
- …