
    Plasma Physics Computations on Emerging Hardware Architectures

    This thesis explores the potential of emerging hardware architectures to increase the impact of high performance computing in fusion plasma physics research. For next generation tokamaks like ITER, realistic simulations and data-processing tasks will become significantly more demanding of computational resources than they are for current facilities. It is therefore essential to investigate how emerging hardware such as the graphics processing unit (GPU) and field-programmable gate array (FPGA) can provide the required computing power for large data-processing tasks and large-scale simulations in plasma-physics-specific computations. The use of emerging technology is investigated in three areas relevant to nuclear fusion: (i) a GPU is used to process the large amount of raw data produced by the synthetic aperture microwave imaging (SAMI) plasma diagnostic, (ii) a GPU is used to accelerate the solution of the Bateman equations, which model the evolution of nuclide number densities under neutron irradiation in tokamaks, and (iii) an FPGA-based dataflow engine is applied to compute massive matrix multiplications, a feature of many computational problems in fusion and more generally in scientific computing. The GPU data-processing code for SAMI provides a 60x acceleration over the previous IDL-based code, enabling inter-shot analysis in future campaigns and the data-mining (and therefore analysis) of stored raw data from previous MAST campaigns. The feasibility of porting the whole Bateman solver to a GPU system is demonstrated and verified against the industry-standard FISPACT code. Finally, a dataflow approach to matrix multiplication is shown to provide a substantial acceleration compared to CPU-based approaches and, whilst not performing as well as a GPU for this particular problem, is shown to be much more energy efficient. Emerging hardware technologies will no doubt continue to improve performance in many areas of fusion research, and several exciting new developments are on the horizon with tighter integration of GPUs and FPGAs with their host central processing units. This should not only improve performance and reduce data transfer bottlenecks, but also allow more user-friendly programming tools to be developed. All of this has implications for ITER and beyond, where emerging hardware technologies will provide the key to delivering the computing power required to handle the large amounts of data and more realistic simulations demanded by these complex systems.
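    For context, the Bateman equations referred to above couple the number density of each nuclide to radioactive decay and neutron-induced transmutation from every other nuclide. A general form (the notation below is the conventional one for inventory calculations of the kind performed by FISPACT, and is assumed here rather than quoted from the thesis) is

    \[
    \frac{\mathrm{d}N_i}{\mathrm{d}t} \;=\; \sum_{j \neq i}\bigl(\lambda_{j\to i} + \sigma_{j\to i}\,\phi\bigr)N_j \;-\; \bigl(\lambda_i + \sigma_i\,\phi\bigr)N_i,
    \]

    where N_i is the number density of nuclide i, the λ terms are decay constants, the σ terms are one-group reaction cross sections and φ is the neutron flux. Written as dN/dt = AN with a large, sparse transition matrix A, the problem reduces to repeated matrix-vector (or, for matrix-exponential methods, matrix-matrix) arithmetic, which is the kind of kernel that both GPUs and dataflow engines accelerate well.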

    Hardware implementation of non-bonded forces in molecular dynamics simulations

    Molecular dynamics (MD) is a computational method based on classical mechanics that describes the behavior of a molecular system. This method is used in biomolecular simulations, which are intended to contribute to the study and advancement of nanotechnology, medicine, chemistry and biology. Software implementations of MD simulations can spend most of their time computing the non-bonded interactions. This work presents the design and implementation of an FPGA-based coprocessor that accelerates MD simulations by computing the non-bonded interactions in parallel, specifically the van der Waals and electrostatic interactions. These interactions are modeled as the Lennard-Jones 6-12 potential and the direct-space Ewald summation, respectively. In addition, this work introduces a novel variable transformation of the potential energy functions and a novel interpolation method with pseudo-floating-point representation to compute the short-range forces. It also uses a combination of fixed-point and floating-point arithmetic to obtain the best of both representations. The FPGA coprocessor is a memory-mapped system connected to a host by PCI Express, and is provided with interrupt capabilities to improve parallelization. Its main block is based on a single functional pipeline, and is connected via the Avalon bus to other peripherals such as the PCIe Hard-IP and the SG-DMA. It is implemented on an Altera EP2AGX125EF35C4 device, can process 16k particles, and is configured to store up to 16 different particle types. Simulations in a custom C application for MD that computes only non-bonded forces run up to 12.5x faster with the FPGA coprocessor for a 12,500-atom system.
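    For reference, the two pairwise terms the coprocessor evaluates can be stated in a few lines of ordinary software. The sketch below is a plain NumPy restatement of the Lennard-Jones 6-12 potential and the direct-space (real-space) part of the Ewald summation under the minimum-image convention; it is not a model of the FPGA pipeline, and the single particle type, parameter names, and cutoff are illustrative assumptions.

```python
import numpy as np
from scipy.special import erfc

def nonbonded_energy(pos, charges, sigma, epsilon, alpha, r_cut, box):
    """Pairwise LJ 6-12 + real-space Ewald energy, minimum-image convention.

    pos     : (N, 3) particle coordinates in a cubic box
    charges : (N,)   charges (units where Coulomb's constant is 1)
    sigma, epsilon : LJ parameters (single particle type, for simplicity)
    alpha   : Ewald splitting parameter
    r_cut   : short-range cutoff (at most box/2)
    box     : cubic box edge length
    """
    e_lj, e_coul = 0.0, 0.0
    n = len(pos)
    for i in range(n - 1):
        # Minimum-image displacement vectors from particle i to every j > i.
        d = pos[i + 1:] - pos[i]
        d -= box * np.round(d / box)
        r = np.linalg.norm(d, axis=1)
        mask = r < r_cut
        r = r[mask]
        # Lennard-Jones 6-12: 4*eps*((sigma/r)^12 - (sigma/r)^6)
        sr6 = (sigma / r) ** 6
        e_lj += np.sum(4.0 * epsilon * (sr6 * sr6 - sr6))
        # Direct-space Ewald term: q_i * q_j * erfc(alpha * r) / r
        e_coul += np.sum(charges[i] * charges[i + 1:][mask] * erfc(alpha * r) / r)
    return e_lj, e_coul

# Example: 64 randomly placed particles with alternating unit charges.
rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 10.0, size=(64, 3))
charges = np.tile([1.0, -1.0], 32)
print(nonbonded_energy(pos, charges, sigma=1.0, epsilon=1.0,
                       alpha=0.3, r_cut=5.0, box=10.0))
```

    A production MD engine, like the coprocessor described above, would also accumulate forces, use neighbor lists, and handle multiple particle types; the sketch only shows the functional forms being accelerated.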

    Bibliography of Lewis Research Center technical publications announced in 1984

    This compilation of abstracts describes and indexes the technical reporting that resulted from the scientific and engineering work performed and managed by the Lewis Research Center in 1984. All the publications were announced in the 1984 issues of STAR (Scientific and Technical Aerospace Reports) and/or IAA (International Aerospace Abstracts). Included are research reports, journal articles, conference presentations, patents and patent applications, and theses.

    2022 roadmap on neuromorphic computing and engineering

    Modern computation based on the von Neumann architecture is now a mature cutting-edge science. In the von Neumann architecture, processing and memory units are implemented as separate blocks interchanging data intensively and continuously. This data transfer is responsible for a large part of the power consumption. The next generation of computer technology is expected to solve problems at the exascale, with $10^{18}$ calculations each second. Even though these future computers will be incredibly powerful, if they are based on von Neumann-type architectures they will consume between 20 and 30 megawatts of power and will not have intrinsic physically built-in capabilities to learn or deal with complex data as our brain does. These needs can be addressed by neuromorphic computing systems, which are inspired by the biological concepts of the human brain. This new generation of computers has the potential to be used for the storage and processing of large amounts of digital information with much lower power consumption than conventional processors. Among their potential future applications, an important niche is moving the control from data centers to edge devices. The aim of this roadmap is to present a snapshot of the present state of neuromorphic technology and provide an opinion on the challenges and opportunities that the future holds in the major areas of neuromorphic technology, namely materials, devices, neuromorphic circuits, neuromorphic algorithms, applications, and ethics. The roadmap is a collection of perspectives where leading researchers in the neuromorphic community provide their own view about the current state and the future challenges for each research area. We hope that this roadmap will be a useful resource, providing a concise yet comprehensive introduction for readers outside this field and those just entering it, as well as future perspectives for those who are well established in the neuromorphic computing community.
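    As a concrete illustration of the neuromorphic primitives the roadmap surveys, the sketch below implements a leaky integrate-and-fire neuron, in which state is integrated locally and information is exchanged as spikes rather than shuttled between separate processing and memory blocks. The parameter values are illustrative assumptions and are not taken from the roadmap.

```python
import numpy as np

def lif_simulate(input_current, dt=1e-3, tau=20e-3, v_rest=0.0,
                 v_reset=0.0, v_threshold=1.0):
    """Leaky integrate-and-fire neuron: membrane state is kept and updated locally.

    input_current : 1-D array of injected drive per time step
    Returns the membrane trace and the indices of emitted spikes.
    """
    v = v_rest
    trace, spikes = [], []
    for t, i_in in enumerate(input_current):
        # Leaky integration: dv/dt = (v_rest - v)/tau + i_in
        v += dt * ((v_rest - v) / tau + i_in)
        if v >= v_threshold:          # threshold crossing -> emit a spike
            spikes.append(t)
            v = v_reset               # reset the membrane potential
        trace.append(v)
    return np.array(trace), spikes

# Example: a constant drive above threshold produces a regular spike train.
trace, spikes = lif_simulate(np.full(1000, 60.0))
```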

    Synthèse et description de circuits numériques au niveau des transferts synchronisés par les données

    Beyond modern multi/many-core instruction processors, the world of high-performance computing is also characterized by the use of domain-specific circuits implemented on field-programmable gate arrays (FPGAs). Modern FPGAs have become interesting targets for high-performance computing for several reasons. On one hand, the considerable number of hard IP blocks they integrate (processors, memories, DSPs) has helped reduce the resource and performance gap with application-specific integrated circuits (ASICs); this gap is largely explained by the high level of reconfigurability these devices provide, a feature to which a considerable amount of resources (transistors) must be dedicated. Nevertheless, in a context where more transistors are often available than can be used, the cost of this configurability is correspondingly reduced. The ability to reconfigure modern FPGAs completely or partially further offers the flexibility required to support many different applications over time, much as instruction processors do. However, whereas instruction processors can be programmed with various high-level languages (Java, C#, C/C++, MPI, OpenMP, OpenCL), programming an FPGA requires the specification of a hardware design, which is a major obstacle to their widespread use. Hardware designs are generally described at the register-transfer level (RTL) using hardware description languages (HDLs) such as VHDL and Verilog, and for a given application, designing and verifying a dedicated circuit typically requires significantly more effort than a software implementation. Numerous commercial and academic tools now allow the high-level synthesis of hardware designs from software descriptions in languages such as C/C++/SystemC and, more recently, OpenCL. Nevertheless, depending on the application, these tools do not always achieve performance comparable to hand-written RTL designs. This work considers an intermediate-level synthesis methodology offering a compromise between the performance and design times achievable with RTL and high-level synthesis methodologies. The input hardware description language allows the description of algorithmic state machines (ASMs) that connect sources and sinks through predefined streaming interfaces. These interfaces are similar to the AXI4-Stream and Avalon-ST interfaces, featuring ready-to-send/ready-to-receive synchronization signals.
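    For readers unfamiliar with these streaming interfaces, the essential behaviour they share is a per-cycle handshake: a data word moves from source to sink only on cycles in which the source asserts valid (ready-to-send) and the sink asserts ready (ready-to-receive). The sketch below is a minimal cycle-level model of that rule in Python; it illustrates the handshake semantics only and is not the thesis's synthesis tool or generated hardware.

```python
from collections import deque
import random

def simulate_stream(words, fifo_depth=4, cycles=64, seed=0):
    """Cycle-by-cycle model of a valid/ready streaming handshake.

    A word is transferred only on cycles where the source has data
    (valid) and the sink FIFO has space to accept it (ready).
    """
    rng = random.Random(seed)
    pending = deque(words)      # words the source still has to send
    fifo = deque()              # sink-side buffer of limited depth
    received = []               # words drained by the downstream consumer
    for cycle in range(cycles):
        valid = bool(pending)                  # source asserts valid when it has data
        ready = len(fifo) < fifo_depth         # sink asserts ready when it has space
        if valid and ready:                    # handshake: transfer exactly one word
            fifo.append(pending.popleft())
        if fifo and rng.random() < 0.5:        # consumer drains the FIFO irregularly,
            received.append(fifo.popleft())    # creating back-pressure on the source
    return received

print(simulate_stream(list(range(10))))
```

    The back-pressure that arises when the FIFO is full simply stalls the source, which is what allows independently paced blocks to be composed safely.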

    College of Engineering

    Cornell University Courses of Study, Vol. 102, 2010/2011