Interstellar: Using Halide's Scheduling Language to Analyze DNN Accelerators

Author: Bell Steven Emberton
Cao Kaidi
Gao Mingyu
Ha Heonjae
Horowitz Mark
Kozyrakis Christos
Liu Qiaoyi
Nayak Ankita
Pu Jing
Raina Priyanka
Setter Jeff Ou
Yang Xuan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 26/04/2020

We show that DNN accelerator micro-architectures and their program mappings represent specific choices of loop order and hardware parallelism for computing the seven nested loops of DNNs, which enables us to create a formal taxonomy of all existing dense DNN accelerators. Surprisingly, the loop transformations needed to create these hardware variants can be precisely and concisely represented by Halide's scheduling language. By modifying the Halide compiler to generate hardware, we create a system that can fairly compare these prior accelerators. As long as proper loop blocking schemes are used, and the hardware can support mapping replicated loops, many different hardware dataflows yield similar energy efficiency with good performance. This is because the loop blocking can ensure that most data references stay on-chip with good locality and the processing units have high resource utilization. How resources are allocated, especially in the memory system, has a large impact on energy and performance. By optimizing hardware resource allocation while keeping throughput constant, we achieve up to 4.2X energy improvement for Convolutional Neural Networks (CNNs), 1.6X and 1.8X improvement for Long Short-Term Memories (LSTMs) and multi-layer perceptrons (MLPs), respectively. Comment: Published as a conference paper at ASPLOS 202
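The "seven nested loops" and "loop blocking" mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' Halide schedule or their hardware mapping, which the abstract does not show. The function names, the loop-variable names (N, K, C, Y, X, FY, FX) and the tile sizes are assumptions chosen for illustration only.

```python
def zeros(N, K, Y, X):
    """Allocate an N x K x Y x X output tensor as nested lists."""
    return [[[[0] * X for _ in range(Y)] for _ in range(K)] for _ in range(N)]


def conv_seven_loops(inp, wgt, N, K, C, Y, X, FY, FX):
    """Direct convolution written as seven nested loops (stride 1, no padding)."""
    out = zeros(N, K, Y, X)
    for n in range(N):                          # batch
        for k in range(K):                      # output channels
            for c in range(C):                  # input channels
                for y in range(Y):              # output rows
                    for x in range(X):          # output columns
                        for fy in range(FY):    # filter rows
                            for fx in range(FX):  # filter columns
                                out[n][k][y][x] += (
                                    inp[n][c][y + fy][x + fx] * wgt[k][c][fy][fx]
                                )
    return out


def conv_blocked(inp, wgt, N, K, C, Y, X, FY, FX, k_tile=2, x_tile=2):
    """Same computation with the K and X loops split into tiles and reordered.

    The inner k/x loops touch only a small tile of outputs and weights,
    which is the locality effect that loop blocking is meant to provide.
    """
    out = zeros(N, K, Y, X)
    for n in range(N):
        for k0 in range(0, K, k_tile):          # outer K tile loop
            for x0 in range(0, X, x_tile):      # outer X tile loop
                for c in range(C):
                    for y in range(Y):
                        for fy in range(FY):
                            for fx in range(FX):
                                # inner loops over one (hypothetical) tile
                                for k in range(k0, min(k0 + k_tile, K)):
                                    for x in range(x0, min(x0 + x_tile, X)):
                                        out[n][k][y][x] += (
                                            inp[n][c][y + fy][x + fx]
                                            * wgt[k][c][fy][fx]
                                        )
    return out
```

Both functions compute the same result; only the loop order and blocking differ, which is the sense in which different accelerator dataflows can be viewed as different schedules of one loop nest.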

arXiv.org e-Print Archive

Crossref

Advances in computational modelling for personalised medicine after myocardial infarction

Author: Berry Colin
Gao Hao
Husmeier Dirk
Luo Xiaoyu
Mangion Kenneth
Publication venue: 'BMJ'
Publication date: 10/11/2017

Myocardial infarction (MI) is a leading cause of premature morbidity and mortality worldwide. Determining which patients will experience heart failure and sudden cardiac death after an acute MI is notoriously difficult for clinicians. The extent of heart damage after an acute MI is informed by cardiac imaging, typically using echocardiography or, sometimes, cardiac magnetic resonance (CMR). These scans provide complex data sets that are only partially exploited by clinicians in daily practice, implying potential for improved risk assessment. Computational modelling of left ventricular (LV) function can bridge the gap towards personalised medicine using cardiac imaging in patients post-MI. Several novel biomechanical parameters have theoretical prognostic value and may be useful to reflect the biomechanical effects of novel preventive therapy for adverse remodelling post-MI. These parameters include myocardial contractility (regional and global), stiffness and stress. Further, the parameters can be delineated spatially to correspond with infarct pathology and the remote zone. While these parameters hold promise, there are challenges for translating MI modelling into clinical practice, including model uncertainty, validation and verification, as well as time-efficient processing. More research is needed to (1) simplify imaging with CMR in patients post-MI, while preserving diagnostic accuracy and patient tolerance, and (2) assess and validate novel biomechanical parameters against established prognostic biomarkers, such as LV ejection fraction and infarct size. Accessible software packages with minimal user interaction are also needed. Translating benefits to patients will be achieved through a multidisciplinary approach including clinicians, mathematicians, statisticians and industry partners.

Crossref

Enlighten

The Lisbeth Hockey Community Nursing Research Training Fellowship 2008:Final report to the Queens Nursing Institute Scotland

Author: Ferguson Dorothy
Kerr Susan
Lawrence Maggie
McVey Caroline
Publication venue: Glasgow Caledonian University
Publication date: 01/01/2011

ResearchOnline@GCU

Transformations of High-Level Synthesis Codes for High-Performance Computing

Author: Besta Maciej
Hoefler Torsten
Licht Johannes de Fine
Meierhans Simon
Publication date: 29/10/2019

Specialized hardware architectures promise a major step in performance and energy efficiency over the traditional load/store devices currently employed in large scale computing systems. The adoption of high-level synthesis (HLS) from languages such as C/C++ and OpenCL has greatly increased programmer productivity when designing for such platforms. While this has enabled a wider audience to target specialized hardware, the optimization principles known from traditional software design are no longer sufficient to implement high-performance codes. Fast and efficient codes for reconfigurable platforms are thus still challenging to design. To alleviate this, we present a set of optimizing transformations for HLS, targeting scalable and efficient architectures for high-performance computing (HPC) applications. Our work provides a toolbox for developers, where we systematically identify classes of transformations, the characteristics of their effect on the HLS code and the resulting hardware (e.g., increases data reuse or resource consumption), and the objectives that each transformation can target (e.g., resolve interface contention, or increase parallelism). We show how these can be used to efficiently exploit pipelining, on-chip distributed fast memory, and on-chip streaming dataflow, allowing for massively parallel architectures. To quantify the effect of our transformations, we use them to optimize a set of throughput-oriented FPGA kernels, demonstrating that our enhancements are sufficient to scale up parallelism within the hardware constraints. With the transformations covered, we hope to establish a common framework for performance engineers, compiler developers, and hardware developers, to tap into the performance potential offered by specialized hardware architectures using HLS

arXiv.org e-Print Archive

Repository for Publications and Research Data

A highly parameterized and efficient FPGA-based skeleton for pairwise biological sequence alignment

Author: Benkrid Abdsamad
Benkrid K.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2009

Portsmouth University Research Portal (Pure)