Search CORE

5,890 research outputs found

ParaFPGA 2013: Harnessing Programs, Power and Performance in Parallel FPGA applications

Author: D'Hollander Erik
Stroobandt Dirk
Touhafi Abdellah
Publication venue: 'IOS Press'
Publication date: 01/01/2014
Field of study

Future computing systems will require dedicated accelerators to achieve high-performance. The mini-symposium ParaFPGA explores parallel computing with FPGAs as an interesting avenue to reduce the gap between the architecture and the application. Topics discussed are the power of functional and dataflow languages, the performance of high-level synthesis tools, the automatic creation of hardware multi-cores using C-slow retiming, dynamic power management to control the energy consumption, real-time reconfiguration of streaming image processing filters and memory optimized event image segmentation

Ghent University Academic Bibliography

Workload-aware Automatic Parallelization for Multi-GPU DNN Training

Author: Choi Jungwook
Jo Youngmin
Shin Sungho
Srinivasan Vijayalakshmi
Sung Wonyong
Venkataramani Swagath
Publication venue
Publication date: 06/02/2019
Field of study

Deep neural networks (DNNs) have emerged as successful solutions for variety of artificial intelligence applications, but their very large and deep models impose high computational requirements during training. Multi-GPU parallelization is a popular option to accelerate demanding computations in DNN training, but most state-of-the-art multi-GPU deep learning frameworks not only require users to have an in-depth understanding of the implementation of the frameworks themselves, but also apply parallelization in a straight-forward way without optimizing GPU utilization. In this work, we propose a workload-aware auto-parallelization framework (WAP) for DNN training, where the work is automatically distributed to multiple GPUs based on the workload characteristics. We evaluate WAP using TensorFlow with popular DNN benchmarks (AlexNet and VGG-16), and show competitive training throughput compared with the state-of-the-art frameworks, and also demonstrate that WAP automatically optimizes GPU assignment based on the workload's compute requirements, thereby improving energy efficiency.Comment: This paper is accepted in ICASSP201

arXiv.org e-Print Archive

Crossref

SNU Open Repository and Archive

Improving the scalability of parallel N-body applications with an event driven constraint based execution model

Author: Aarseth SJ
Alfieri RA
Bonachea D
Chandra R
Dekate C
El-Ghazawi T
Hewitt C
Kale L
Message Passing Interface Forum
O’Shea BW
Salmon JK
Singh JP
Publication venue: 'SAGE Publications'
Publication date: 23/09/2011
Field of study

The scalability and efficiency of graph applications are significantly constrained by conventional systems and their supporting programming models. Technology trends like multicore, manycore, and heterogeneous system architectures are introducing further challenges and possibilities for emerging application domains such as graph applications. This paper explores the space of effective parallel execution of ephemeral graphs that are dynamically generated using the Barnes-Hut algorithm to exemplify dynamic workloads. The workloads are expressed using the semantics of an Exascale computing execution model called ParalleX. For comparison, results using conventional execution model semantics are also presented. We find improved load balancing during runtime and automatic parallelism discovery improving efficiency using the advanced semantics for Exascale computing.Comment: 11 figure

arXiv.org e-Print Archive

Crossref

Programs as Polypeptides

Author: Williams Lance R.
Publication venue
Publication date: 04/06/2015
Field of study

We describe a visual programming language for defining behaviors manifested by reified actors in a 2D virtual world that can be compiled into programs comprised of sequences of combinators that are themselves reified as actors. This makes it possible to build programs that build programs from components of a few fixed types delivered by diffusion using processes that resemble chemistry as much as computation.Comment: in European Conference on Artificial Life (ECAL '15), York, UK, 201

arXiv.org e-Print Archive

CiteSeerX

Crossref