Search CORE

132 research outputs found

LEGaTO: first steps towards energy-efficient toolset for heterogeneous computing

Author: Alvarez Carlos
Bautista Leonardo
Becker Tobias
Billung-Meyer Gunnar
Carpenter Paul
Christmann Wolfgang
Cristal Adrian
De La Cruz Raul
Dubhashi Devdatt
Etsion Yoav
Felber Pascal
Fetzer Christof
Gaydadjiev Georgi
Göttel Christian
Hadar Elad
Hagemeyer Jens
Jimenez Daniel
Jungeblut Thorsten
Kaiser Martin
Klawonn Frank
Krupop Stefan
Kucza Nils
Madonar Sergi
Martorell Xavier
Mihklafi Amani
Mudge Trevor
Mudge Trevor
Pasin Marcelo
Pericàs Miquel
Pnevmatikatos Dionisios N.
Porrmann Mario
Port Oron
Rocha Isabelly
Salami Behzad
Salomonsson Hans
Schiavoni Valerio
Trancoso Pedro
Unsal Osman S.
vor dem Berge Micha
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018
Field of study

LEGaTO is a three-year EU H2020 project which started in December 2017. The LEGaTO project will leverage task-based programming models to provide a software ecosystem for Made-in-Europe heterogeneous hardware composed of CPUs, GPUs, FPGAs and dataflow engines. The aim is to attain one order of magnitude energy savings from the edge to the converged cloud/HPC.Peer ReviewedPostprint (author's final draft

Chalmers Research

Publications at Bielefeld University

Optimized FPGA Implementation of Model Predictive Control for Embedded Systems Using High-Level Synthesis Tool

Author: Findeisen R.
Lucia O.
Lucia S.
Navarro D.
Zometa P.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Model predictive control (MPC) is an optimization-based strategy for high-performance control that is attracting increasing interest. While MPC requires the online solution of an optimization problem, its ability to handle multivariable systems and constraints makes it a very powerful control strategy specially for MPC of embedded systems, which have an ever increasing amount of sensing and computation capabilities. We argue that the implementation of MPC on field programmable gate arrays (FPGAs) using automatic tools is nowadays possible, achieving cost-effective successful applications on fast or resource-constrained systems. The main burden for the implementation of MPC on FPGAs is the challenging design of the necessary algorithms. We outline an approach to achieve a software-supported optimized implementation of MPC on FPGAs using high-level synthesis tools and automatic code generation. The proposed strategy exploits the arithmetic operations necessaries to solve optimization problems to tailor an FPGA design, which allows a tradeoff between energy, memory requirements, cost, and achievable speed. We show the capabilities and the simplicity of use of the proposed methodology on two different examples and illustrate its advantages over a microcontroller implementation

Automated optimization of reconfigurable designs

Author: Kurek Maciej
Publication venue: Computing, Imperial College London
Publication date: 01/03/2016
Field of study

Currently, the optimization of reconfigurable design parameters is typically done manually and often involves substantial amount effort. The main focus of this thesis is to reduce this effort. The designer can focus on the implementation and design correctness, leaving the tools to carry out optimization. To address this, this thesis makes three main contributions. First, we present initial investigation of reconfigurable design optimization with the Machine Learning Optimizer (MLO) algorithm. The algorithm is based on surrogate model technology and particle swarm optimization. By using surrogate models the long hardware generation time is mitigated and automatic optimization is possible. For the first time, to the best of our knowledge, we show how those models can both predict when hardware generation will fail and how well will the design perform. Second, we introduce a new algorithm called Automatic Reconfigurable Design Efficient Global Optimization (ARDEGO), which is based on the Efficient Global Optimization (EGO) algorithm. Compared to MLO, it supports parallelism and uses a simpler optimization loop. As the ARDEGO algorithm uses multiple optimization compute nodes, its optimization speed is greatly improved relative to MLO. Hardware generation time is random in nature, two similar configurations can take vastly different amount of time to generate making parallelization complicated. The novelty is efficient use of the optimization compute nodes achieved through extension of the asynchronous parallel EGO algorithm to constrained problems. Third, we show how results of design synthesis and benchmarking can be reused when a design is ported to a different platform or when its code is revised. This is achieved through the new Auto-Transfer algorithm. A methodology to make the best use of available synthesis and benchmarking results is a novel contribution to design automation of reconfigurable systems.Open Acces

Spiral - Imperial College Digital Repository

FPGA acceleration of DNA sequence alignment: design analysis and optimization

Author: Ng Ho-Cheung
Publication venue: Computing, Imperial College London
Publication date: 01/01/2022
Field of study

Existing FPGA accelerators for short read mapping often fail to utilize the complete biological information in sequencing data for simple hardware design, leading to missed or incorrect alignment. In this work, we propose a runtime reconfigurable alignment pipeline that considers all information in sequencing data for the biologically accurate acceleration of short read mapping. We focus our efforts on accelerating two string matching techniques: FM-index and the Smith-Waterman algorithm with the affine-gap model which are commonly used in short read mapping. We further optimize the FPGA hardware using a design analyzer and merger to improve alignment performance. The contributions of this work are as follows. 1. We accelerate the exact-match and mismatch alignment by leveraging the FM-index technique. We optimize memory access by compressing the data structure and interleaving the access with multiple short reads. The FM-index hardware also considers complete information in the read data to maximize accuracy. 2. We propose a seed-and-extend model to accelerate alignment with indels. The FM-index hardware is extended to support the seeding stage while a Smith-Waterman implementation with the affine-gap model is developed on FPGA for the extension stage. This model can improve the efficiency of indel alignment with comparable accuracy versus state-of-the-art software. 3. We present an approach for merging multiple FPGA designs into a single hardware design, so that multiple place-and-route tasks can be replaced by a single task to speed up functional evaluation of designs. We first experiment with this approach to demonstrate its feasibility for different designs. Then we apply this approach to optimize one of the proposed FPGA aligners for better alignment performance.Open Acces

Spiral - Imperial College Digital Repository

FASTER: Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration

Author: Becker T.
Brokalakis A.
Bruneel K.
Böhm P.
Ciobanu C.
Davidson T.
Gaydadjiev G.
Heyse K.
Luk W.
Niu X.
Papadimitriou K.
Papaefstathiou I.
Pau D.
Pell O.
Pilato C.
Pnevmatikatos D
Santambrogio M.D.
Sciuto D.
Stroobandt D.
Todman T.
Vansteenkiste E.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015
Field of study

The FASTER (Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration) EU FP7 project, aims to ease the design and implementation of dynamically changing hardware systems. Our motivation stems from the promise reconfigurable systems hold for achieving high performance and extending product functionality and lifetime via the addition of new features that operate at hardware speed. However, designing a changing hardware system is both challenging and time-consuming. FASTER facilitates the use of reconfigurable technology by providing a complete methodology enabling designers to easily specify, analyze, implement and verify applications on platforms with general-purpose processors and acceleration modules implemented in the latest reconfigurable technology. Our tool-chain supports both coarse- and fine-grain FPGA reconfiguration, while during execution a flexible run-time system manages the reconfigurable resources. We target three applications from different domains. We explore the way each application benefits from reconfiguration, and then we asses them and the FASTER tools, in terms of performance, area consumption and accuracy of analysis

Archivio istituzionale della ricerca - Politecnico di Milano

Chalmers Research

Chalmers Publication Library