525 research outputs found

PROGRESS white papers 2006: embedded systems design, networks and connected systems, verification and validation, networks on chip

Author: Corporaal H.
Niemegeers I.G.M.M.
Vaandrager F.W.
Publication venue: STW Technology Foundation
Publication date: 01/01/2006

Repository TU/e

Pure OAI Repository

The effect of process switches on branch prediction accuracy

Author: Corporaal H.
Kisuki T.
Knijnenburg P.M.W.
Publication date: 01/01/2000

Repository TU/e

Pure OAI Repository

Predicting implementation accuracy for real-time control systems

Author: Corporaal H.
Florescu O.
Voeten J.P.M.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2005

Repository TU/e

Pure OAI Repository

Inter-tile reuse optimization applied to bandwidth constrained embedded accelerators

Author: Corporaal H.
Mesman B.
Peemen M.C.J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

The adoption of High-Level Synthesis (HLS) tools has significantly reduced accelerator design time. A complex scaling problem that remains is the data transfer bottleneck. To scale up performance, accelerators require huge amounts of data and are often limited by interconnect resources. In addition, the energy spent by the accelerator is often dominated by the transfer of data, either in the form of memory references or data movement on the interconnect. In this paper we drastically reduce accelerator communication by exploring computation reordering and local buffer usage. To that end, we present a new analytical methodology to optimize nested loops for inter-tile data reuse with loop transformations such as interchange and tiling. We focus on embedded accelerators that can be used in a multi-accelerator System on Chip (SoC), so performance, area, and energy are key in this exploration. 1) On three common embedded applications in the image/video processing domain (demosaicing, block matching, object detection), we show that our methodology reduces data movement by up to 2.1x compared to the best case of intra-tile optimization. 2) We demonstrate that our small accelerators (1-3% of FPGA resources) can boost a simple MicroBlaze soft-core to the performance level of a high-end Intel i7 processor.
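
The difference between intra-tile and inter-tile reuse can be illustrated with a toy model. The sketch below is not the paper's analytical methodology; it assumes a 1D sliding-window kernel, tiles visited in order, and a local buffer large enough to hold the overlap between neighbouring tiles. The function names and the example sizes are hypothetical.

```python
def loads_without_inter_tile_reuse(n_outputs, kernel, tile):
    """Each tile reloads its full input window of width + kernel - 1 elements."""
    loads = 0
    for start in range(0, n_outputs, tile):
        width = min(tile, n_outputs - start)
        loads += width + kernel - 1
    return loads


def loads_with_inter_tile_reuse(n_outputs, kernel, tile):
    """Keep the previous tile's inputs in the buffer; load only new elements."""
    loads = 0
    buffered_until = 0  # exclusive index of the last input already in the buffer
    for start in range(0, n_outputs, tile):
        width = min(tile, n_outputs - start)
        window_end = start + width + kernel - 1
        first_needed = max(start, buffered_until)
        loads += window_end - first_needed
        buffered_until = window_end
    return loads


# Hypothetical sizes: 64 outputs, a 5-tap window, tiles of 8 outputs.
plain = loads_without_inter_tile_reuse(64, 5, 8)
reused = loads_with_inter_tile_reuse(64, 5, 8)
print(plain, reused)
```

In this model the saving grows as tiles shrink or the window widens, because the overlap between neighbouring tiles becomes a larger share of each tile's input.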

Repository TU/e

Crossref

Pure OAI Repository

Data- and task parallel image processing on a mixed SIMD-ILP platform using skeletons and asynchronous RPC

Author: Caarls W.
Corporaal H.
Jonker P.P.
Publication venue: STW Technology Foundation
Publication date: 01/01/2004

Pure OAI Repository

Instruction-set architecture exploration of VLIW ASIPs using a genetic algorithm

Author: Corporaal H.
Jordans R.
Jozwiak L.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Genetic algorithms are commonly used for automatically solving complex design problems because exploration using genetic algorithms can consistently deliver good results when the algorithm is given a long enough run-time. However, the exploration time for problems with huge design spaces can be very long, often making exploration using a genetic algorithm practically infeasible. In this work, we present a genetic algorithm for exploring the instruction-set architecture of VLIW ASIPs and demonstrate its effectiveness by comparing it to two heuristic algorithms. We present several optimizations to the genetic algorithm configuration, and demonstrate how caching of intermediate compilation and simulation results can reduce the exploration time by an order of magnitude.
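
The idea of caching evaluation results inside a genetic algorithm can be sketched as follows. This is a minimal illustration, not the authors' implementation: the bit-string genome, the `toy_evaluate` cost function (a stand-in for compiling and simulating a candidate architecture), and all parameter values are assumptions made for the example.

```python
import random


def make_cached_fitness(evaluate):
    """Wrap an expensive evaluation so each distinct genome is evaluated once."""
    cache = {}
    stats = {"calls": 0, "evaluations": 0}

    def fitness(genome):
        stats["calls"] += 1
        key = tuple(genome)
        if key not in cache:
            stats["evaluations"] += 1
            cache[key] = evaluate(genome)
        return cache[key]

    return fitness, stats


def toy_evaluate(genome):
    # Hypothetical stand-in for "compile and simulate": units in even
    # positions add speed, every enabled unit costs area.
    speed = sum(bit for i, bit in enumerate(genome) if i % 2 == 0)
    area = sum(genome)
    return 2 * speed - area


def run_ga(fitness, genome_len=8, pop_size=12, generations=20, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)
            child = a[:cut] + b[cut:]  # one-point crossover
            child[rng.randrange(genome_len)] ^= 1  # single-bit mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


fitness, stats = make_cached_fitness(toy_evaluate)
best = run_ga(fitness)
print(best, stats)
```

Surviving parents are re-scored every generation, so the gap between `stats["calls"]` and `stats["evaluations"]` shows how much work the cache avoids in this toy setting.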

Crossref

Pure OAI Repository

A data-reuse aware accelerator for large-scale convolutional networks

Author: Corporaal H.
Mesman B.
Peemen M.C.J.
Publication date: 01/01/2014

This paper presents a clustered SIMD accelerator template for Convolutional Networks. These networks significantly outperform other methods in detection and classification tasks in the vision domain. Due to their excessive compute and data transfer requirements, these applications benefit greatly from a dedicated accelerator. The proposed accelerator reduces memory traffic through loop transformations such as tiling, and through fusion to merge successive layers. Although fusion can introduce redundant computations, it often reduces data transfer and can therefore remove performance bottlenecks. The SIMD cluster is mapped to a Xilinx Zynq FPGA, where it achieves 6.4 Gops with a small amount of resources. The performance can be scaled by using multiple clusters.

CiteSeerX

Repository TU/e

Pure OAI Repository

Closed-Loop Evaluation of an Embedded Visual Servo System

Author: Corporaal H.
Jonker P.P.
Nijmeijer H.
Ye Z.
Publication date: 01/01/2012

No abstract

Pure OAI Repository

Property-preserving synthesis for unified control- and data-oriented models

Author: Corporaal H.
Florescu O.
Voeten J.P.M.
Publication venue: Springer
Publication date: 01/01/2006

In the software/hardware engineering model-driven design methodology, preservation of real-time system properties can be guaranteed in the model synthesis up to a small time-deviation. Therefore, this methodology is well suited for the design and implementation of control systems in which execution times of actions are small, so that the resulting time-deviations are small as well. However, in systems containing time-intensive computations, the time-deviations become large and, consequently, the real-time properties are much weakened. This chapter proposes an approach for obtaining stronger preservation of the observable properties of the system by abstracting from its internal unobservable actions. In this way, a unified approach to the analysis and synthesis of a larger class of real-time applications can be obtained, which enables designers to achieve predictability in the design of many systems.

Repository TU/e

Pure OAI Repository