98 research outputs found

    Attentive Decision-making and Dynamic Resetting of Continual Running SRNNs for End-to-End Streaming Keyword Spotting

    Efficient end-to-end processing of continuous, streaming signals is one of the key challenges for Artificial Intelligence (AI), in particular for energy-constrained edge applications. Spiking neural networks are explored to achieve efficient edge AI, as their low latency, sparse processing, and small network size result in low-energy operation. Spiking Recurrent Neural Networks (SRNNs) achieve good performance on isolated samples at excellent network size and energy cost. When applied to continual streaming data, like a series of concatenated keyword samples, SRNNs, like traditional RNNs, recognize successive information increasingly poorly as the network dynamics become saturated. SRNNs process concatenated streams of data in three steps: i) relevant signals have to be localized; ii) evidence then needs to be integrated to classify the signal; and finally, iii) the neural dynamics must be combined with network-state resetting events to remedy network saturation. Here we show how a streaming form of attention can aid SRNNs in localizing events in a continuous stream of signals, after which a brain-inspired decision-making circuit integrates evidence to determine the correct classification. This decision then triggers a delayed network reset, remedying network-state saturation. We demonstrate the effectiveness of this approach on streams of concatenated keywords, reporting high accuracy combined with low average network activity, as the attention signal effectively gates network activity in the absence of signals. We also show that the dynamic normalization effected by the attention mechanism enables a degree of environmental transfer learning, where the same keywords obtained in different circumstances are still correctly classified. The principles presented here also carry over to similar applications of classical RNNs and may thus be of general interest for continual running applications.
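The three-step pipeline described above (localize, integrate evidence, reset) can be sketched as a toy recurrent loop. All dimensions, weights, and thresholds below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative only)
N_IN, N_HID, N_CLS = 8, 16, 4
W_in = rng.normal(0, 0.5, (N_HID, N_IN))
W_rec = rng.normal(0, 0.3, (N_HID, N_HID))
W_out = rng.normal(0, 0.5, (N_CLS, N_HID))

def run_stream(x_stream, attn_thresh=0.5, decision_thresh=5.0, leak=0.9):
    """Attention-gated leaky recurrent state with evidence integration
    and a post-decision reset (a sketch of the localize/integrate/reset
    idea, not the paper's SRNN)."""
    h = np.zeros(N_HID)          # recurrent state
    evidence = np.zeros(N_CLS)   # accumulated class evidence
    decisions = []
    for x in x_stream:
        # i) localize: crude energy-based attention gates all activity
        attn = float(np.linalg.norm(x) > attn_thresh)
        h = leak * h + attn * np.tanh(W_in @ x + W_rec @ h)
        # ii) integrate evidence toward a classification
        evidence += attn * (W_out @ h)
        # iii) decide, then reset the network state to avoid saturation
        if evidence.max() > decision_thresh:
            decisions.append(int(evidence.argmax()))
            h[:] = 0.0
            evidence[:] = 0.0
    return decisions

stream = [rng.normal(0, 1.0, N_IN) if 10 <= t < 30 else np.zeros(N_IN)
          for t in range(50)]
print(run_stream(stream))
```

Because the attention gate zeroes all updates when the input is silent, network activity (and hence energy) drops to nothing between signals, mirroring the gating effect reported in the abstract.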

    Principal Theorems of Differential and Integral Calculus: A Guide for Use in Lectures [Hauptsätze der Differential- und Integral-Rechnung] / compiled by Robert Fricke; Part 1

    The conventional approach of moving data to the CPU for computation has become a significant performance bottleneck for emerging scale-out data-intensive applications due to their limited data reuse. At the same time, the advancement in 3D integration technologies has made the decade-old concept of coupling compute units close to the memory — called near-memory computing (NMC) — more viable. Processing right at the “home” of data can significantly diminish the data movement problem of data-intensive applications. In this paper, we survey the prior art on NMC across various dimensions (architecture, applications, tools, etc.) and identify the key challenges and open issues with future research directions. We also provide a glimpse of our approach to near-memory computing that includes i) NMC-specific, microarchitecture-independent application characterization, ii) a compiler framework to offload NMC kernels onto our target NMC platform, and iii) an analytical model to evaluate the potential of NMC.
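The kind of first-order analytical model hinted at in iii) can be illustrated with a back-of-the-envelope energy comparison. The per-op and per-byte costs below are made-up placeholders, not measured values or the survey's model:

```python
def data_movement_energy(bytes_moved, pj_per_byte):
    """Energy (pJ) spent moving data between memory and compute."""
    return bytes_moved * pj_per_byte

def total_energy(ops, bytes_moved, pj_per_op, pj_per_byte):
    """Compute energy plus data-movement energy for one kernel run."""
    return ops * pj_per_op + data_movement_energy(bytes_moved, pj_per_byte)

# Hypothetical kernel: 1e6 ops touching 4 MB exactly once (low data reuse).
ops, bytes_moved = 1_000_000, 4 * 1024 * 1024

# Placeholder costs: moving a byte over the off-chip path is far pricier
# than one compute op; near-memory placement shortens the data path at a
# small compute-efficiency penalty (illustrative numbers only).
cpu = total_energy(ops, bytes_moved, pj_per_op=1.0, pj_per_byte=10.0)
nmc = total_energy(ops, bytes_moved, pj_per_op=1.2, pj_per_byte=1.0)

print(f"CPU-centric: {cpu / 1e6:.1f} uJ, near-memory: {nmc / 1e6:.1f} uJ")
```

With low data reuse, the data-movement term dominates the CPU-centric total, which is exactly the regime where NMC pays off; for compute-bound kernels with high reuse the comparison would flip.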

    Heuristics for scenario creation to enable general loop transformations

    Embedded system applications can have quite complex control flow graphs (CFGs). Often their control flow prohibits design-time optimizations, like advanced global loop transformations. To solve this problem, and enable far more global optimizations, we could consider paths of the CFG in isolation. However, coding all paths separately would cause tremendous code duplication. In practice we have to trade off the extra optimization opportunities against the code size. To make this trade-off, in this paper we use so-called system scenarios. These scenarios bundle similar control paths, while still allowing sufficient optimizations. The problem treated in this paper is: what are the right scenarios, i.e., which paths should be grouped together? For complex CFGs the number of possible scenarios (ways of grouping CFG paths) is huge; it grows exponentially with the number of CFG paths. Therefore heuristics are needed to quickly discover reasonable groupings. The main contribution of this paper is that we propose and evaluate three of these heuristics on both synthetic benchmarks and a real-life application.
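One simple way to bundle similar CFG paths into scenarios is a greedy pass that merges a path into the first scenario it sufficiently resembles. This toy stand-in (Jaccard similarity over sets of basic blocks, a fixed threshold) illustrates the idea of scenario grouping but is not one of the paper's three heuristics:

```python
def jaccard(a, b):
    """Similarity of two sets of basic-block ids."""
    return len(a & b) / len(a | b)

def group_paths(paths, threshold=0.5):
    """Greedy grouping: each path joins the first scenario whose
    representative shares enough basic blocks with it; otherwise it
    founds a new scenario. Avoids enumerating the exponentially many
    possible groupings."""
    scenarios = []  # list of (representative_blocks, [path_ids])
    for pid, blocks in paths.items():
        for rep, members in scenarios:
            if jaccard(rep, blocks) >= threshold:
                members.append(pid)
                break
        else:
            scenarios.append((set(blocks), [pid]))
    return [members for _, members in scenarios]

# Hypothetical CFG paths, each a set of basic-block ids
paths = {
    "p1": {"A", "B", "C", "E"},
    "p2": {"A", "B", "D", "E"},   # similar to p1 -> same scenario
    "p3": {"A", "F", "G"},        # mostly different -> own scenario
}
print(group_paths(paths))  # -> [['p1', 'p2'], ['p3']]
```

Grouping p1 and p2 means only their (large) common code is kept once, trading a little lost specialization for much less code copying — the same trade-off the paper's heuristics navigate.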

    Cross-domain modeling and optimization of high-speed visual servo systems

    High-speed visual servo systems are used in an increasing number of applications. Yet modeling and optimizing these systems remains a research challenge, largely because they consist of tightly coupled design parameters across multiple domains, including image sensors, vision algorithms, processing systems, mechanical systems, and control systems. To overcome this challenge, this work applies an axiomatic design method to the design of high-speed visual servo systems, such that cross-domain couplings are explicitly modeled and subsequently eliminated where possible. More importantly, methods are proposed to model the sample rate, measurement error, and delay of visual feedback based on design parameters across multiple domains. Lastly, methods to construct a holistic model and to perform cross-domain optimization are proposed. The proposed methods are applied to a representative case study that demonstrates the necessity of cross-domain modeling and optimization, as well as the effectiveness of the proposed methods.
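How sample rate and feedback delay emerge from parameters in different domains can be sketched with a minimal timing model. The pipeline stages and numbers below are assumptions for illustration, not the paper's model:

```python
def visual_feedback_timing(exposure_s, readout_s, processing_s):
    """Feedback delay is the sum of the pipeline stages (sensor exposure,
    sensor readout, vision processing); with the stages pipelined, the
    achievable sample rate is limited by the slowest stage. A minimal
    sketch, not the paper's cross-domain model."""
    delay = exposure_s + readout_s + processing_s
    sample_period = max(exposure_s, readout_s, processing_s)
    return delay, 1.0 / sample_period

# Hypothetical stage timings: 0.5 ms exposure, 1 ms readout, 2 ms vision
delay, rate = visual_feedback_timing(exposure_s=0.5e-3,
                                     readout_s=1.0e-3,
                                     processing_s=2.0e-3)
print(f"delay = {delay * 1e3:.1f} ms, sample rate = {rate:.0f} Hz")
```

Even this toy model shows the cross-domain coupling: shortening exposure (a sensor parameter) cuts delay, but only speeding up the bottleneck stage (here the vision algorithm) raises the sample rate available to the controller.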

    Designing energy efficient approximate multipliers for neural acceleration

    Many error-resilient applications can be approximated using multi-layer perceptrons (MLPs) with insignificant degradation in output quality. Faster and more energy-efficient execution of such an application is achieved using a neural accelerator (NA). This work exploits the error-resilience characteristics of an MLP by approximating the accelerator itself. An error-resilience analysis of the MLP is performed to obtain key constraints, which are used for designing energy-efficient approximate multipliers. A systematic methodology for the design of approximate multipliers is used, based on a graph-based netlist modification approach. Approximate versions of basic standard cells are generated and used to replace accurate cells in the synthesized netlist in a systematic, quality-controlled manner. These approximate multipliers are further used for approximating the multiply-and-accumulate (MAC) units in the neural accelerator. The results are validated by considering approximate neural replication of a robotic application, inversek2j. System-level energy savings of up to 14% are obtained for less than 7% degradation in output quality. An average application speedup of 24% is obtained over an accurate neural accelerator. The results are compared with state-of-the-art approximate multipliers, and a comparison with truncation (bit-wise scaling) is performed. Moreover, the error-healing capability of MLPs is shown by studying the impact of retraining on networks with approximate multipliers.
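The truncation (bit-wise scaling) baseline the paper compares against can be shown concretely: drop low-order operand bits before multiplying and measure the resulting error. This is generic truncation, not the paper's cell-replacement method:

```python
def truncated_mult(a, b, drop_bits):
    """Approximate unsigned multiply: zero out the low `drop_bits` of
    each operand before multiplying. A smaller effective operand width
    means a cheaper multiplier array, at the cost of bounded error."""
    a_t = (a >> drop_bits) << drop_bits
    b_t = (b >> drop_bits) << drop_bits
    return a_t * b_t

exact = 173 * 219                                   # 37887
approx = truncated_mult(173, 219, drop_bits=3)      # 36288
rel_err = abs(exact - approx) / exact
print(approx, f"relative error {rel_err:.2%}")
```

A few percent of relative error per MAC is often absorbed by an MLP's inherent error resilience — and, as the abstract notes, retraining the network with the approximate multipliers in the loop can heal much of the remaining quality loss.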

    Quantization of constrained processor data paths applied to convolutional neural networks

    Artificial Neural Networks (NNs) can effectively be used to solve many classification and regression problems, and deliver state-of-the-art performance in the application domains of natural language processing (NLP) and computer vision (CV). However, the tremendous amount of data movement and excessive convolutional workload of these networks hampers large-scale mobile and embedded productization. Therefore these models are generally mapped to energy-efficient accelerators without floating-point support. Weight and data quantization is an effective way to deploy high-precision models to efficient integer-based platforms. In this paper a quantization method for platforms without wide accumulation registers is proposed. Two constraints that maximize the bit width of weights and input data for a given accumulator size are introduced. These constraints exploit knowledge about the weight and data distributions of individual layers. Using these constraints, we propose a layer-wise quantization heuristic to find a good fixed-point network approximation. To reduce the number of configurations to consider, only solutions that fully utilize the available accumulator bits are tested. We demonstrate that 16-bit accumulators are able to obtain a Top-1 classification accuracy within 1% of the floating-point baselines on the CIFAR-10 and ILSVRC2012 image classification benchmarks.
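The accumulator constraint can be made concrete with the naive worst-case bound: summing `fan_in` products of w-bit weights and d-bit unsigned inputs needs roughly w + d + ceil(log2(fan_in)) accumulator bits. This simplified bound is an assumption for illustration, not the paper's distribution-aware constraints:

```python
from math import ceil, log2

def max_data_bits(acc_bits, weight_bits, fan_in):
    """Largest input bit width d such that fan_in worst-case
    (weight_bits x d)-bit unsigned products fit in an acc_bits
    accumulator without overflow:
        weight_bits + d + ceil(log2(fan_in)) <= acc_bits.
    A pessimistic worst-case bound; exploiting actual per-layer weight
    and data distributions (as the paper does) leaves far more bits."""
    headroom = ceil(log2(fan_in))
    return acc_bits - weight_bits - headroom

# Example: 16-bit accumulator, 6-bit weights, a layer with 512 MACs
print(max_data_bits(acc_bits=16, weight_bits=6, fan_in=512))  # -> 1
# A small 16-input layer leaves much more room for the data
print(max_data_bits(acc_bits=16, weight_bits=6, fan_in=16))   # -> 6
```

The first result shows why the worst-case bound is unusable for real layers: with 512 MACs it leaves only 1 bit for the data, which is precisely the motivation for constraints that exploit the layers' actual weight and data distributions.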

    An automated approximation methodology for arithmetic circuits

    Arithmetic circuits like adders and multipliers are key workhorses of many error-resilient applications. Prior efforts on approximating these arithmetic circuits mainly focused on manual circuit-level functional modifications. These manual approaches require high design time and effort, so only a limited number of approximate design points can be generated from the original circuit, leading to a sparsely occupied Pareto front. This work proposes an automated approximation methodology for arithmetic circuits. The proposed method approximates the gate-level standard-cell library and uses these approximate standard cells to modify the netlist of the original circuit. A heuristic design-space exploration methodology is proposed to speed up the design process. We integrate this methodology with the traditional ASIC flow and validate our results using adders and multipliers of different bit widths. We show that our methodology improves on existing state-of-the-art manual as well as automated design techniques by generating non-dominated Pareto fronts. An application case study (Sobel edge detection) is shown using approximate arithmetic circuits generated by our methodology. For the Sobel edge detector, we show up to 50% energy improvements for hardly any quality degradation (PSNR ≥ 20 dB).
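The idea of swapping accurate cells for approximate ones in a netlist can be mimicked in a bit-level simulation: build a ripple-carry adder from full-adder "cells" and replace the cells at the low-order positions with a cheaper approximate variant. The particular approximate cell below (sum computed as OR) is an illustrative choice, not one of the paper's generated cells:

```python
def full_adder(a, b, cin):
    """Accurate full-adder cell (1-bit operands)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def approx_full_adder(a, b, cin):
    """Illustrative approximate cell: sum approximated as OR (cheaper
    logic), carry kept exact. Assumed for demonstration only."""
    return (a | b | cin), (a & b) | (a & cin) | (b & cin)

def ripple_add(x, y, bits, approx_low_bits=0):
    """Ripple-carry adder in which the lowest `approx_low_bits`
    positions use the approximate cell -- mimicking quality-controlled
    netlist cell replacement."""
    carry, result = 0, 0
    for i in range(bits):
        cell = approx_full_adder if i < approx_low_bits else full_adder
        s, carry = cell((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result

exact = ripple_add(0b1011, 0b0110, bits=5)                      # 17
approx = ripple_add(0b1011, 0b0110, bits=5, approx_low_bits=2)  # 19
print(exact, approx)
```

Because only low-order cells are replaced, the error stays bounded by the weight of those positions; sweeping `approx_low_bits` generates a family of design points, which is how automated cell replacement densifies the quality-versus-energy Pareto front.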