1,974 research outputs found
Recommended from our members
Memory-Based High-Level Synthesis Optimizations Security Exploration on the Power Side-Channel
High-level synthesis (HLS) allows hardware designers to think algorithmically and not worry about low-level, cycle-by-cycle details. This provides the ability to quickly explore the architectural design space and tradeoffs between resource utilization and performance. Unfortunately, security evaluation is not a standard part of the HLS design flow. In this article, we aim to understand the effects of memory-based HLS optimizations on power side-channel leakage. We use Xilinx Vivado HLS to develop different cryptographic cores, implement them on a Spartan-6 FPGA, and collect power traces. We evaluate the designs with respect to resource utilization, performance, and information leakage through power consumption. We have two important observations and contributions. First, the choice of resource optimization directive results in different levels of side-channel vulnerabilities. Second, the partitioning optimization directive can greatly compromise the hardware cryptographic system through power side-channel leakage due to the deployment of memory control logic. We describe an evaluation procedure for power side-channel leakage and use it to make best-effort recommendations about how to design more secure architectures in the cryptographic domain
From MARTE to Reconfigurable NoCs: A model driven design methodology
Due to the continuous exponential rise in SoC's design complexity, there is a critical need to find new seamless methodologies and tools to handle the SoC co-design aspects. We address this issue and propose a novel SoC co-design methodology based on Model Driven Engineering and the MARTE (Modeling and Analysis of Real-Time and Embedded Systems) standard proposed by Object Management Group, to raise the design abstraction levels. Extensions of this standard have enabled us to move from high level specifications to execution platforms such as reconfigurable FPGAs. In this paper, we present a high level modeling approach that targets modern Network on Chips systems. The overall objective: to perform system modeling at a high abstraction level expressed in Unified Modeling Language (UML); and afterwards, transform these high level models into detailed enriched lower level models in order to automatically generate the necessary code for final FPGA synthesis
Profile Guided Dataflow Transformation for FPGAs and CPUs
This paper proposes a new high-level approach for optimising field programmable gate array (FPGA) designs. FPGA designs are commonly implemented in low-level hardware description languages (HDLs), which lack the abstractions necessary for identifying opportunities for significant performance improvements. Using a computer vision case study, we show that modelling computation with dataflow abstractions enables substantial restructuring of FPGA designs before lowering to the HDL level, and also improve CPU performance. Using the CPU transformations, runtime is reduced by 43 %. Using the FPGA transformations, clock frequency is increased from 67MHz to 110MHz. Our results outperform commercial low-level HDL optimisations, showcasing dataflow program abstraction as an amenable computation model for highly effective FPGA optimisation
Module-per-Object: a Human-Driven Methodology for C++-based High-Level Synthesis Design
High-Level Synthesis (HLS) brings FPGAs to audiences previously unfamiliar to
hardware design. However, achieving the highest Quality-of-Results (QoR) with
HLS is still unattainable for most programmers. This requires detailed
knowledge of FPGA architecture and hardware design in order to produce
FPGA-friendly codes. Moreover, these codes are normally in conflict with best
coding practices, which favor code reuse, modularity, and conciseness.
To overcome these limitations, we propose Module-per-Object (MpO), a
human-driven HLS design methodology intended for both hardware designers and
software developers with limited FPGA expertise. MpO exploits modern C++ to
raise the abstraction level while improving QoR, code readability and
modularity. To guide HLS designers, we present the five characteristics of MpO
classes. Each characteristic exploits the power of HLS-supported modern C++
features to build C++-based hardware modules. These characteristics lead to
high-quality software descriptions and efficient hardware generation. We also
present a use case of MpO, where we use C++ as the intermediate language for
FPGA-targeted code generation from P4, a packet processing domain specific
language. The MpO methodology is evaluated using three design experiments: a
packet parser, a flow-based traffic manager, and a digital up-converter. Based
on experiments, we show that MpO can be comparable to hand-written VHDL code
while keeping a high abstraction level, human-readable coding style and
modularity. Compared to traditional C-based HLS design, MpO leads to more
efficient circuit generation, both in terms of performance and resource
utilization. Also, the MpO approach notably improves software quality,
augmenting parametrization while eliminating the incidence of code duplication.Comment: 9 pages. Paper accepted for publication at The 27th IEEE
International Symposium on Field-Programmable Custom Computing Machines, San
Diego CA, April 28 - May 1, 201
Separation logic for high-level synthesis
High-level synthesis (HLS) promises a significant shortening of the digital hardware design cycle by raising the abstraction level of the design entry to high-level languages such as C/C++. However, applications using dynamic, pointer-based data structures remain difficult to implement well, yet such constructs are widely used in software. Automated optimisations that leverage the memory bandwidth of dedicated hardware implementations by distributing the application data over separate on-chip memories and parallelise the implementation are often ineffective in the presence of dynamic data structures, due to the lack of an automated analysis that disambiguates pointer-based memory accesses. This thesis takes a step towards closing this gap. We explore recent advances in separation logic, a rigorous mathematical framework that enables formal reasoning about the memory access of heap-manipulating programs. We develop a static analysis that automatically splits heap-allocated data structures into provably disjoint regions. Our algorithm focuses on dynamic data structures accessed in loops and is accompanied by automated source-to-source transformations which enable loop parallelisation and physical memory partitioning by off-the-shelf HLS tools.
We then extend the scope of our technique to pointer-based memory-intensive implementations that require access to an off-chip memory. The extended HLS design aid generates parallel on-chip multi-cache architectures. It uses the disjointness property of memory accesses to support non-overlapping memory regions by private caches. It also identifies regions which are shared after parallelisation and which are supported by parallel caches with a coherency mechanism and synchronisation, resulting in automatically specialised memory systems. We show up to 15x acceleration from heap partitioning, parallelisation and the insertion of the custom cache system in demonstrably practical applications.Open Acces
Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review
The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER
- …