Recommended from our members

Learning-based system-level power modeling of hardware IPs

Author: Lee Dongwook
Publication date: 18/12/2017

Accurate power models for hardware components at high levels of abstraction are critical to enabling system-level power analysis and optimization. Virtual platform prototypes are widely used to support early system-level design space exploration. There is, however, a lack of accurate and fast power models of hardware components at such high levels of abstraction. In this dissertation, we present novel learning-based approaches for extending fast functional simulation models of white-, gray-, and black-box custom hardware intellectual property components (IPs) with accurate power estimates. Depending on the observability, we extend high-level functional models with the capability to capture data-dependent resource, block, or I/O activity without a significant loss in simulation speed. We further leverage state-of-the-art machine learning techniques to synthesize abstract power models that can predict cycle-, block-, and invocation-level power from low-level hardware implementations, where we introduce novel structural decomposition techniques to reduce model complexity and increase estimation accuracy. Our white-box approach integrates with existing high-level synthesis (HLS) tools to automatically extract resource mapping information, which is used to trace data-dependent resource-level activity and drive a cycle-accurate online power-performance model during functional simulation. Our gray-box approach supports power estimation at coarser basic-block granularity. It uses only limited information about block inputs and outputs to extract lightweight block-level activity from a functional simulation and drive a basic-block-level power model that utilizes a control flow decomposition to improve accuracy and speed. It is faster than cycle-level models while providing a finer granularity than invocation-level models, which allows further navigation of accuracy and speed trade-offs.
We finally propose a novel approach for extending behavioral models of black-box hardware IPs with an invocation-level power estimate. Our black-box model uses only input and output history to track data-dependent pipeline behavior, where we introduce a specialized ensemble learning approach composed of individually selected cycle-by-cycle models with reduced complexity and increased accuracy. The proposed approaches are fully automated by integrating with existing commercial HLS tools for custom hardware synthesized by HLS. Results of applying our approaches to various industrial-strength design examples show that our power models can predict cycle-, basic-block-, and invocation-level power consumption to within 10%, 9%, and 3% of a commercial gate-level power estimation tool, respectively, all while running several orders of magnitude faster, at speeds of 1-10 Mcycles/sec.
Electrical and Computer Engineerin

Texas ScholarWorks

Optimierung der Energie und Power getriebenen Architekturexploration für Multicore und heterogenes System on Chip

Author: Shah Syed Abbas Ali
Publication date: 01/01/2018

The contribution of this work builds on established virtual prototype platforms to improve both SoC design quality and productivity. Initially, an automatic system-level power estimation framework was developed to address the critical issue of early power estimation in SoC design. The estimation framework models the static and dynamic power consumption of the hardware components. These models are created from the normalized values of the basic design components of the SoC, obtained through a one-time power simulation of RTL hardware models. The framework allows dynamic technology node reconfiguration for power estimation models. Its instantaneous power reporting aids the detection of possible hotspots early in the design process. With this additional data and a steadily growing design space of complex heterogeneous SoCs, finding the right parameter configuration is a challenging and laborious task for a system-level designer. This work addresses this bottleneck by optimizing the design space exploration (DSE) process for MPSoC design. An automatic DSE framework for virtual platforms (VPs) was developed that is flexible and allows the selection of an optimal parameter configuration without pre-existing knowledge. To reduce exploration time, the framework is equipped with several multi-objective optimization techniques based on simulated annealing and a genetic algorithm. Lastly, to aid HW/SW partitioning at the system level, a flexible and automated workflow (SW2TLM) is presented. It allows the designer to explore various possible partitioning scenarios without going into the depth of the hardware architecture complexity and software integration. The framework generates system-level hardware accelerators from corresponding functionality encoded in the software code and integrates them into the VP.
The power consumption and speedup of the acceleration are reported to the designer, which further increases the quality and productivity of the development process towards the final architecture. The presented tools are evaluated using a state-of-the-art VP for a range of single- and multi-core applications. In terms of the energy-delay product, exploration time was reduced by approximately 62% (worst case) while maintaining an optimal-parameter accuracy of 90% compared to previous techniques. SW2TLM further increases the exploration versatility by combining modern high-level synthesis with system-level architectural exploration.
The contribution of this work builds on the established concept of virtual prototype (VP) platforms to improve the quality and productivity of the design process. First, an automatic system-level framework was developed to enable power estimation for SoC designs at a significantly earlier development phase. To this end, the static and dynamic energy consumption shares of individual hardware elements are expressed through an abstract model. The framework allows dynamic adaptation of the technology node as well as the integration of new power models for third-party components. Continuous capture of the energy consumption characteristics and their graphical display in the user interface additionally support the early identification of possible hotspots. With the additional data provided, combined with a steadily growing design space of complex SoCs, identifying the right parameter configuration is a time-consuming task. The presented concepts allow increased automation of the exploration process.
Multi-dimensional optimization techniques based on simulated annealing and genetic algorithms allow suitable configurations to be identified without prior knowledge or experience. Finally, a flexible and automated workflow was developed to support HW/SW partitioning at the system level. It enables the designer to investigate various possible partitioning scenarios without delving into the complexity of the hardware architecture and the software integration. The framework generates abstract accelerator models from corresponding software functions and integrates them seamlessly into the executable VP. Detailed data on the energy consumption, speedup factor, and communication overhead of the partitioning are captured and made available to the designer, which further increases the quality and productivity of the development process. The presented tools are evaluated with a modern VP for various SW applications. In terms of the energy-delay product, a reduction in exploration time of more than 62% at 90% parameter accuracy was observed. Building on this, the automated investigation of different HW/SW partitionings eases the development of heterogeneous architectures by combining modern HLS with architecture exploration at the system level.

Digitale Bibliothek Braunschweig

Operand Value Based Modeling and Estimation of Dynamic Energy Consumption of Soft Processors in FPGA

Author: Al-Khatib Zaid A. M.
Publication date: 15/04/2013

This thesis presents a novel method for estimating the dynamic energy consumption of soft processors in FPGAs using an operand-value-based model. The processor energy model is created at the instruction level, which enables fast, early, and accurate energy estimation. The modeling heuristic is based on the observation that the energy required to execute instructions on an FPGA implementation of a soft processor depends strongly on the operand values. Our energy model contains three components: the instruction base energy, the maximum variation in the instruction energy due to input data, and the impact of the ones density of the operand values during software execution. The ones density refers to the number of operand bits that are set to one. We use post-place-and-route processor simulations as a reference to evaluate the accuracy of our model, and that of other existing instruction-level energy models, for several benchmarks. We demonstrate that our model has only 4.7% average error and 12% worst-case error compared to the reference, and is more than twice as accurate as existing instruction-level models.
Key Words: Energy modeling, Soft processors, System-level design, Power estimation
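
The three components named in this abstract can be illustrated with a minimal sketch. The abstract does not say how the components are combined, so the linear form below (base energy plus maximum variation scaled by the ones density) is an assumption, as is normalizing the ones density to a fraction of the operand bits. The names `InstructionEnergy`, `TABLE`, and all numeric values are hypothetical placeholders, not data from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstructionEnergy:
    """Per-instruction model parameters (hypothetical values, in nJ)."""
    base_nj: float           # instruction base energy
    max_variation_nj: float  # maximum data-dependent variation


def ones_density(operands, width=32):
    """Fraction of operand bits set to one (assumed normalization)."""
    if not operands:
        return 0.0
    mask = (1 << width) - 1
    ones = sum(bin(op & mask).count("1") for op in operands)
    return ones / (width * len(operands))


def instruction_energy(params, operands, width=32):
    """Assumed linear form: base + max_variation * ones_density."""
    density = ones_density(operands, width)
    return params.base_nj + params.max_variation_nj * density


# Hypothetical characterization table, not measured data.
TABLE = {
    "add": InstructionEnergy(base_nj=1.0, max_variation_nj=0.4),
    "mul": InstructionEnergy(base_nj=2.0, max_variation_nj=1.0),
}


def program_energy(trace, width=32):
    """Sum the estimated energy over a trace of (mnemonic, operands) pairs."""
    return sum(
        instruction_energy(TABLE[op], operands, width)
        for op, operands in trace
    )
```

Under these assumptions, an instruction whose operands are all zeros costs only its base energy, and one whose operands are all ones costs the base energy plus the full variation.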

Concordia University Research Repository

Architectural level delay and leakage power modelling of manufacturing process variation

Author: Ni Chenxi
Publication venue: Newcastle University
Publication date: 01/01/2013

PhD Thesis
The effect of manufacturing process variations has become a major issue in the estimation of circuit delay and power dissipation, and will gain more importance in the future as device scaling continues in order to satisfy marketplace demands for circuits with greater performance and functionality per unit area. Statistical modelling and analysis approaches have been widely used to reflect the effects of a variety of variational process parameters on system performance factors, which are described as probability density functions (PDFs). At present, most investigations into statistical models have been limited to small circuits such as a logic gate. However, the massive size of present-day electronic systems precludes the use of design techniques which consider a system to comprise these basic gates, as this level of design is very inefficient and error-prone. This thesis proposes a methodology to bring the effects of process variation from transistor level up to architectural level in terms of circuit delay and leakage power dissipation. Using a first-order canonical model and a statistical analysis approach, a statistical cell library has been built which comprises not only the basic gate cell models, but also more complex functional blocks such as registers, FIFOs, counters, ALUs, etc. Furthermore, other factors to which the overall system performance is sensitive, such as input signal slope, output load capacitance, different signal switching cases, and transition types, are also taken into account for each cell in the library, which makes it adaptive to an incremental circuit design. The proposed methodology enables an efficient analysis of process variation effects on system performance with significantly reduced computation time compared to the Monte Carlo simulation approach.
As a demonstration vehicle for this technique, the delay and leakage power distributions of a 2-stage asynchronous micropipeline circuit have been simulated using this cell library. The experimental results show that the proposed method can predict the delay and leakage power distributions with less than 5% error and at least 50,000 times faster computation time compared to a 5000-sample SPICE-based Monte Carlo simulation. The methodology presented here for modelling process variability plays a significant role in Design for Manufacturability (DFM) by quantifying the direct impact of process variations on system performance. The advantages of being able to undertake this analysis at a high level of abstraction, and thus early in the design cycle, are twofold. First, if the predicted effects of process variation render the circuit performance outwith specification, design modifications can be readily incorporated to rectify the situation. Second, knowing the acceptable limits of process variation that keep design performance within its specification, informed choices can be made regarding the implementation technology and the manufacturer selected to fabricate the design.
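
A first-order canonical model of the kind mentioned in this abstract is commonly written as a mean plus linear sensitivities to process parameters plus an independent random term. The sketch below shows how two such delay models could be added along a serial path. It assumes that the shared process parameters are independent with unit variance and that each cell's random term is independent; the class name `Canonical` and the example numbers are hypothetical and are not taken from the thesis. The statistical maximum needed at converging paths, and leakage power, are not covered.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Canonical:
    """First-order canonical form: mean + sum(sens[i] * X_i) + rand * R.

    X_i are shared process parameters and R is a cell-local random term,
    all assumed independent with zero mean and unit variance.
    """
    mean: float
    sens: tuple
    rand: float

    def __add__(self, other):
        if len(self.sens) != len(other.sens):
            raise ValueError("operands must share the same parameter set")
        return Canonical(
            mean=self.mean + other.mean,
            # Shared parameters are fully correlated, so sensitivities add.
            sens=tuple(a + b for a, b in zip(self.sens, other.sens)),
            # Independent random terms combine by root-sum-square.
            rand=math.hypot(self.rand, other.rand),
        )

    def std(self):
        """Standard deviation implied by the canonical form."""
        return math.sqrt(sum(a * a for a in self.sens) + self.rand ** 2)


# Two hypothetical cell delays (ps) on a serial path.
stage1 = Canonical(mean=10.0, sens=(3.0, 0.0), rand=4.0)
stage2 = Canonical(mean=20.0, sens=(0.0, 4.0), rand=3.0)
path = stage1 + stage2
```

Because the sum of two canonical forms is again a canonical form, the path delay distribution is obtained in closed form, which is one reason such models can avoid per-sample Monte Carlo simulation.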

Newcastle University eTheses

Fast Power and Energy Efficiency Analysis of FPGA-based Wireless Base-band Processing

Authors: Hélard Maryline; Lorandel Jordane; Prévotet Jean-Christophe
Publication date: 05/01/2016

Demands for high performance keep increasing in the wireless communication domain. This leads to a steady rise in complexity, and designing such systems has become a challenging task. In this context, energy efficiency is considered a key topic, especially for embedded systems, in which the design space is often very constrained. In this paper, a fast and accurate power estimation approach for FPGA-based hardware systems is applied to a typical wireless communication system. It aims to provide power estimates of complete systems prior to their implementation. This is made possible by using a dedicated library of high-level models that are representative of hardware IPs. Based on high-level simulations, design space exploration becomes much faster and easier. The definition of a scenario and the monitoring of the IPs' time-activities facilitate the comparison of several domain-specific systems. The proposed approach and its benefits are demonstrated through a typical use case in the wireless communication domain.
Comment: Presented at HIP3ES, 201

arXiv.org e-Print Archive

HAL-CentraleSupelec

HAL-Rennes 1

Data-dependent cycle-accurate power modeling of RTL-level IPs using machine learning

Author: Srour Malek
Publication date: 07/08/2018

In a chip design project, early design planning has a strong impact on the schedule and the cost of design. Power estimation is part of early design planning, and it greatly affects design decisions. Power modeling performed at a high level of abstraction is fast but inaccurate due to the lack of circuit switching activity information. By contrast, power modeling performed at a low level of abstraction is more accurate because the synthesized circuit is known, but this simulation is typically slow. This report explores a power modeling approach performed at register transfer level (RTL). It exploits machine learning models in order to obtain a fast yet relatively accurate cycle-by-cycle power estimation. The approach is data-dependent: cycle-specific models are trained on the switching activity of signals obtained from RTL simulation and on cycle-by-cycle power values obtained from a reference gate-level simulation of an existing RTL design. Therefore, if any changes are applied to the RTL design, the models must be retrained. The approach aims at obtaining fast yet accurate power predictions for new invocations of a given trained model using signal activity information collected during simulation of the unmodified RTL. At a low level, the complete visibility of signals in a design might, counterintuitively, cause overtraining of the model, leading to inaccurate estimation. The suggested model employs automatic feature selection in each cycle. Based on the invocations used to train the cycle-by-cycle models, only signals that may switch during a given cycle are selected as the features for their respective cycle-specific model. The method was tested on an 8-by-8 DCT design, and the power estimates were within 6.5% of those from a commercial power analysis tool.
This report also simulates and compares the approach of cycle-specific models with the approach of a single global model for all cycles and shows that the cycle-specific approach is twice as accurate.
Electrical and Computer Engineerin
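
The per-cycle feature selection described in this abstract can be sketched as follows. This is only an illustration of the selection rule stated above (keep, for each cycle, the signals that switch in at least one training invocation); the nested-list data layout and the function names `select_features_per_cycle` and `project` are assumptions, and the regression model that would be trained on the selected features is not shown.

```python
def select_features_per_cycle(activity):
    """Pick, for each cycle, the signals that toggle in any training invocation.

    activity[i][c][s] is 1 if signal s toggles in cycle c of invocation i,
    else 0 (assumed layout). Returns one sorted list of signal indices
    per cycle.
    """
    if not activity:
        return []
    n_cycles = len(activity[0])
    n_signals = len(activity[0][0]) if n_cycles else 0
    selected = []
    for c in range(n_cycles):
        active = [
            s for s in range(n_signals)
            if any(invocation[c][s] for invocation in activity)
        ]
        selected.append(active)
    return selected


def project(invocation, selected):
    """Reduce one invocation to per-cycle feature vectors of selected signals."""
    return [
        [invocation[c][s] for s in features]
        for c, features in enumerate(selected)
    ]


# Two hypothetical training invocations, two cycles, three signals.
training = [
    [[1, 0, 0], [0, 0, 1]],
    [[0, 0, 0], [0, 1, 1]],
]
features = select_features_per_cycle(training)
reduced = project(training[1], features)
```

Each cycle-specific model then sees only the signals that can switch in that cycle, which shrinks the feature vector and is the mechanism the abstract credits with avoiding overtraining.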

Texas ScholarWorks

PresenceSense: Zero-training Algorithm for Individual Presence Detection based on Power Monitoring

Authors: Jia Ruoxi; Jin Ming; Kang Zhaoyi; Konstantakopoulos Ioannis C.; Spanos Costas
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 16/07/2014

Non-intrusive presence detection of individuals in commercial buildings is much easier to implement than intrusive methods such as passive infrared sensors, acoustic sensors, and cameras. Individual power consumption, while providing useful feedback and motivation for energy saving, can be used as a valuable source for presence detection. We conduct pilot experiments in an office setting to collect individual presence data from ultrasonic sensors, acceleration sensors, and WiFi access points, in addition to the individual power monitoring data. PresenceSense (PS), a semi-supervised learning algorithm based on power measurement that trains itself with only unlabeled data, is proposed, analyzed, and evaluated in the study. Without any labeling effort, which is usually tedious and time-consuming, PresenceSense outperforms popular models whose parameters are optimized over a large training set. The results are interpreted, and potential applications of PresenceSense to other data sources are discussed. This study is relevant to space security, occupancy behavior modeling, and energy saving of plug loads.
Comment: BuildSys 201

arXiv.org e-Print Archive

eScholarship - University of California