22 research outputs found
HCM: Hardware-Aware Complexity Metric for Neural Network Architectures
Convolutional Neural Networks (CNNs) have become common in many fields
including computer vision, speech recognition, and natural language processing.
Although CNN hardware accelerators are already included as part of many SoC
architectures, the task of achieving high accuracy on resource-restricted
devices is still considered challenging, mainly due to the vast number of
design parameters that need to be balanced to achieve an efficient solution.
Quantization techniques, when applied to the network parameters, lead to a
reduction of power and area and may also change the ratio between communication
and computation. As a result, some algorithmic solutions may suffer from lack
of memory bandwidth or computational resources and fail to achieve the expected
performance due to hardware constraints. Thus, the system designer and the
micro-architect need to understand at early development stages the impact of
their high-level decisions (e.g., the architecture of the CNN and the number of
bits used to represent its parameters) on the final product (e.g., the expected
power saving, area, and accuracy). Unfortunately, existing tools fall short of
supporting such decisions.
This paper introduces a hardware-aware complexity metric that aims to assist
the system designer of neural network architectures throughout the entire
project lifetime (especially at its early stages) by predicting the impact of
architectural and micro-architectural decisions on the final product. We
demonstrate how the proposed metric can help evaluate different design
alternatives of neural network models on resource-restricted devices such as
real-time embedded systems, and help avoid design mistakes at early
stages.
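To illustrate the abstract's remark that quantization may change the ratio between communication and computation, here is a minimal sketch. It is not the HCM metric from the paper. The function `conv_layer_cost`, the layer dimensions, and the cost model (stride 1, same padding, every weight and activation moved exactly once) are all assumptions made for illustration.

```python
def conv_layer_cost(h, w, c_in, c_out, k, weight_bits, act_bits):
    """Rough MAC count and memory traffic for one k x k convolution layer.

    Hypothetical, idealized model: stride 1, 'same' padding, and every
    weight, input activation and output activation is moved exactly once.
    """
    macs = h * w * c_in * c_out * k * k
    weight_bytes = c_in * c_out * k * k * weight_bits / 8
    act_bytes = (h * w * c_in + h * w * c_out) * act_bits / 8
    total_bytes = weight_bytes + act_bytes
    # MACs per byte moved: a simple computation-to-communication ratio.
    return macs, total_bytes, macs / total_bytes


# Same (assumed) layer at 32-bit and 8-bit precision.
layer = dict(h=56, w=56, c_in=64, c_out=64, k=3)
macs32, bytes32, ratio32 = conv_layer_cost(**layer, weight_bits=32, act_bits=32)
macs8, bytes8, ratio8 = conv_layer_cost(**layer, weight_bits=8, act_bits=8)

print(f"32-bit: {macs32} MACs, {bytes32:.0f} bytes, {ratio32:.1f} MACs/byte")
print(f" 8-bit: {macs8} MACs, {bytes8:.0f} bytes, {ratio8:.1f} MACs/byte")
```

Under these assumptions the MAC count is unchanged while the bytes moved shrink with the bit-width, so the MACs-per-byte ratio rises. A real accelerator would also see the cost of each MAC change with precision, which this sketch deliberately ignores.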
Using Value Prediction to Increase the Power of Speculative Execution Hardware
This paper presents an experimental and analytical study of value prediction and its impact on speculative execution in superscalar microprocessors. Value prediction is a new paradigm that suggests predicting the outcome values of operations at run-time and using these predicted values to trigger the speculative execution of true-data dependent operations. As a result, stalls to memory locations can be reduced and the amount of instruction-level parallelism can be extended beyond the limits of the program’s dataflow graph. This paper examines the characteristics of the value prediction concept from two perspectives: (1) the related phenomena that are reflected in the nature of computer programs, and (2) the significance of these phenomena to boosting the instruction-level parallelism of superscalar microprocessors that support speculative execution. In order to better understand these characteristics, our work combines both analytical and experimental studies.
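The idea of predicting an operation's outcome value at run-time can be sketched with a simple last-value predictor. The abstract does not say which prediction scheme the paper uses, so the class `LastValuePredictor`, its table indexed by instruction address, and the toy trace below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LastValuePredictor:
    """Hypothetical last-value predictor indexed by instruction address (PC)."""
    table: dict = field(default_factory=dict)
    correct: int = 0
    total: int = 0

    def predict(self, pc):
        # Return the last value produced at this PC, or None if never seen.
        return self.table.get(pc)

    def update(self, pc, actual):
        # Score the prediction against the real outcome, then remember it.
        predicted = self.predict(pc)
        self.total += 1
        if predicted is not None and predicted == actual:
            self.correct += 1
        self.table[pc] = actual

    def accuracy(self):
        return self.correct / self.total if self.total else 0.0


def run_trace(trace):
    """Feed (pc, value) pairs through the predictor and return its accuracy."""
    predictor = LastValuePredictor()
    for pc, value in trace:
        predictor.update(pc, value)
    return predictor.accuracy()


# Toy trace: PC 0x10 always yields 7 (predictable); PC 0x20 counts up (not).
trace = [(0x10, 7), (0x20, 0), (0x10, 7), (0x20, 1), (0x10, 7), (0x20, 2)]
accuracy = run_trace(trace)
print(f"prediction accuracy: {accuracy:.2f}")
```

In a speculative machine, a correct prediction would let dependent operations start before the producing operation finishes, while a misprediction would require recovery. The sketch only measures how often the prediction would have been right.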
Electromigration-Aware Memory Hierarchy Architecture
New mission-critical applications, such as autonomous vehicles and life-support systems, set a high bar for the reliability of modern microprocessors that operate in highly challenging conditions. However, while cutting-edge integrated circuit (IC) technologies have advanced microprocessors by providing remarkable reductions in silicon area and power consumption, they also introduce new reliability challenges through the complex design rules they impose, creating a significant hurdle in the design process. In this paper, we focus on electromigration (EM), which is a crucial factor impacting IC reliability. EM refers to the degradation process of IC metal nets when used for both power supply and interconnecting signals. Typically, EM concerns have been addressed at the backend, circuit, and layout levels, where EM rules are enforced assuming extreme conditions to identify and resolve violations. This study presents new techniques that leverage architectural features to mitigate the effect of EM on the memory hierarchy of modern microprocessors. Architectural approaches can reduce the complexity of solving EM-related violations, and they can also complement and enhance common existing methods. In this study, we present a comprehensive simulation analysis that demonstrates how the proposed solution can significantly extend the lifetime of a microprocessor’s memory hierarchy with minimal overhead in terms of performance, power, and area while relaxing EM design efforts.
Electromigration-Aware Architecture for Modern Microprocessors
Reliability is a fundamental requirement in microprocessors that guarantees correct execution over their lifetimes. The reliability-related design rules depend on the process technology and device operating conditions. To meet reliability requirements, advanced process nodes impose challenging design rules, which place a major burden on the VLSI implementation flow because they impose severe physical constraints. This paper focuses on electromigration (EM), one of the critical factors affecting semiconductor reliability. EM is the aging process of on-die wires in integrated circuits (ICs). Traditionally, EM issues have been handled at the physical design level, which enforces reliability rules using worst-case scenario analysis to detect and solve violations. In this paper, we offer solutions that exploit architectural characteristics to reduce EM impact. The use of architectural methods can simplify EM solutions, and such methods can be incorporated with standard physical-design-based solutions to enhance current methods. Our comprehensive physical simulation results show that, with minimal area, power, and performance overhead, the proposed solution can relax EM design efforts and significantly extend a microprocessor’s lifetime.
The Effect of Instruction Fetch Bandwidth on Value Prediction
Value prediction attempts to eliminate true-data dependencies by dynamically predicting the outcome values of instructions and executing true-data dependent instructions based on that prediction. In this paper we attempt to understand the limitations of using this paradigm in realistic machines. We show that the instruction-fetch bandwidth and the issue rate have a very significant impact on the efficiency of value prediction. In addition, we study how recent techniques to improve the instruction-fetch rate affect the efficiency of value prediction and its hardware organization.
1. Introduction
The fast-growing density of gates on a silicon die allows modern microprocessors to increasingly employ multiple execution units that are capable of executing several instructions in parallel. Most of the recent microprocessor architectures assume sequential programs as an input and a parallel execution model, where the hardware is expected to extract the parallelism at run-time out of the ins..