120 research outputs found

    Aggressive and reliable high-performance architectures - techniques for thermal control, energy efficiency, and performance augmentation

    Get PDF
    As more and more transistors fit on a single chip, consumers of the electronics industry continue to expect a decline in cost per function. Advancements in process technology offer steady improvements in system performance. The improvements manifest themselves as shrinking area, faster circuits and improved battery life. However, this migration toward sub-micron/nanometer technologies presents a new set of challenges, as the system becomes extremely sensitive to any voltage, temperature or process variations. One approach to immunize the system from the adverse effects of these variations is to add sufficient safety margins to the operating clock frequency of the system. Clearly, this approach is overly conservative because these worst-case scenarios rarely occur. Moreover, process technology in the nanoscale era has already hit the power and frequency walls. Regardless of these challenges, present processors not only need to run faster, but also to run cooler and use less energy. At a juncture where no further improvement in clock frequency is possible, data-dependent latching through Timing Speculation (TS) provides a silver lining. Timing speculation is a widely known method for realizing better-than-worst-case systems. TS is aggressive in nature: the mechanism dynamically tunes the system frequency beyond the worst-case limits obtained from application characteristics to enhance the performance of systems-on-chip (SoCs). However, such aggressive tuning has adverse consequences that need to be overcome. Power dissipation, on-chip temperature and reliability are key issues that cannot be ignored. A carefully designed power management technique combined with reliable, controlled, aggressive clocking not only attempts to constrain power dissipation within a limit, but also improves performance whenever possible.
    In this dissertation, we present a novel power level switching mechanism by redefining the existing voltage-frequency pairs. We introduce an aggressive yet reliable framework for energy-efficient thermal control. We were able to achieve up to 40% speed-up compared to a base scheme without overclocking. We compare our method against different schemes and observe that up to 75% savings in Energy-Delay squared product (ED2) relative to the base architecture are possible. We showcase the loss of efficiency in present chip multiprocessor systems due to excess power supplied, and propose Utilization-aware Task Scheduling (UTS), a power management scheme that increases the energy efficiency of chip multiprocessors. Our experiments demonstrate that UTS, along with aggressive timing speculation, squeezes maximum performance out of the system without loss of efficiency and without breaching power and thermal constraints. From our evaluation we infer that UTS improves performance by up to 12% due to aggressive power level switching, and achieves over 50% ED2 savings compared to traditional power management techniques.
    Aggressively clocked systems having TS as their central theme operate at a clock frequency range beyond specified safe limits, exploiting the data dependence of circuit critical paths. However, the margin for performance enhancement is restricted by the extreme difference between short paths and critical paths. In this dissertation, we show that increasing the lengths of the short paths of the circuit increases the margin of TS, leading to performance improvement in aggressively designed systems. We develop the Min-arc algorithm to efficiently add delay buffers to selected short paths while keeping the area penalty down. We show that by using our algorithm, it is possible to increase the circuit contamination delay by up to 30% without affecting the propagation delay, with moderate area overhead. We also explore the possibility of increasing short-path delays further by relaxing the constraint on the propagation delay, and achieve even higher performance. Overall, we bring out the inter-relationship between power, temperature and reliability of aggressively clocked systems. Our main objective is to achieve maximal performance benefits and improved energy efficiency within thermal constraints by effectively combining dynamic frequency scaling, dynamic voltage scaling and reliable overclocking. We provide solutions to improve the existing power management in chip multiprocessors to dynamically maximize system utilization and satisfy the power constraints within safe thermal limits.
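    As a rough, hypothetical illustration of the short-path padding idea (not the Min-arc algorithm itself), the sketch below greedily inserts delay buffers on paths whose delay falls below a target contamination delay, without ever exceeding the propagation delay; the path delays, buffer delay, and 30% target are assumed example values.

# A minimal, hypothetical sketch of short-path padding for timing speculation.
# This is NOT the dissertation's Min-arc algorithm: it is a simplified,
# per-path greedy illustration of the same idea, namely raising the
# contamination delay (shortest path) toward a target without exceeding the
# propagation delay (longest path). All delay values are assumed examples.

def pad_short_paths(path_delays_ps, buffer_delay_ps=20, target_increase=0.30):
    """Return (padded_delays, buffers_used) after greedy buffer insertion."""
    t_prop = max(path_delays_ps)        # propagation delay (must not grow)
    t_cont = min(path_delays_ps)        # original contamination delay
    t_target = min(t_cont * (1 + target_increase), t_prop)

    padded, buffers_used = [], 0
    for d in path_delays_ps:
        if d < t_target:
            # Add just enough buffers to reach the target, never past t_prop.
            n = int((t_target - d) // buffer_delay_ps) + 1
            padded.append(min(d + n * buffer_delay_ps, t_prop))
            buffers_used += n
        else:
            padded.append(d)
    return padded, buffers_used

delays = [120, 180, 340, 900, 950, 1000]   # example path delays in picoseconds
padded, n_buf = pad_short_paths(delays)
print("contamination delay:", min(delays), "->", min(padded), "ps with",
      n_buf, "buffers; propagation delay stays", max(padded), "ps")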

    Online self-repair of FIR filters

    Get PDF
    Chip-level failure detection has been a target of research for some time, but today's very deep-submicron technology is forcing such research to move beyond detection. Repair, especially self-repair, has become very important for containing the susceptibility of today's chips. This article introduces a self-repair solution for the digital FIR filter, one of the key blocks used in DSPs.
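    To make the target block concrete, here is a minimal direct-form FIR filter sketch; it only illustrates the kind of DSP block the article protects and does not reproduce the article's self-repair mechanism. The coefficients and input samples are arbitrary examples.

# Direct-form FIR filter: y[n] = sum_k coeffs[k] * x[n-k].
# Illustration only; the article's self-repair logic is not shown here.

def fir_filter(coeffs, samples):
    taps = [0.0] * len(coeffs)            # delay line, newest sample first
    out = []
    for x in samples:
        taps = [x] + taps[:-1]            # shift in the new sample
        out.append(sum(c * t for c, t in zip(coeffs, taps)))
    return out

print(fir_filter([0.25, 0.5, 0.25], [1, 2, 3, 4]))   # simple smoothing example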

    Virtualisation of FPGA-Resources for Concurrent User Designs Employing Partial Dynamic Reconfiguration

    Get PDF
    Reconfigurable hardware in a cloud environment is a power-efficient way to increase the processing power of future data centers beyond today's maximum. This work enhances an existing framework to support concurrent users on a virtualized reconfigurable FPGA resource. The FPGAs are used to provide a flexible, fast and very efficient platform for the user, who has access through a simple cloud-based interface. Fast partial reconfiguration is achieved through the ICAP combined with a PCIe connection and a combination of custom and TCL scripts to control the tool flow. This allows for the reconfiguration of a user space on an FPGA in a few milliseconds while providing a simple single-action interface to the user.
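    The sketch below is a purely illustrative model of sharing reconfigurable regions among concurrent users. The region names, bitstream file names, and the write_partial_bitstream() stub are hypothetical stand-ins; the actual framework drives the ICAP over PCIe with custom and TCL scripts, which is not reproduced here.

# Hypothetical sketch of virtualising FPGA partitions for concurrent users.
# Region names, bitstream paths, and write_partial_bitstream() are assumed;
# the real ICAP/PCIe transfer and TCL-driven tool flow are not shown.

class FpgaVirtualizer:
    def __init__(self, regions):
        self.free_regions = list(regions)      # reconfigurable partitions
        self.allocated = {}                    # user -> region

    def allocate(self, user):
        if user in self.allocated:
            return self.allocated[user]
        if not self.free_regions:
            raise RuntimeError("no free reconfigurable region")
        region = self.free_regions.pop(0)
        self.allocated[user] = region
        return region

    def load_design(self, user, partial_bitstream):
        region = self.allocate(user)
        # Stand-in for the ICAP/PCIe transfer that takes a few milliseconds.
        write_partial_bitstream(region, partial_bitstream)

    def release(self, user):
        region = self.allocated.pop(user)
        self.free_regions.append(region)

def write_partial_bitstream(region, path):
    print(f"reconfiguring {region} with {path}")   # placeholder only

mgr = FpgaVirtualizer(["slot_0", "slot_1"])
mgr.load_design("alice", "alice_accel_partial.bit")
mgr.load_design("bob", "bob_accel_partial.bit")
mgr.release("alice")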

    A Survey and Comparative Analysis of Security Properties of CAN Authentication Protocols

    Full text link
    The large number of Electronic Control Units (ECUs) mounted on modern cars and their expansive communication capabilities create a substantial attack surface for potential exploitation. Despite the evolution of automotive technology, the continued use of the originally insecure Controller Area Network (CAN) bus leaves in-vehicle communications inherently non-secure. In response to the absence of standardized authentication protocols within the automotive domain, researchers propose diverse solutions, each with unique strengths and vulnerabilities. However, the continuous influx of new protocols and potential oversights in meeting security requirements and essential operational features further complicate the implementability of these protocols. This paper comprehensively reviews and compares the 15 most prominent authentication protocols for the CAN bus. Our analysis emphasizes their strengths and weaknesses, evaluating their alignment with critical security requirements for automotive authentication. Additionally, we evaluate protocols based on essential operational criteria that contribute to ease of implementation in predefined infrastructures, enhancing overall reliability and reducing the probability of successful attacks. Our study reveals a prevalent focus on defending against external attackers in existing protocols, exposing vulnerabilities to internal threats. Notably, authentication protocols employing hash chains, Mixed Message Authentication Codes, and asymmetric encryption techniques emerge as the most effective approaches. Through our comparative study, we classify the considered protocols based on their security attributes and suitability for implementation, providing valuable insights for future developments in the field.
    Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
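    As a generic illustration of symmetric CAN authentication (not one of the 15 surveyed protocols), the sketch below appends a truncated MAC and uses a freshness counter; the pre-shared key, counter width, and 4-byte MAC truncation are assumed example parameters.

# Generic sketch of CAN frame authentication with a truncated HMAC and a
# freshness counter. Key, counter width, and truncation length are assumed;
# payload (4 bytes) plus tag (4 bytes) fits a classic 8-byte CAN frame.
import hmac, hashlib, struct

KEY = b"16-byte-demo-key"            # pre-shared key (example only)

def make_authenticated_frame(can_id, payload, counter):
    """Return payload || 4-byte truncated HMAC over (id, counter, payload)."""
    msg = struct.pack(">IH", can_id, counter) + payload
    tag = hmac.new(KEY, msg, hashlib.sha256).digest()[:4]
    return payload + tag, counter + 1

def verify_frame(can_id, data, counter):
    payload, tag = data[:-4], data[-4:]
    msg = struct.pack(">IH", can_id, counter) + payload
    expected = hmac.new(KEY, msg, hashlib.sha256).digest()[:4]
    return hmac.compare_digest(tag, expected)

frame, next_ctr = make_authenticated_frame(0x123, b"\x01\x02\x03\x04", counter=7)
print(verify_frame(0x123, frame, counter=7))   # True when key and counter match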

    Canadian Hydrogen Intensity Mapping Experiment (CHIME) Pathfinder

    Full text link
    A pathfinder version of CHIME (the Canadian Hydrogen Intensity Mapping Experiment) is currently being commissioned at the Dominion Radio Astrophysical Observatory (DRAO) in Penticton, BC. The instrument is a hybrid cylindrical interferometer designed to measure the large-scale neutral hydrogen power spectrum across the redshift range 0.8 to 2.5. The power spectrum will be used to measure the baryon acoustic oscillation (BAO) scale across this poorly probed redshift range where dark energy becomes a significant contributor to the evolution of the Universe. The instrument revives the cylinder design in radio astronomy with a wide-field survey as a primary goal. Modern low-noise amplifiers and digital processing remove the necessity for the analog beamforming that characterized previous designs. The Pathfinder consists of two cylinders 37 m long by 20 m wide oriented north-south for a total collecting area of 1,500 square meters. The cylinders are stationary with no moving parts, and form a transit instrument with an instantaneous field of view of ~100 degrees by 1-2 degrees. Each CHIME Pathfinder cylinder has a feedline with 64 dual-polarization feeds placed every ~30 cm, which Nyquist sample the north-south sky over much of the frequency band. The signals from each dual-polarization feed are independently amplified, filtered to 400-800 MHz, and directly sampled at 800 MSps using 8 bits. The correlator is an FX design, where the Fourier transform channelization is performed in FPGAs, which are interfaced to a set of GPUs that compute the correlation matrix. The CHIME Pathfinder is a 1/10th-scale prototype version of CHIME and is designed to detect the BAO feature and constrain the distance-redshift relation.
    Comment: 20 pages, 12 figures. Submitted to Proc. SPIE, Astronomical Telescopes + Instrumentation (2014).
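    The following toy sketch illustrates the FX correlator structure described above: an "F" stage that channelizes each feed's time stream with an FFT (done in FPGAs on the Pathfinder) and an "X" stage that forms the per-channel correlation matrix (done on GPUs). The feed count, channel count, and random test data are assumed here.

# Toy FX-correlator sketch: FFT channelization ("F") followed by per-channel
# cross-multiplication into a correlation matrix ("X"). Test data is random.
import numpy as np

n_feeds, n_samples, n_chan = 4, 4096, 256
rng = np.random.default_rng(0)
timestream = rng.standard_normal((n_feeds, n_samples))

# F stage: split each feed into blocks and FFT -> (feeds, blocks, channels)
blocks = timestream.reshape(n_feeds, -1, n_chan)
spectra = np.fft.rfft(blocks, axis=-1)

# X stage: vis[c, i, j] = < S_i(c) * conj(S_j(c)) > averaged over blocks
vis = np.einsum("ibc,jbc->cij", spectra, spectra.conj()) / blocks.shape[1]
print(vis.shape)   # (n_chan // 2 + 1, n_feeds, n_feeds)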

    Evaluation of the Efficiency of an ARM-based Beowulf Cluster versus Traditional Desktop Computing for High Performance Computing

    Get PDF
    In the realm of scientific computing, it has become increasingly important to focus on results-driven growth (Kamil, Shalf and Storhmaier). Doing so enables researchers to continue building on the rapidly expanding body of scientific discovery. However, with this growth comes the cost of the resources consumed to accomplish these results. Large supercomputers consume power at a rate roughly fourteen thousand times that of a traditional American household (U.S. Energy Information Administration). In parallel, public consumers have been driving the mobile industry and the research behind it. The need for ever-faster mobile devices that can last all day on a single battery charge has driven the development of the Advanced RISC Machine (ARM) processors. These processors are built to perform efficiently while still maintaining the ability to perform the necessary calculations. This study combined these two parallel realms and analyzed the overall efficiency and energy consumption of multiple ARM processors compared to a traditional desktop computer. The results showed that the ARM processors were less efficient by roughly a factor of two when compared to the slowest possible trial on the desktop. Several variables played a significant role in these results, including the limitation on network speed and bandwidth, idle energy consumption, and individual power regulators.
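    As a sketch of the kind of efficiency comparison performed in the study, the snippet below computes energy-to-solution and performance-per-watt from runtime and average power; all numbers are made-up placeholders, not the study's measurements.

# Energy-to-solution and performance-per-watt from runtime and average power.
# The platform names and figures below are placeholders, not measured data.

def efficiency(runtime_s, avg_power_w, work_units):
    energy_j = runtime_s * avg_power_w                    # energy to solution
    perf_per_watt = (work_units / runtime_s) / avg_power_w
    return energy_j, perf_per_watt

for name, runtime, power in [("ARM cluster", 1800.0, 25.0),
                             ("Desktop",     600.0,  110.0)]:
    e, ppw = efficiency(runtime, power, work_units=1.0e9)
    print(f"{name}: {e / 1000:.1f} kJ, {ppw:.0f} ops/s per watt")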

    Hubble Space Telescope: Optical telescope assembly handbook. Version 1.0

    Get PDF
    The Hubble Space Telescope is described, along with how its design affects the images produced at the Science Instruments. An overview of the hardware is presented. Details are given on the focal plane, the throughput of the telescope, and the point spread function (the image of an unresolved point source). Detailed simulations of the point spread function are available, which might be useful to observers in planning their observations and in reducing their data.