11 research outputs found

    Interfacing synchronous and asynchronous modules within a high-speed pipeline

    This paper describes a new technique for integrating asynchronous modules within a high-speed synchronous pipeline. Our design eliminates potential metastability problems by using a clock generated by a stoppable ring oscillator, which is capable of driving the large clock load found in present-day microprocessors. Using the ATACS design tool, we designed highly optimized transistor-level circuits to control the ring oscillator and generate the clock and handshake signals with minimal overhead. Our interface architecture requires no redesign of the synchronous circuitry. Incorporating asynchronous modules in a high-speed pipeline improves performance by exploiting data-dependent delay variations. Since the speed of the synchronous circuitry tracks the speed of the ring oscillator under different processes, temperatures, and voltages, the entire chip operates at the speed dictated by the current operating conditions, rather than being governed by the worst-case conditions. These two factors together can lead to a significant improvement in average-case performance. The interface design is simulated using the 0.6-μm HP CMOS14B process in HSPICE.
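
    The average-case benefit described above comes from stalling the clock only while the asynchronous module is still busy. The following is a minimal behavioural sketch of that idea in Python, not the paper's transistor-level ATACS-generated circuit; the delay distribution, clock period, and function names are invented purely for illustration.

        # Toy model of a stoppable-clock interface: the next clock edge is delayed
        # whenever the asynchronous module's completion handshake has not yet arrived,
        # so the synchronous pipeline never samples an unfinished result.
        import random

        CLOCK_PERIOD = 1.0                       # nominal ring-oscillator period (arbitrary units)

        def async_stage_delay():
            """Data-dependent delay of the asynchronous module (illustrative only)."""
            return random.uniform(0.3, 1.8) * CLOCK_PERIOD

        def run_pipeline(n_items):
            t = 0.0
            for _ in range(n_items):
                done_at = t + async_stage_delay()     # asynchronous stage starts computing at t
                next_edge = t + CLOCK_PERIOD          # edge a free-running oscillator would produce
                t = max(next_edge, done_at)           # clock is stalled until the handshake completes
            return t

        if __name__ == "__main__":
            random.seed(0)
            elapsed = run_pipeline(10_000)
            worst_case = 10_000 * 1.8 * CLOCK_PERIOD  # fully synchronous design clocked for the worst case
            print(f"stoppable clock: {elapsed:.0f}   worst-case synchronous: {worst_case:.0f}")

    Under these made-up delays, the stoppable-clock total stays well below the worst-case synchronous figure, which is the average-case advantage the abstract refers to.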

    Using MCD-DVS for dynamic thermal management performance improvement

    With chip temperature being a major hurdle in microprocessor design, techniques to recover the performance lost to thermal-emergency mechanisms are crucial in order to sustain performance growth. Many techniques for power reduction in the past, and some on thermal management more recently, have contributed to alleviating this problem. Probably the most important thermal-control technique is dynamic voltage and frequency scaling (DVS), which allows for an almost cubic reduction in power with a worst-case performance penalty that is only linear. So far, DVS techniques for temperature control have been studied at the chip level. Finer-grain DVS is feasible if a globally-asynchronous locally-synchronous (GALS) design style is employed. GALS, also known as multiple-clock domain (MCD), allows independent voltage and frequency control for each of the clock domains that make up the chip. There are several studies on DVS for GALS that aim to improve energy and power efficiency, but not temperature. This paper proposes and analyses the use of DVS at the domain level to control temperature in a clustered MCD microarchitecture, with the goal of improving the performance of applications that do not meet the thermal constraints imposed by the designers.
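
    The "almost cubic" power reduction with only a linear performance penalty follows from the first-order dynamic-power model when supply voltage is scaled alongside frequency. A minimal sketch of that reasoning, assuming the activity factor and switched capacitance stay constant, is:

        % Dynamic CMOS power, with frequency scaled in proportion to supply voltage.
        % Scaling f (and V_dd with it) by a factor s therefore cuts dynamic power by
        % roughly s^3, while execution time grows only by 1/s.
        P_{dyn} = \alpha\, C\, V_{dd}^{2}\, f, \qquad f \propto V_{dd}
        \;\Longrightarrow\; P_{dyn} \propto f^{3}, \qquad T_{exec} \propto \frac{1}{f}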

    Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

    Technological evolution has significantly increased the number of transistors for a given die area and raised switching speeds from a few MHz to the GHz range. This decline in feature size and boost in performance consequently demand a shrinking supply voltage and effective control of power dissipation in chips with millions of transistors. This has triggered a substantial amount of research into power-reduction techniques for almost every aspect of the chip, and particularly for the processor cores it contains. This paper presents an overview of techniques for achieving power efficiency, mainly at the processor-core level, but also visits related domains such as buses and memories. Various processor parameters and features, such as supply voltage, clock frequency, caches and pipelining, can be optimized to reduce the power consumption of the processor, and this paper discusses the ways in which they can be optimized. Emerging power-efficient processor architectures are also overviewed, and research activities are discussed that should help the reader identify how these factors in a processor contribute to power consumption. Some of these concepts are already established, whereas others are still active research areas. © 2009 ACADEMY PUBLISHER
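
    The knobs surveyed here (supply voltage, clock frequency, caches, pipelining) all act on the standard first-order decomposition of CMOS power. Purely as the textbook reference model, rather than anything specific to this review:

        % First-order CMOS power model: switching, short-circuit and leakage terms.
        % \alpha is the activity factor, C the switched capacitance, f the clock frequency.
        P_{total} \;=\; \underbrace{\alpha\, C\, V_{dd}^{2}\, f}_{\text{dynamic switching}}
        \;+\; \underbrace{V_{dd}\, I_{sc}}_{\text{short-circuit}}
        \;+\; \underbrace{V_{dd}\, I_{leak}}_{\text{static leakage}}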

    Power efficient resource scaling in partitioned architectures through dynamic heterogeneity

    The ever-increasing demand for high clock speeds and the desire to exploit abundant transistor budgets have resulted in alarming increases in processor power dissipation. Partitioned (or clustered) architectures have been proposed in recent years to address scalability concerns in future billion-transistor microprocessors. Our analysis shows that increasing processor resources in a clustered architecture results in a linear increase in power consumption, while providing diminishing improvements in single-thread performance. To preserve high performance-to-power ratios, we claim that the power consumption of additional resources should be in proportion to the performance improvements they yield. Hence, in this paper, we propose the implementation of heterogeneous clusters that have varying delay and power characteristics. A cluster's performance and power characteristics are tuned by scaling its frequency, and novel policies dynamically assign frequencies to clusters while attempting either to meet a fixed power budget or to minimize a metric such as Energy×Delay² (ED²). By increasing resources in a power-efficient manner, we observe an 11% improvement in ED² and a 22.4% average reduction in peak temperature, when compared to a processor with homogeneous units. Our proposed processor model also provides strategies to handle thermal emergencies that have a relatively low impact on performance.
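
    To make the flavour of such a policy concrete, the sketch below is a toy greedy frequency-assignment loop in Python that lowers the frequency of whichever cluster costs the least ED² until an assumed chip-wide power budget is met. The cluster list, power model, and sensitivity model are all invented for illustration; they are not the paper's actual policies or numbers.

        # Toy dynamic frequency assignment for heterogeneous clusters under a power budget.
        from dataclasses import dataclass

        FREQ_STEPS = [1.0, 0.9, 0.8, 0.7]          # normalised frequency levels

        @dataclass
        class Cluster:
            name: str
            freq: float            # current normalised frequency
            base_power: float      # power at full frequency (W), assumed value
            sensitivity: float     # fraction of performance lost per fraction of frequency lost

        def power(c: Cluster) -> float:
            # Assume near-cubic scaling of power with frequency (voltage scaled alongside).
            return c.base_power * c.freq ** 3

        def relative_delay(clusters) -> float:
            # Crude model: delay grows with each cluster's sensitivity-weighted slowdown.
            return 1.0 + sum(c.sensitivity * (1.0 - c.freq) for c in clusters)

        def ed2(clusters) -> float:
            d = relative_delay(clusters)
            e = sum(power(c) for c in clusters) * d   # energy = power x delay
            return e * d * d

        def enforce_budget(clusters, budget):
            while sum(power(c) for c in clusters) > budget:
                # Try one frequency step down on each cluster; keep the cheapest in ED^2 terms.
                best = None
                for c in clusters:
                    i = FREQ_STEPS.index(c.freq)
                    if i + 1 == len(FREQ_STEPS):
                        continue
                    old = c.freq
                    c.freq = FREQ_STEPS[i + 1]
                    cost = ed2(clusters)
                    c.freq = old
                    if best is None or cost < best[0]:
                        best = (cost, c, FREQ_STEPS[i + 1])
                if best is None:
                    break                              # every cluster already at the lowest step
                best[1].freq = best[2]
            return clusters

        if __name__ == "__main__":
            chip = [Cluster("int0", 1.0, 6.0, 0.5), Cluster("int1", 1.0, 6.0, 0.2),
                    Cluster("fp0", 1.0, 5.0, 0.1), Cluster("mem", 1.0, 8.0, 0.7)]
            enforce_budget(chip, budget=18.0)
            print({c.name: c.freq for c in chip}, f"ED2={ed2(chip):.2f}")

    The greedy step is the design choice of interest: clusters whose slowdown hurts the running program least give up frequency first, which is how heterogeneity is created on demand.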

    Energy-efficient processor design using multiple clock domains with dynamic voltage and frequency scaling

    As clock frequency increases and feature size decreases, clock distribution and wire delays present a growing challenge to the designers of singly-clocked, globally synchronous systems. We describe an alternative approach, which we call a Multiple Clock Domain (MCD) processor, in which the chip is divided into several (coarse-grained) clock domains, within which independent voltage and frequency scaling can be performed. Boundaries between domains are chosen to exploit existing queues, thereby minimizing inter-domain synchronization costs. We propose four clock domains, corresponding to the front end (including the L1 instruction cache), integer units, floating-point units, and load-store units (including the L1 data cache and L2 cache). We evaluate this design using a simulation infrastructure based on SimpleScalar and Wattch. In an attempt to quantify potential energy savings independent of any particular on-line control strategy, we use off-line analysis of traces from a single-speed run of each of our benchmark applications to identify profitable reconfiguration points for a subsequent dynamic scaling run. Dynamic runs incorporate a detailed model of inter-domain synchronization delays, with latencies for intra-domain scaling similar to the whole-chip scaling latencies of Intel XScale and Transmeta LongRun technologies. Using applications from the MediaBench, Olden, and SPEC2000 benchmark suites, we obtain an average energy-delay product improvement of 20% with MCD, compared to a modest 3% savings from voltage scaling a single-clock-and-voltage system.
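
    The off-line step can be pictured as a pass over a per-interval trace that picks, for each domain, the lowest frequency whose predicted slowdown stays within a tolerance, and records the points where that choice changes. The Python sketch below is an illustrative assumption of how such an analysis might look; the trace format, slack metric, and frequency table are invented and are not the paper's actual SimpleScalar/Wattch tooling.

        # Illustrative off-line trace analysis for choosing per-domain frequencies.
        FREQS = [1.0, 0.85, 0.7, 0.55]        # normalised frequency levels
        SLOWDOWN_TOLERANCE = 0.02             # accept at most 2% predicted slowdown per interval

        def choose_frequency(utilisation: float) -> float:
            """Pick the lowest frequency whose predicted slowdown is within tolerance.

            A domain at frequency f absorbs its work without stalling the rest of the
            chip roughly when utilisation / f <= 1; any excess counts as slowdown."""
            for f in reversed(FREQS):                     # try the slowest settings first
                predicted_slowdown = max(0.0, utilisation / f - 1.0)
                if predicted_slowdown <= SLOWDOWN_TOLERANCE:
                    return f
            return FREQS[0]

        def reconfiguration_points(trace):
            """trace: list of dicts {domain: utilisation in [0, 1]}, one per fixed interval.

            Returns (interval_index, domain, new_frequency) tuples wherever the chosen
            frequency changes, i.e. the points a dynamic-scaling run would switch at."""
            points, current = [], {}
            for i, interval in enumerate(trace):
                for domain, util in interval.items():
                    f = choose_frequency(util)
                    if current.get(domain) != f:
                        points.append((i, domain, f))
                        current[domain] = f
            return points

        if __name__ == "__main__":
            trace = [{"fp": 0.1, "int": 0.9}, {"fp": 0.1, "int": 0.9}, {"fp": 0.8, "int": 0.6}]
            for p in reconfiguration_points(trace):
                print(p)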

    Profile-based Dynamic Voltage and Frequency Scaling for a Multiple Clock Domain Microprocessor

    A Multiple Clock Domain (MCD) processor addresses the challenges of clock distribution and power dissipation by dividing a chip into several (coarse-grained) clock domains, allowing frequency and voltage to be reduced in domains that are not currently on the application’s critical path. Given a reconfiguration mechanism capable of choosing appropriate times and values for voltage/frequency scaling, an MCD processor has the potential to achieve significant energy savings with low performance degradation. Early work on MCD processors evaluated the potential for energy savings by manually inserting reconfiguration instructions into applications, or by employing an oracle driven by off-line analysis of (identical) prior program runs. Subsequent work developed a hardware-based on-line mechanism that averages 75–85% of the energy-delay improvement achieved via off-line analysis. In this paper we consider the automatic insertion of reconfiguration instructions into applications, using profile-driven binary rewriting. Profile-based reconfiguration introduces the need for “training runs” prior to production use of a given application, but avoids the hardware complexity of on-line reconfiguration. It also has the potential to yield significantly greater energy savings. Experimental results (training on small data sets and then running on larger, alternative data sets) indicate that the profile-driven approach is more stable than hardware-based reconfiguration, and yields virtually all of the energy-delay improvement achieved via off-line analysis.

    Dynamic Frequency and Voltage Control for a Multiple Clock Domain Microarchitecture

    We describe the design, analysis, and performance of an on-line algorithm to dynamically control the frequency and voltage of a Multiple Clock Domain (MCD) microarchitecture. The MCD microarchitecture allows the frequency and voltage of microprocessor regions to be adjusted independently and dynamically, allowing energy savings when the frequency of some regions can be reduced without significantly impacting performance. Our algorithm achieves on average a 19.0% reduction in Energy Per Instruction (EPI), a 3.2% increase in Cycles Per Instruction (CPI), a 16.7% improvement in Energy-Delay Product, and a Power Savings to Performance Degradation ratio of 4.6. Traditional frequency/voltage scaling techniques, which apply reductions globally to a fully synchronous processor, achieve a Power Savings to Performance Degradation ratio of only 2-3. Our Energy-Delay Product improvement is 85.5% of what has been achieved using an off-line algorithm. These results were achieved using a broad range of applications from the MediaBench, Olden, and SPEC2000 benchmark suites, with an algorithm we show to require minimal hardware resources.
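
    For reference, the metrics quoted above can be read against their usual definitions; the form of the Power Savings to Performance Degradation ratio given here is an assumption consistent with how the numbers are reported, not a definition taken from the paper:

        % Common metric definitions for DVFS evaluations.
        \mathrm{EPI} = \frac{E_{total}}{N_{instr}}, \qquad
        \mathrm{CPI} = \frac{N_{cycles}}{N_{instr}}, \qquad
        \mathrm{EDP} = E_{total} \times T_{exec}, \qquad
        \mathrm{PS/PD} = \frac{\%\ \text{power saved}}{\%\ \text{performance lost}}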

    Alarm Forecasting in Natural Gas Pipelines

    This thesis examines alarm-forecasting methods for a natural gas production pipeline to assure the efficient transportation of high-quality natural gas. Natural gas production companies use pipelines to transport natural gas from the extraction well to a distribution point. Forecasting natural gas pipeline pressure alarms helps control-room operators maintain a functioning pipeline and avoid costly downtime. As gas enters the pipeline and travels to the distribution point, it is expected to meet certain specifications set in place by either state law or the customer receiving the gas. If the gas meets these standards and is accepted at the distribution point, the pipeline is referred to as being in a steady state. If the gas does not meet these standards, the production company runs the risk of being shut-in, or being unable to flow any more gas through the distribution point until the poor-quality gas is removed. Sensors are used to collect real-time gas-quality information from within the pipe, and alarms are used to alert the control operators when a threshold is exceeded. If operators fail to keep the pipeline’s gas quality within an acceptable range, the company risks being shut-in or rupturing the pipeline. Predicting gas-quality alarms enables operators to act earlier to avoid being shut-in and is a form of predictive maintenance. We forecast alarms by using a 10th-order autoregressive model, an autoregressive model with an exogenous variable, simple exponential smoothing with drift (the Theta method), and an artificial neural network, each combined with alarm thresholds. The alarm thresholds are defined by the production company and are occasionally adjusted to meet current environmental conditions. The results of the alarm-forecasting methods show that we accurately forecast natural gas pipeline alarms up to a 30-minute time horizon. This translates into sensitivity rates that drop from around 100% at one minute to 82.7% at a 30-minute forecast horizon; that is, at 30 minutes we correctly forecast 82.7% of the alarms. All alarm-forecasting models outperform the state-of-the-art forecaster used by the production company, with the artificial neural network performing the best.
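
    A minimal sketch of threshold-based alarm forecasting with a 10th-order autoregressive model, in the spirit of the simplest method above, is shown below in Python. The synthetic sensor signal, the threshold value, and the fitting code are illustrative assumptions only; the thesis additionally evaluates an ARX model, exponential smoothing with drift, and a neural network against company-defined thresholds.

        # Fit an AR(10) model by least squares, roll it forward 30 steps, and flag any
        # forecast value that exceeds a hypothetical gas-quality alarm threshold.
        import numpy as np

        ORDER = 10            # AR(10), matching the model order used in the thesis
        HORIZON = 30          # forecast 30 steps (e.g. minutes) ahead
        THRESHOLD = 1.5       # hypothetical alarm threshold

        def fit_ar(series, order=ORDER):
            """Estimate AR coefficients by ordinary least squares on lagged values."""
            X = np.column_stack([series[i:len(series) - order + i] for i in range(order)])
            y = series[order:]
            coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
            return coeffs

        def forecast(series, coeffs, steps=HORIZON):
            """Iteratively roll the fitted AR model forward 'steps' values."""
            history = list(series[-len(coeffs):])
            out = []
            for _ in range(steps):
                nxt = float(np.dot(coeffs, history[-len(coeffs):]))
                out.append(nxt)
                history.append(nxt)
            return np.array(out)

        if __name__ == "__main__":
            rng = np.random.default_rng(1)
            t = np.arange(600)
            # Synthetic sensor signal drifting slowly upward with noise (illustrative only).
            signal = 0.002 * t + 0.3 * np.sin(t / 15.0) + 0.05 * rng.standard_normal(600)
            future = forecast(signal, fit_ar(signal))
            alarms = np.nonzero(future > THRESHOLD)[0]
            if alarms.size:
                print(f"alarm forecast {alarms[0] + 1} minutes ahead")
            else:
                print("no alarm forecast within the horizon")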

    Design methodology and productivity improvement in high speed VLSI circuits

    2017 Spring. Includes bibliographical references. To view the abstract, please see the full text of the document.

    Interfacing Synchronous and Asynchronous Modules Within a High-Speed Pipeline

    This paper describes a new technique for integrating asynchronous modules within a high-speed synchronous pipeline. Our design eliminates potential metastability problems by using a clock generated by a stoppable ring oscillator, which is capable of driving the large clock load found in present-day microprocessors. Using the ATACS design tool, we designed highly optimized transistor-level circuits to control the ring oscillator and generate the clock and handshake signals with minimal overhead. Our interface architecture requires no redesign of the synchronous circuitry. Incorporating asynchronous modules in a high-speed pipeline improves performance by exploiting data-dependent delay variations. Since the speed of the synchronous circuitry tracks the speed of the ring oscillator under different processes, temperatures, and voltages, the entire chip operates at the speed dictated by the current operating conditions, rather than being governed by the worst-case conditions. These two factors together can lead to a significant improvement in average-case performance.