
    Control speculation for energy-efficient next-generation superscalar processors

    Conventional front-end designs attempt to maximize the number of "in-flight" instructions in the pipeline. However, branch mispredictions cause the processor to fetch useless instructions that are eventually squashed, increasing front-end energy and issue queue utilization and thus wasting around 30 percent of the power dissipated by a processor. Furthermore, processor design trends lead to increasing clock frequencies by lengthening the pipeline, which puts more pressure on the branch prediction engine since branches take longer to be resolved. As next-generation high-performance processors become deeply pipelined, the amount of energy wasted on misspeculated instructions will go up. The aim of this work is to reduce the energy consumption of misspeculated instructions. We propose selective throttling, which triggers different power-aware techniques (fetch throttling, decode throttling, or disabling the selection logic) depending on the branch prediction confidence level. Results show that combining fetch-bandwidth reduction with select-logic disabling provides the best performance in terms of overall energy reduction and energy-delay product improvement (14 percent and 10 percent, respectively, for a processor with a 22-stage pipeline and 16 percent and 13 percent, respectively, for a processor with a 42-stage pipeline).
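The reported energy and energy-delay product (EDP) figures can be related with a little arithmetic. The sketch below assumes the usual definition EDP = energy × delay and that both percentages are measured against the same baseline; the function name `implied_delay_ratio` is hypothetical, and the resulting slowdown is an inference from the quoted numbers, not a figure stated in the abstract.

```python
def implied_delay_ratio(energy_reduction: float, edp_improvement: float) -> float:
    """Return new_delay / old_delay implied by the two reported fractions.

    Assumes EDP = energy * delay, so
    (1 - edp_improvement) = (1 - energy_reduction) * delay_ratio.
    """
    return (1.0 - edp_improvement) / (1.0 - energy_reduction)


# Figures quoted in the abstract: (pipeline stages, energy reduction, EDP improvement)
reported = [(22, 0.14, 0.10), (42, 0.16, 0.13)]

for stages, energy, edp in reported:
    ratio = implied_delay_ratio(energy, edp)
    print(f"{stages}-stage pipeline: implied delay ratio {ratio:.3f}")
```

Under these assumptions, an EDP improvement smaller than the energy reduction means the throttled configurations run slightly longer than the baseline (a delay ratio a few percent above 1).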

    Software trace cache

    We explore compiler optimizations that optimize the layout of instructions in memory. The target is to enable the code to make better use of the underlying hardware resources, regardless of the specific details of the processor or architecture, in order to increase fetch performance. The Software Trace Cache (STC) is a code layout algorithm with a broader target than previous layout optimizations. We target not only an improvement in the instruction cache hit rate, but also an increase in the effective fetch width of the fetch engine. The STC algorithm organizes basic blocks into chains, trying to make sequentially executed basic blocks reside in consecutive memory positions, then maps the basic block chains in memory to minimize conflict misses in the important sections of the program. We evaluate and analyze in detail the impact of the STC, and code layout optimizations in general, on the three main aspects of fetch performance: the instruction cache hit rate, the effective fetch width, and the branch prediction accuracy. Our results show that layout-optimized codes have some special characteristics that make them more amenable to high-performance instruction fetch. They have a very high rate of not-taken branches and execute long chains of sequential instructions; also, they make very effective use of instruction cache lines, mapping only useful instructions which will execute close in time, increasing both spatial and temporal locality.
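The chain-building idea described above can be illustrated with a simplified greedy sketch. This is not the STC algorithm itself: the profile format, the seed ordering, and the function name `build_chains` are all assumptions made for illustration, and the later step of mapping chains to memory to avoid conflict misses is omitted.

```python
def build_chains(edges, seeds):
    """Greedily group basic blocks into chains of likely-sequential blocks.

    edges: dict mapping block -> {successor: execution count} (assumed profile data)
    seeds: blocks to start chains from, in priority order (assumed input)
    """
    placed = set()
    chains = []
    for seed in seeds:
        if seed in placed:
            continue
        chain = []
        block = seed
        while block is not None and block not in placed:
            chain.append(block)
            placed.add(block)
            # Follow the most frequently taken edge to a block not yet placed.
            candidates = [
                (count, succ)
                for succ, count in edges.get(block, {}).items()
                if succ not in placed
            ]
            block = max(candidates)[1] if candidates else None
        chains.append(chain)
    return chains


# Hypothetical profile: A usually falls into B, which always reaches D.
profile = {"A": {"B": 90, "C": 10}, "B": {"D": 90}, "C": {"D": 10}, "D": {}}
chains = build_chains(profile, seeds=["A", "C"])
layout = [block for chain in chains for block in chain]
print(chains, layout)
```

Laying out each chain contiguously turns the hot path A, B, D into fall-through code, which is consistent with the high rate of not-taken branches the abstract reports for layout-optimized codes.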

    A hybrid neuro-wavelet predictor for QoS control and stability

    For distributed systems to properly react to peaks of requests, their adaptation activities would benefit from an estimate of the number of requests. This paper proposes a solution to produce a short-term forecast based on data characterising user behaviour of online services. We use wavelet analysis, providing compression and denoising on the observed time series of the number of past user requests, and a recurrent neural network trained with observed data and designed so as to provide well-timed estimations of future requests. This ensemble is able to predict the number of future user requests with a root mean squared error below 0.06%. Thanks to prediction, advance resource provision can be performed for the duration of a request peak and for just the right amount of resources, hence avoiding over-provisioning and associated costs. Moreover, reliable provision lets users enjoy a level of availability of services unaffected by load variations.

    Magnetic field experiment for Voyagers 1 and 2

    The magnetic field experiment to be carried on the Voyager 1 and 2 missions consists of dual low field (LFM) and high field magnetometer (HFM) systems. The dual systems provide greater reliability and, in the case of the LFMs, permit the separation of spacecraft magnetic fields from the ambient fields. Additional reliability is achieved through electronics redundancy. The wide dynamic ranges of plus or minus 0.5G for the LFMs and plus or minus 20G for the HFMs, low quantization uncertainty of plus or minus 0.002 gamma in the most sensitive (plus or minus 8 gamma) LFM range, low sensor RMS noise level of 0.006 gamma, and use of data compaction schemes to optimize the experiment information rate all combine to permit the study of a broad spectrum of phenomena during the mission. Planetary fields at Jupiter, Saturn, and possibly Uranus; satellites of these planets; solar wind and satellite interactions with the planetary fields; and the large-scale structure and microscale characteristics of the interplanetary magnetic field are studied. The interstellar field may also be measured.

    A marker of biological ageing predicts adult risk preference in European starlings, Sturnus vulgaris

    Why are some individuals more prone to gamble than others? Animals often show preferences between two foraging options with the same mean reward but different degrees of variability in the reward, and such risk preferences vary between individuals. Previous attempts to explain variation in risk preference have focused on energy budgets, but with limited empirical support. Here, we consider whether biological ageing, which affects mortality and residual reproductive value, predicts risk preference. We studied a cohort of European starlings (Sturnus vulgaris) in which we had previously measured developmental erythrocyte telomere attrition, an established integrative biomarker of biological ageing. We measured the adult birds’ preferences when choosing between a fixed amount of food and a variable amount with an equal mean. After controlling for change in body weight during the experiment (a proxy for energy budget), we found that birds that had undergone greater developmental telomere attrition were more risk averse as adults than were those whose telomeres had shortened less as nestlings. Developmental telomere attrition was a better predictor of adult risk preference than either juvenile telomere length or early-life food supply and begging effort. Our longitudinal study thus demonstrates that biological ageing, as measured via developmental telomere attrition, is an important source of lasting differences in adult risk preferences.

    Asynchronous Data Processing Platforms for Energy Efficiency, Performance, and Scalability

    The global technology revolution is changing the integrated circuit industry from one driven by performance to one driven by energy, scalability, and more balanced design goals. Free of clock-related issues, asynchronous circuits enable further design tradeoffs and in-operation adaptive adjustments for energy efficiency. This dissertation presents a design methodology for asynchronous circuits using NULL Convention Logic (NCL) and multi-threshold CMOS techniques for energy efficiency and throughput optimization in digital signal processing circuits. Parallel homogeneous and heterogeneous platforms implementing adaptive dynamic voltage scaling (DVS), based on the observation of system fullness and workload prediction, are developed for balanced control of performance and energy efficiency. Datapath control logic with NULL Cycle Reduction (NCR) and an arbitration network are incorporated in the heterogeneous platform for large-scale cascading. The platforms have been integrated with the data processing units using the IBM 130 nm 8RF process and fabricated using the MITLL 90 nm FDSOI process. Simulation and physical testing results show the energy efficiency advantage of asynchronous designs and the effectiveness of the adaptive DVS mechanism in balancing energy and performance in both platforms.

    Aerospace Medicine and Biology. A continuing bibliography with indexes

    This bibliography lists 244 reports, articles, and other documents introduced into the NASA scientific and technical information system in February 1981. Aerospace medicine and aerobiology topics are included, along with listings for physiological factors, astronaut performance, control theory, artificial intelligence, and cybernetics.

    A classification of techniques for the compensation of time delayed processes. Part 2: Structurally optimised controllers

    Following on from Part 1, Part 2 of the paper considers the use of structurally optimised controllers to compensate time-delayed processes.