
80 research outputs found

Managing SMT Resource Usage through Speculative Instruction Window Weighting

Author: Seznec André
Vandierendonck Hans
Publication venue: HAL CCSD
Publication date: 01/01/2009

Simultaneous multithreading (SMT) processors dynamically share processor resources between multiple threads. In general, shared SMT resources may be managed explicitly, e.g. by dynamically setting queue occupation bounds for each thread as in the DCRA and Hill-Climbing policies. Alternatively, resources may be managed implicitly, i.e. resource usage is controlled by placing the desired instruction mix in the resources. In this case, the main resource management tool is the instruction fetch policy, which must predict the behavior of each thread (branch mispredictions, long-latency loads, etc.) as it fetches instructions. In this paper, we present the use of Speculative Instruction Window Weighting (SIWW) to bridge the gap between implicit and explicit SMT fetch policies. SIWW estimates for each thread the amount of outstanding work in the processor pipeline. Fetch proceeds for the thread with the least amount of work left. SIWW policies are implicit in that fetch proceeds for the thread with the least amount of work left. They are also explicit in that a maximum resource allocation can be set. SIWW can use and combine virtually any of the indicators that were previously proposed for guiding the instruction fetch policy (number of in-flight instructions, number of low-confidence branches, number of predicted cache misses, etc.). Therefore, SIWW is an approach to designing SMT fetch policies, rather than a particular fetch policy. Targeting fairness and targeting throughput are often contradictory goals, and an SMT scheduling policy often optimizes only one performance metric at the expense of the other. Our simulations show that the SIWW fetch policy can achieve state-of-the-art throughput, state-of-the-art fairness and state-of-the-art harmonic performance mean at the same time.
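The selection rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weight values, the `ThreadState` class, and the `max_weight` cap are hypothetical names and numbers chosen for illustration. Each thread's outstanding work is estimated as a weighted sum of in-flight indicators, fetch goes to the thread with the smallest estimate (the implicit part), and a thread at or above the cap is not fetched at all (the explicit part).

```python
from dataclasses import dataclass, field

# Hypothetical weights per indicator; illustrative values, not from the paper.
WEIGHTS = {"instruction": 1, "low_conf_branch": 8, "predicted_miss": 16}


@dataclass
class ThreadState:
    tid: int
    # Count of in-flight items per indicator (assumed bookkeeping).
    in_flight: dict = field(default_factory=lambda: {k: 0 for k in WEIGHTS})

    def weight(self) -> int:
        """Weighted estimate of the work this thread still has in the pipeline."""
        return sum(WEIGHTS[k] * n for k, n in self.in_flight.items())


def select_fetch_thread(threads, max_weight=None):
    """Return the thread to fetch from, or None if every thread is gated.

    Implicit part: pick the thread with the least estimated outstanding work.
    Explicit part: a thread whose estimate has reached max_weight is skipped.
    """
    eligible = [t for t in threads
                if max_weight is None or t.weight() < max_weight]
    if not eligible:
        return None
    # Ties are broken by thread id to keep the choice deterministic.
    return min(eligible, key=lambda t: (t.weight(), t.tid))


threads = [
    ThreadState(0, {"instruction": 10, "low_conf_branch": 1, "predicted_miss": 0}),
    ThreadState(1, {"instruction": 12, "low_conf_branch": 0, "predicted_miss": 0}),
    ThreadState(2, {"instruction": 4, "low_conf_branch": 0, "predicted_miss": 1}),
]
print(select_fetch_thread(threads).tid)  # thread 1 has the smallest weighted estimate
```

Thread 2 has the fewest in-flight instructions, yet it is not selected, because its predicted cache miss carries a large weight in this sketch. That is the sense in which several indicators are combined into one estimate.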

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Ghent University Academic Bibliography

HAL-Rennes 1

Hyperheuristics for explicit resource partitioning in simultaneous multithreaded processors

Author: Güney İsa A.
Küçük Gürhan
Poyraz Kemal
Özcan Ender
Publication venue: The Scientific and Technological Research Council of Turkey
Publication date: 28/03/2020

Repository@Nottingham

Load sharing for optimistic parallel simulations on multicore machines

Author: Pellegrini Alessandro
Quaglia Francesco
Vitali Roberto
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2012

Parallel Discrete Event Simulation (PDES) is based on partitioning the simulation model into distinct Logical Processes (LPs), each modeling a portion of the entire system, which are allowed to execute simulation events concurrently. This makes it possible to exploit parallel computing architectures to speed up model execution and to make very large models tractable. In this article we address the optimistic approach to PDES, where LPs are allowed to process their events concurrently in a speculative fashion, and rollback/recovery techniques are used to guarantee state consistency in case of causality violations along the speculative execution path. In particular, we present an innovative load sharing approach aimed at optimizing resource usage for fruitful simulation work when running an optimistic PDES environment on top of multi-processor/multi-core machines. Beyond providing the load sharing model, we also define a load-sharing-oriented architectural scheme, based on a symmetric multi-threaded organization of the simulation platform. Finally, we present a real implementation of the load sharing architecture within the open source ROme OpTimistic Simulator (ROOT-Sim) package. Experimental data assessing both the viability and the effectiveness of our proposal are presented as well.
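The speculative processing and rollback mentioned in this abstract can be sketched for a single LP. This is a generic, simplified illustration of optimistic event processing, not ROOT-Sim code and not the load sharing scheme itself: the `LogicalProcess` class and its toy event handler are hypothetical, and anti-messages and inter-LP communication are left out. The LP saves its state before each event; when an event arrives with a timestamp earlier than the local virtual time (a causality violation), it restores the saved state, processes the late event, and then re-executes the undone events.

```python
class LogicalProcess:
    """Toy LP that checkpoints its state before every event (assumed design)."""

    def __init__(self):
        self.state = 0
        self.lvt = 0.0        # local virtual time
        self.processed = []   # (timestamp, payload, state before the event)

    def _execute(self, ts, payload):
        self.processed.append((ts, payload, self.state))
        # Order-sensitive toy handler, so a wrong event order is visible.
        self.state = self.state * 2 + payload
        self.lvt = ts

    def _rollback(self, ts):
        """Undo every event with a timestamp later than ts; return them in order."""
        undone = []
        while self.processed and self.processed[-1][0] > ts:
            e_ts, e_payload, state_before = self.processed.pop()
            self.state = state_before
            undone.append((e_ts, e_payload))
        self.lvt = self.processed[-1][0] if self.processed else 0.0
        undone.reverse()
        return undone

    def process(self, ts, payload):
        """Process one event; return how many events had to be rolled back."""
        undone = self._rollback(ts) if ts < self.lvt else []
        self._execute(ts, payload)
        for e_ts, e_payload in undone:   # re-execute in timestamp order
            self._execute(e_ts, e_payload)
        return len(undone)


lp = LogicalProcess()
lp.process(1.0, 1)
lp.process(3.0, 1)            # speculative: an earlier event may still arrive
print(lp.process(2.0, 0))     # straggler at t=2.0 forces one rollback
print(lp.state)               # same state as processing t=1, 2, 3 in order
```

Rolled-back events are wasted work, which is why the abstract speaks of optimizing resource usage for fruitful simulation work.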

ART

Archivio della ricerca- Università di Roma La Sapienza

Improving the Energy Efficiency of Microprocessor Cores Through Accurate Resource Utilisation Prediction

Author: Court Craig A.
Publication venue: Computing, Imperial College London
Publication date: 01/07/2012

CMOS technology scaling improves the speed and functionality of microprocessors by reducing the size of transistors. However, static power dissipation also increases as a result of scaling and has been identified as a limiting factor in technology scaling. As current technology approaches that limit, techniques are required both at the technology level and in the architecture design to reduce sub-threshold leakage, which accounts for the majority of static power dissipation. This thesis presents an approach to predict the idle periods of execution units at runtime and power-gate them during these periods to eliminate their static power leakage. We exploit similar execution characteristics across loop iterations to build a prediction of the units required to execute an entire loop from the units used over the first few iterations. The utilisation of each execution unit is monitored for each iteration, and thresholds are used to determine which units should be power-gated for the remainder of the loop. Three techniques are presented: Loop-Directed Mothballing (LDM), Extended Loop-Directed Mothballing (ELDM) and schedule balancing. LDM power-gates execution units only during innermost loops, which are simple to detect at runtime. ELDM extends this method to all loops using loop entry and exit information gathered offline. The balancing scheduler is developed to balance the types of instruction issued each cycle, to encourage reuse of execution units and make unnecessary units easier to detect. Extensive simulation using traces of 16 benchmarks from the SPEC CPU2006 suite demonstrates that LDM reduces the energy-delay product of our simulated superscalar processor by 10.3%. For traces with a low proportion of executed instructions inside innermost loops, ELDM improves the energy-delay product by up to 13% by allowing the technique to be applied to other loops in the trace. Employing schedule balancing with ELDM achieves similar savings, and simplifies the hardware required to make predictions.
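The threshold-based prediction step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's hardware mechanism: the function name `select_units_to_gate`, the unit names, the utilisation figures, and the 5% threshold are all hypothetical. Utilisation is sampled for each execution unit over the first few loop iterations, and units whose mean utilisation falls below the threshold are marked to be power-gated for the remainder of the loop.

```python
def select_units_to_gate(samples, threshold=0.05):
    """Return the units predicted to be idle for the rest of the loop.

    samples: one dict per monitored iteration, mapping unit name to the
             fraction of cycles the unit was busy in that iteration.
    threshold: hypothetical cut-off; a unit whose mean utilisation is below
               it is selected for power gating.
    """
    if not samples:
        return []  # nothing observed yet, so gate nothing
    units = samples[0].keys()
    mean = {u: sum(s[u] for s in samples) / len(samples) for u in units}
    return sorted(u for u, m in mean.items() if m < threshold)


# Utilisation observed over the first three iterations (made-up numbers).
warmup = [
    {"alu0": 0.90, "alu1": 0.40, "mul": 0.00, "fpu": 0.00},
    {"alu0": 0.85, "alu1": 0.35, "mul": 0.02, "fpu": 0.00},
    {"alu0": 0.95, "alu1": 0.45, "mul": 0.00, "fpu": 0.00},
]
print(select_units_to_gate(warmup))  # ['fpu', 'mul']
```

A higher threshold gates more units and saves more leakage, at the risk of stalling when a gated unit is needed after all. This sketch does not model that trade-off.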

Spiral - Imperial College Digital Repository

Fundamental Approaches to Software Engineering

Author
Publication venue: Springer Science and Business Media LLC
Publication date: 13/04/2022

This open access book constitutes the proceedings of the 25th International Conference on Fundamental Approaches to Software Engineering, FASE 2022, which was held during April 4-5, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 17 regular papers presented in this volume were carefully reviewed and selected from 64 submissions. The proceedings also contain 3 contributions from the Test-Comp Competition. The papers deal with the foundations on which software engineering is built, including topics such as software engineering as an engineering discipline, requirements engineering, software architectures, software quality, model-driven development, software processes, software evolution, AI-based software engineering, and the specification, design, and implementation of particular classes of systems, such as (self-)adaptive, collaborative, AI, embedded, distributed, mobile, pervasive, cyber-physical, or service-oriented applications.

Directory of Open Access Books (DOAB)

Fundamental Approaches to Software Engineering

Author
Publication venue: Springer Science and Business Media LLC
Publication date

OAPEN Library

Affordable techniques for dependable microprocessor design

Author: Kim Seongwoo
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2001

As high computing power is available at an affordable cost, we rely on microprocessor-based systems for a much greater variety of applications. This dependence indicates that a processor failure could have more diverse impacts on our daily lives. Therefore, dependability is becoming an increasingly important quality measure of microprocessors. Temporary hardware malfunctions caused by unstable environmental conditions can lead the processor to an incorrect state. This is referred to as a transient error or soft error. Studies have shown that soft errors are the major source of system failures. This dissertation characterizes the soft error behavior of microprocessors and presents new microarchitectural approaches that can realize high dependability with low overhead. Our fault injection studies using RISC processors have demonstrated that different functional blocks of the processor have distinct susceptibilities to soft errors. The error susceptibility information must be reflected in devising fault tolerance schemes for cost-sensitive applications. Considering the common use of on-chip caches in modern processors, we investigated area-efficient protection schemes for memory arrays. The idea of caching redundant information was exploited to optimize resource utilization for increased dependability. We also developed a mechanism to verify the integrity of data transfer from lower level memories to the primary caches. The results of this study show that by exploiting bus idle cycles and the information redundancy, an almost complete check for the initial memory data transfer is possible without incurring a performance penalty. For protecting the processor's control logic, which usually remains unprotected, we propose a low-cost reliability enhancement strategy. We classified control logic signals into static and dynamic control depending on their changeability, and applied various techniques including commit-time checking, signature caching, component-level duplication, and control flow monitoring. Our schemes can achieve more than 99% coverage with a very small hardware addition. Finally, a virtual duplex architecture for superscalar processors is presented. In this system-level approach, the processor pipeline is backed up by a partially replicated pipeline. The replication-based checker minimizes the design and verification overheads. For a large-scale superscalar processor, the proposed architecture can bring a 61.4% reduction in die area while sustaining the maximum performance.

Digital Repository @ Iowa State University (ISU)

CiteSeerX

A comparison of processing techniques for producing prototype injection moulding inserts.

Author: Ewart Paul
Publication venue: University of Auckland
Publication date: 20/05/2019

This project investigates processing techniques for producing low-cost moulding inserts used in the particulate injection moulding (PIM) process. Prototype moulds were made by additive and subtractive processes as well as by a combination of the two. The general motivation for this was to reduce the entry cost for users considering PIM. PIM cavity inserts were first made by conventional machining from a polymer block using the pocket NC desktop mill. PIM cavity inserts were also made by fused filament deposition modelling using the Tiertime UP plus 3D printer. The injection moulding trials resulted in surface finish and part removal defects. The feedstock was a titanium metal blend, which is brittle in comparison to commodity polymers. This, in combination with the mesoscale features, small cross-sections and complex geometries, was considered the main problem. For both processing methods, fixes were identified and made to test the theory. These consisted of a blended approach that combined the additive and subtractive processes. The parts produced by the three processing methods are investigated and their respective merits and issues are discussed.

Wintec Research Archive

Reducing risk in pre-production investigations through undergraduate engineering projects.

Author: Ewart Paul
Publication venue: University of Auckland
Publication date: 20/05/2019

This poster is the culmination of final year Bachelor of Engineering Technology (B.Eng.Tech) student projects in 2017 and 2018. The B.Eng.Tech is a level seven qualification that aligns with the Sydney Accord for a three-year engineering degree and hence is internationally benchmarked. The enabling mechanism of these projects is the industry connectivity that creates real-world projects and highlights the benefits of investigating processes at the technologist level. The methodologies we use are basic and transparent, with enough depth of technical knowledge to ensure the industry partners gain from the collaboration process. The process we use minimizes the disconnect between the student and the industry supervisor while maintaining the academic freedom of the student and the commercial sensitivities of the supervisor. The general motivation for this approach is to reduce the entry cost for industry so that new technologies can be considered, thereby reducing risk to core business and shareholder profits. The poster presents several images and interpretive dialogue to explain the positive and negative aspects of the student process.

Wintec Research Archive