6 research outputs found

    Adaptive transaction scheduling for transactional memory systems

    Transactional memory systems are expected to enable parallel programming at lower programming complexity, while delivering improved performance over traditional lock-based systems. Nonetheless, there are certain situations where transactional memory systems could actually perform worse. Transactional memory systems can outperform locks only when the executing workloads contain sufficient parallelism. When the workload lacks inherent parallelism, launching excessive transactions can degrade performance. These situations will actually become dominant in future workloads when large-scale transactions are frequently executed. In this thesis, we propose a new paradigm called adaptive transaction scheduling to address this issue. Based on the parallelism feedback from applications, our adaptive transaction scheduler dynamically dispatches and controls the number of concurrently executing transactions. In our case study, we show that our low-cost mechanism not only guarantees that hardware transactional memory systems perform no worse than a single global lock, but also significantly improves performance for both hardware and software transactional memory systems.
    M.S. Committee Chair: Lee, Hsien-Hsin; Committee Member: Blough, Douglas; Committee Member: Yalamanchili, Sudhaka
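The feedback loop described in this abstract can be illustrated with a minimal sketch. It is not the thesis's mechanism: the class name `AdaptiveScheduler`, the use of an abort ratio as the parallelism feedback, and the thresholds `low` and `high` are all assumptions made for illustration. The sketch only shows the general idea of shrinking the number of concurrently dispatched transactions when feedback suggests little parallelism and growing it again when contention subsides.

```python
class AdaptiveScheduler:
    """Toy feedback controller for the number of concurrent transactions.

    Hypothetical illustration only: the abort-ratio metric and the
    thresholds are assumptions, not values taken from the thesis.
    """

    def __init__(self, max_concurrency, low=0.2, high=0.5):
        self.max_concurrency = max_concurrency
        self.limit = max_concurrency  # start optimistic: allow full parallelism
        self.low = low                # below this abort ratio, allow one more transaction
        self.high = high              # above this abort ratio, halve the limit
        self.commits = 0
        self.aborts = 0

    def record(self, committed):
        """Record the outcome of one finished transaction attempt."""
        if committed:
            self.commits += 1
        else:
            self.aborts += 1

    def adjust(self):
        """Update the concurrency limit from the outcomes seen since the last call."""
        total = self.commits + self.aborts
        if total == 0:
            return self.limit  # no feedback yet, keep the current limit
        abort_ratio = self.aborts / total
        if abort_ratio > self.high:
            # Heavy contention: back off, but never below one transaction.
            self.limit = max(1, self.limit // 2)
        elif abort_ratio < self.low:
            # Little contention: cautiously admit one more transaction.
            self.limit = min(self.max_concurrency, self.limit + 1)
        self.commits = 0
        self.aborts = 0
        return self.limit
```

With a limit of 1 the sketch degenerates to fully serialized execution, which is the behaviour one would expect from a single global lock; how the actual scheduler achieves its stated guarantee is not specified in the abstract.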

    DLL-Conscious Instruction Fetch Optimization for SMT Processors

    Simultaneous multithreading (SMT) processors can issue multiple instructions from distinct processes or threads in the same cycle. This technique effectively increases the overall throughput by keeping the pipeline resources more occupied, at the potential expense of reducing single-thread performance due to resource sharing. In the software domain, an increasing number of Dynamically Linked Libraries (DLLs) are used by applications and operating systems, providing better flexibility and modularity, and enabling code sharing. It is observed that a significant amount of execution time in software today is spent executing standard DLL instructions that are shared among multiple threads or processes. However, for an SMT processor with a virtually indexed cache implementation, existing instruction fetching mechanisms can induce unnecessary false cache misses caused by the DLL-based instructions, which were intended to be shared. This problem is more conspicuous when multiple independent threads are executing concurrently in an SMT processor. This work investigates an often-neglected form of contention between running threads in the I-TLB and I-cache caused by DLLs. To address these shortcomings, we propose a system-level technique involving a lightweight modification in the microarchitecture and the OS. By exploiting the nature of the DLLs in our new architecture, we are able to reinstate physical sharing of the DLLs in an SMT machine. Using Microsoft Windows-based applications, our simulation results show that the optimized instruction fetching mechanism can reduce the number of DLL misses by up to 5.5 times and improve the instruction cache hit rates by up to 62%, resulting in up to 30% DLL IPC improvements and up to 15% overall IPC improvements.
    M.S. Committee Chair: Hsien-Hsin S. Lee; Committee Member: Sudhakar Yalamanchili; Committee Member: Sung Kyu Li

    Noise-direct: a technique for power supply noise aware floorplanning using microarchitecture profiling

    This paper proposes Noise-Direct, a design methodology for power integrity aware floorplanning, using microarchitectural feedback to guide module placement. Stringent power constraints have led microprocessor designers to incorporate aggressive power saving techniques, such as clock-gating, that place a significant burden on the power delivery network. While the application of extensive clock-gating can effectively reduce power consumption, it can also induce large inductive noise (di/dt), resulting in signal integrity and reliability issues. To combat these problems, processors are usually designed for the worst-case current consumption scenario using adequate supply voltage and decoupling capacitances. To tackle high-frequency inductive noise and potential IR drops, we propose a novel design methodology that integrates microarchitectural profiling feedback into the floorplanning process. We present two microarchitectural metrics to quantify the noise susceptibility of a module: self weighting and correlation weighting. By using these metrics in a force-directed floorplanning algorithm to assign power pin affinity to modules, we can quickly converge to a design for average-case current consumption. By designing for the average-case and employing dynamic di/dt control for the worst-case, we can ensure that a chip is noise-tolerant without exceeding decap budget constraints. Our observations showed that certain functional modules in a processor exhibit consistent and highly correlated switching activity that can be used to guide module placement distance from power pins. The experimental results demonstrate that the force-directed floorplanning technique can effectively suppress supply noise experienced by modules, reduce the total number of supply-noise margin violations, and achieve a floorplan with considerably lower IR drop, as compared to a wire-length driven floorplan.

    A Floorplan-Aware Dynamic Inductive Noise Controller for Reliable 2D and 3D Microprocessors

    Power delivery is a growing reliability concern in microprocessors as the industry moves toward feature-rich, power-hungrier designs. To battle the ever-aggravating power consumption, modern microprocessor designers and researchers propose and apply aggressive power-saving techniques in the form of clock-gating and/or power-gating in order to operate the processor within a given power envelope. However, these techniques often lead to high-frequency current variations, which can stress the power delivery system and jeopardize reliability due to inductive noise (L di/dt) in the power supply network. In addition, with the advent of 3D stacked IC technology that facilitates the design of processors with much higher module density, the design of a low impedance power-delivery network can be a daunting challenge. To counteract these issues, modern microprocessors are designed to operate under the worst-case current assumption by deploying adequate decoupling capacitance. With the lowering of supply voltages and increased leakage power and current consumption, designing a processor for the worst case is becoming less appealing. In this paper, we propose a new dynamic inductive-noise controlling mechanism at the microarchitectural level that will limit the on-die current demand within predefined bounds, regardless of the native power and current characteristics of running applications. By dynamically monitoring the access patterns of microarchitectural modules, our mechanism can effectively limit simultaneous switching activity of close-by modules, thereby leveling voltage ringing at local power-pins. Compared to prior art, our di/dt controller is the first that takes the processor's floorplan as well as its power-pin distribution into account to provide a finer-grained control with minimal performance degradation. Based on the evaluation results using 2D and 3D floorplans, we show that our techniques can significantly reduce inductive noise induced by current demand variation and reduce the average current variability by up to 7 times with an average performance overhead of 4.0% (2D floorplan) and 3.8% (3D floorplan).
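The idea of limiting simultaneous switching of nearby modules can be sketched in a few lines. This is a rough illustration under stated assumptions, not the paper's controller: the function name `throttle`, the grouping of modules into per-power-pin regions (`region_of`), the per-module current estimates (`current_of`), and the bound `max_delta` are all hypothetical. The sketch grants activation requests for one cycle only while each region's current rise over the previous cycle stays within the bound; it does not handle sudden current drops, which also contribute to di/dt.

```python
def throttle(requests, region_of, current_of, prev_current, max_delta):
    """Grant module activations for one cycle under a per-region current-rise bound.

    All parameters are hypothetical illustration inputs:
      requests      -- module names requesting activation, in priority order
      region_of     -- module name -> power-pin region (assumed floorplan grouping)
      current_of    -- module name -> estimated current drawn when active
      prev_current  -- region -> current drawn in the previous cycle
      max_delta     -- largest allowed cycle-to-cycle current rise per region
    Returns the granted modules and the resulting per-region current.
    """
    granted = []
    region_current = {}
    for module in requests:
        region = region_of[module]
        new_total = region_current.get(region, 0) + current_of[module]
        # Deny the request if this region's current would rise too sharply.
        if new_total - prev_current.get(region, 0) <= max_delta:
            region_current[region] = new_total
            granted.append(module)
    return granted, region_current


# Example: three modules share region "A"; one sits alone in region "B".
granted, currents = throttle(
    requests=["alu0", "alu1", "fpu", "lsq"],
    region_of={"alu0": "A", "alu1": "A", "fpu": "A", "lsq": "B"},
    current_of={"alu0": 2, "alu1": 2, "fpu": 3, "lsq": 1},
    prev_current={"A": 1, "B": 0},
    max_delta=3,
)
```

In this example the `fpu` request is denied because granting it would raise region A's current by more than `max_delta` in a single cycle, while the request in region B is unaffected. A denied module would presumably stall and retry in a later cycle, which is where a performance overhead of the kind reported in the abstract would come from.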