4,034 research outputs found

    HeteroCore GPU to exploit TLP-resource diversity

    Run-time Energy Management for Mobiles

    Due to limited energy resources, mobile computing requires an energy-efficient architecture. The dynamic nature of a mobile environment demands an architecture that can adapt to quickly changing conditions. The mobile device has to adapt dynamically to new circumstances in the most suitable manner. The hardware and software architecture should support such adaptability and minimize energy consumption by making resource allocation decisions at run-time. To make these decisions effective, a tradeoff has to be made between computation, communication, and initialization costs (both time and energy). This paper describes our approach to constructing a model that supports taking such decisions.
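    The tradeoff described in this abstract can be illustrated with a small decision function. The following is a minimal sketch, not the paper's actual model: it assumes a simple additive cost that weights time against energy, and all parameter names and values (compute_energy_j, init_time_s, and so on) are hypothetical.

```python
# Minimal sketch of a run-time offloading decision, assuming a simple additive
# cost model with illustrative (hypothetical) parameters; the paper's actual
# model and parameter names are not reproduced here.
from dataclasses import dataclass

@dataclass
class TaskCosts:
    # Local execution on the mobile device
    compute_time_s: float
    compute_energy_j: float
    # Remote execution: one-time setup plus data transfer
    init_time_s: float
    init_energy_j: float
    comm_time_s: float
    comm_energy_j: float
    remote_time_s: float  # time the remote side needs (local energy assumed ~0)

def should_offload(c: TaskCosts, energy_weight: float = 0.5) -> bool:
    """Return True if offloading has the lower weighted time/energy cost."""
    local_cost = (1 - energy_weight) * c.compute_time_s + energy_weight * c.compute_energy_j
    offload_time = c.init_time_s + c.comm_time_s + c.remote_time_s
    offload_energy = c.init_energy_j + c.comm_energy_j
    offload_cost = (1 - energy_weight) * offload_time + energy_weight * offload_energy
    return offload_cost < local_cost

# Example: heavy local computation over a cheap link -> offloading wins.
print(should_offload(TaskCosts(compute_time_s=4.0, compute_energy_j=8.0,
                               init_time_s=0.5, init_energy_j=0.4,
                               comm_time_s=0.8, comm_energy_j=0.6,
                               remote_time_s=0.5)))
```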

    Adaptive Optimizations for Stream-based Workflows

    Doctor of Philosophy

    With the explosion of chip transistor counts, the semiconductor industry has struggled with ways to continue scaling computing performance in line with historical trends. In recent years, the de facto solution to utilize excess transistors has been to increase the size of the on-chip data cache, allowing fast access to an increased portion of main memory. These large caches allowed the continued scaling of single thread performance, which had not yet reached the limit of instruction level parallelism (ILP). As we approach the potential limits of parallelism within a single threaded application, new approaches such as chip multiprocessors (CMP) have become popular for scaling performance utilizing thread level parallelism (TLP).

    This dissertation identifies the operating system as a ubiquitous area where single threaded and multithreaded performance have often been ignored by computer architects. We propose that novel hardware and OS co-design has the potential to significantly improve current chip multiprocessor designs, enabling increased performance and improved power efficiency. We show that the operating system contributes a nontrivial overhead to even the most computationally intensive workloads and that this OS contribution grows to a significant fraction of total instructions when executing several common applications found in the datacenter. We demonstrate that architectural improvements have had little to no effect on the performance of the OS over the last 15 years, leaving ample room for improvement.

    We specifically consider three potential solutions to improve OS execution on modern processors. First, we consider the potential of a separate operating system processor (OSP) operating concurrently with general purpose processors (GPP) in a chip multiprocessor organization, with several specialized structures acting as efficient conduits between these processors. Second, we consider the potential of segregating existing caching structures to decrease cache interference between the OS and application. Third, we propose that there are components within the OS itself that should be refactored to be both multithreaded and cache topology aware, which, in turn, improves the performance and scalability of many-threaded applications.
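    As a loose, user-space illustration of the segregation idea (the dissertation itself proposes hardware/OS co-design, not thread affinity), the sketch below pins application work and OS-style background work to disjoint cores. It assumes Linux (os.sched_setaffinity) with at least two online CPUs; the core numbers and workload functions are hypothetical stand-ins.

```python
# User-space analogue (not the dissertation's hardware proposal): segregate
# "system-service" work from application compute by pinning them to disjoint
# cores, reducing interference between the two. Assumes Linux
# (os.sched_setaffinity) and at least two online CPUs; core numbers are
# illustrative only.
import os
import threading

APP_CORES = {0}      # hypothetical: core reserved for the application
SERVICE_CORES = {1}  # hypothetical: core standing in for an "OS processor"

def pinned(cores, work, *args):
    """Run `work` on the calling thread after restricting it to `cores`."""
    os.sched_setaffinity(0, cores)  # pid 0 == the calling thread on Linux
    return work(*args)

def app_kernel(n):
    # Compute loop standing in for the application workload.
    return sum(i * i for i in range(n))

def service_loop(n):
    # Stand-in for OS-style background work (bookkeeping, timers, ...).
    return sum(os.getpid() % (i + 1) for i in range(n))

app = threading.Thread(target=pinned, args=(APP_CORES, app_kernel, 1_000_000))
svc = threading.Thread(target=pinned, args=(SERVICE_CORES, service_loop, 100_000))
app.start(); svc.start()
app.join(); svc.join()
print("done: application and service work ran on disjoint cores")
```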