97 research outputs found

    Improving the Performance and Energy Efficiency of GPGPU Computing through Adaptive Cache and Memory Management Techniques

    Department of Computer Science and Engineering. As the performance and energy efficiency requirements of GPGPUs have risen, GPGPU memory management techniques have improved to meet them by employing hardware caches and utilizing heterogeneous memory. These techniques can improve GPGPUs by providing lower memory latency and higher memory bandwidth. However, they do not always guarantee improved performance and energy efficiency due to the small cache size and the heterogeneity of the memory nodes. While prior works have proposed various techniques to address this issue, relatively little work has been done to investigate holistic support for memory management techniques. In this dissertation, we analyze performance pathologies and propose various techniques to improve memory management. First, we investigate the effectiveness of advanced cache indexing (ACI) for high-performance and energy-efficient GPGPU computing. Specifically, we discuss the designs of various static and adaptive cache indexing schemes and present their implementation for GPGPUs. We then quantify and analyze the effectiveness of the ACI schemes based on a cycle-accurate GPGPU simulator. Our quantitative evaluation shows that the ACI schemes achieve significant performance and energy-efficiency gains over the baseline conventional indexing scheme. We also analyze the performance sensitivity of ACI to key architectural parameters (i.e., capacity, associativity, and ICN bandwidth) and to the cache indexing latency, and demonstrate that ACI continues to achieve high performance in various settings. Second, we propose IACM, integrated adaptive cache management for high-performance and energy-efficient GPGPU computing. Based on the performance pathology analysis of GPGPUs, we integrate state-of-the-art adaptive cache management techniques (i.e., cache indexing, bypassing, and warp limiting) in a unified architectural framework to eliminate performance pathologies.
    Our quantitative evaluation demonstrates that IACM significantly improves the performance and energy efficiency of various GPGPU workloads over the baseline architecture (i.e., 98.1% and 61.9% on average, respectively) and achieves considerably higher performance than the state-of-the-art technique (i.e., 361.4% at maximum and 7.7% on average). Furthermore, IACM delivers significant performance and energy efficiency gains over the baseline GPGPU architecture even when the baseline is enhanced with advanced architectural technologies (e.g., higher capacity, associativity). Third, we propose bandwidth- and latency-aware page placement (BLPP) for GPGPUs with heterogeneous memory. BLPP analyzes the characteristics of an application and determines the optimal page allocation ratio between the GPU and CPU memory. Based on the optimal page allocation ratio, BLPP dynamically allocates pages across the heterogeneous memory nodes. Our experimental results show that BLPP considerably outperforms the baseline and the state-of-the-art technique (i.e., 13.4% and 16.7%) and performs similarly to the static-best version (i.e., 1.2% difference), which requires extensive offline profiling.

    Inter-Organizational Information Systems Visibility in Buyer-Supplier Relationships: Buyer and Supplier Perspectives

    Many researchers have called for a better understanding of the concept and workings of supply chain (SC) visibility. The facilitating role of inter-organizational information systems (IOIS) in achieving SC visibility has received inadequate research attention. This paper elaborates on the novel concept of IOIS visibility and examines its antecedents and consequences. Further, investigating SC cooperation from the perspectives of both partners is important, especially when channel partners depend on each other and there can be asymmetries in IOIS visibility. This study therefore attempts to accommodate both partners' perspectives on IOIS visibility. The data were collected from 51 matched pairs of intermediate producers of telecommunication equipment components and their immediate suppliers. The results show that IOIS visibility from the supplier's perspective is an important predictor of supply chain performance. In turn, IOIS visibility is significantly influenced by the supply chain partner's internal IS integration and inter-organizational IT infrastructure compatibility. The impact of asymmetries in IOIS visibility on supply chain performance is also investigated.

    DiGeorge syndrome who developed lymphoproliferative mediastinal mass

    DiGeorge syndrome is an immunodeficiency disease associated with abnormal development of the 3rd and 4th pharyngeal pouches. It arises from a hemizygous deletion of chromosome 22q11.2 and presents with a broad spectrum of clinical phenotypes. Conotruncal cardiac anomalies, hypoplastic thymus, and hypocalcemia are the classic triad of DiGeorge syndrome. Because the syndrome is characterized by a hypoplastic or aplastic thymus, the thymic shadow is missing on plain chest x-rays of these patients. Immunodeficient patients are traditionally known to be at an increased risk for malignancy, especially lymphoma. We encountered a 7-year-old patient with DiGeorge syndrome who had a mediastinal mass shadow on her plain chest x-ray. She visited Severance Children's Hospital with recurrent pneumonia, and her repeated chest x-rays showed a mass-like shadow in the anterior mediastinal area. We performed a full evaluation including chest computed tomography, chest ultrasonography, and chest magnetic resonance imaging. To rule out malignancy, video-assisted thoracoscopic surgery was performed. The final diagnosis of the mass, which had been thought to be malignant, was a lymphoproliferative lesion.

    Incidence and Risk Factors Associated with Superior Mesenteric Artery Syndrome following Surgical Correction of Scoliosis

    STUDY DESIGN: Retrospective study. PURPOSE: To more accurately determine the incidence and clarify risk factors. OVERVIEW OF LITERATURE: Superior mesenteric artery syndrome is one of the possible complications following corrective surgery for scoliosis. However, when preliminary symptoms are vague, the diagnosis of superior mesenteric artery syndrome may be easily missed. METHODS: We conducted a retrospective study using clinical data from 118 patients (43 men and 75 women) who underwent corrective surgery for scoliosis between September 2001 and August 2007. The mean patient age was 15.9 years (range 9-24 years). The risk factors under scrutiny were the patient body mass index (BMI), change in Cobb's angle, and trunk length. RESULTS: The incidence of subjects confirmed to have obstruction was 2.5%. However, the rate increased to 7.6% with the inclusion of the 6 subjects who only showed clinical symptoms of obstruction without a confirmatory study. The BMI values for the asymptomatic and symptomatic groups were 18.4±3.4 and 14.6±3, respectively. The changes in Cobb's angle for the asymptomatic and symptomatic groups were 24.8±13.6 degrees and 23.4±9.1 degrees, respectively. The changes in trunk length for the asymptomatic and symptomatic groups were 2.3±2.1 cm and 4.5±4.8 cm, respectively. Differences in the change in Cobb's angle and the change in trunk length between the two groups did not reach statistical significance, although there was a greater increase in trunk length for the symptomatic group than for the asymptomatic group. CONCLUSIONS: Our study shows that the incidence of superior mesenteric artery syndrome may be greater than the previously accepted rate of 4.7%. Therefore, in the face of any early signs or symptoms of superior mesenteric artery syndrome, prompt recognition and treatment are necessary.
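    The two incidence figures in the RESULTS can be checked with a short calculation. This is only a sanity-check sketch: the abstract gives the cohort size (118) and the 6 symptomatic-only subjects, while the count of 3 confirmed cases is an assumption inferred from the reported 2.5%.

```python
# Sanity check of the reported incidence rates.
# ASSUMPTION: 3 confirmed cases, inferred from 2.5% of 118; the abstract
# does not state this count directly.
total_patients = 118      # stated in the abstract
confirmed = 3             # assumed (see note above)
symptomatic_only = 6      # stated in the abstract

confirmed_rate = round(100 * confirmed / total_patients, 1)
combined_rate = round(100 * (confirmed + symptomatic_only) / total_patients, 1)

print(f"confirmed: {confirmed_rate}%  confirmed + symptomatic: {combined_rate}%")
```

    Under that assumption, both rounded rates match the percentages reported in the abstract.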

    Quantifying the performance and energy efficiency of advanced cache indexing for GPGPU computing

    To achieve higher performance and energy efficiency, GPGPU architectures have recently begun to employ hardware caches. Adding caches to GPGPUs, however, does not always guarantee improved performance and energy efficiency due to the thrashing in small caches shared by thousands of threads. While prior work has proposed warp-scheduling and cache-bypassing techniques to address this issue, relatively little work has been done in the context of advanced cache indexing (ACI). To bridge this gap, this work investigates the effectiveness of ACI for high-performance and energy-efficient GPGPU computing. We discuss the design and implementation of static and adaptive cache indexing schemes for GPGPUs. We then quantify the effectiveness of the ACI schemes based on a cycle-accurate GPGPU simulator. Our quantitative evaluation demonstrates that the ACI schemes are effective in that they provide significant performance and energy-efficiency gains over the conventional indexing scheme. Further, we investigate the performance sensitivity of ACI to key architectural parameters (e.g., indexing latency and cache associativity). Our experimental results show that the ACI schemes are promising in that they continue to provide significant performance gains even when additional indexing latency occurs due to the hardware complexity and when the baseline cache is enhanced with high associativity or large capacity. (C) 2016 Elsevier B.V. All rights reserved.
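    To illustrate why the choice of index bits matters, the sketch below contrasts conventional indexing with XOR-based indexing, a commonly described static scheme. The abstract does not name the schemes the authors evaluate, so the XOR scheme, the cache geometry (128-byte lines, 32 sets), and the function names are all assumptions made for illustration.

```python
# Illustrative sketch only; not the authors' implementation.
LINE_BITS = 7   # assumed 128-byte cache lines
SET_BITS = 5    # assumed 32 cache sets
SET_MASK = (1 << SET_BITS) - 1


def conventional_index(addr: int) -> int:
    """Use the address bits directly above the line offset as the set index."""
    return (addr >> LINE_BITS) & SET_MASK


def xor_index(addr: int) -> int:
    """XOR the conventional index bits with the next SET_BITS higher bits."""
    low = (addr >> LINE_BITS) & SET_MASK
    high = (addr >> (LINE_BITS + SET_BITS)) & SET_MASK
    return low ^ high


# A strided access pattern whose stride equals (number of sets x line size):
# every address has identical low index bits.
stride = (1 << SET_BITS) << LINE_BITS
addrs = [i * stride for i in range(32)]

conventional_sets = {conventional_index(a) for a in addrs}
xor_sets = {xor_index(a) for a in addrs}

print(len(conventional_sets), "set(s) used with conventional indexing")
print(len(xor_sets), "set(s) used with XOR indexing")
```

    With this pattern, conventional indexing sends every access to a single set, which is the kind of conflict that causes thrashing, while the XOR variant spreads the same accesses across all sets.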

    BLPP: Improving the Performance of GPGPUs with Heterogeneous Memory through Bandwidth- and Latency-Aware Page Placement

    GPGPUs with heterogeneous memory have surfaced as a promising solution to improve the programmability and flexibility of GPGPU computing. Despite extensive prior work, relatively little has been done to investigate holistic system software support for heterogeneity-aware memory management. To bridge this gap, we propose bandwidth- and latency-aware page placement (BLPP) for GPGPUs with heterogeneous memory. BLPP dynamically places pages across the heterogeneous memory nodes by preserving the optimal allocation ratio computed based on their performance characteristics. Our experimental results show that BLPP considerably outperforms the state-of-the-art technique and performs similarly to the static-best version, which requires extensive offline profiling.
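    The idea of placing pages so that a target allocation ratio is preserved can be sketched as follows. This is not BLPP itself: the abstract does not say how the optimal ratio is computed, so the sketch assumes a ratio simply proportional to each node's bandwidth and ignores latency. The bandwidth figures and the names `target_ratio` and `place_page` are hypothetical.

```python
# Illustrative sketch of ratio-preserving page placement; not the BLPP
# algorithm. ASSUMPTION: target ratio is proportional to bandwidth only.

def target_ratio(bandwidths: dict) -> dict:
    """Return each node's share of pages, proportional to its bandwidth."""
    total = sum(bandwidths.values())
    return {node: bw / total for node, bw in bandwidths.items()}


def place_page(counts: dict, ratio: dict) -> str:
    """Place one page on the node furthest below its target share."""
    total_after = sum(counts.values()) + 1
    node = max(ratio, key=lambda n: ratio[n] * total_after - counts[n])
    counts[node] += 1
    return node


bandwidths = {"gpu": 200.0, "cpu": 50.0}  # hypothetical GB/s values
ratio = target_ratio(bandwidths)          # gpu gets 80% of pages, cpu 20%
counts = {"gpu": 0, "cpu": 0}
for _ in range(100):
    place_page(counts, ratio)

print(counts)
```

    Choosing the node with the largest shortfall at each allocation keeps the running page counts close to the target ratio throughout execution, rather than only at the end.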

    On the Feasibility of Advanced Cache Indexing for High-Performance and Energy-Efficient GPGPU Computing

    To achieve higher performance and energy efficiency, GPGPU architectures have recently begun to employ hardware caches. Adding hardware caches to GPGPUs, however, does not automatically guarantee improved performance and energy efficiency due to the thrashing in small hardware caches shared by thousands of threads. While prior work has proposed warp-scheduling and cache-bypassing techniques to address this issue, relatively little work has been done in the context of advanced cache indexing. To bridge this gap, this work investigates the feasibility of advanced cache indexing for high-performance and energy-efficient GPGPU computing. We first discuss the design and implementation of static and adaptive cache indexing schemes for GPGPUs. We then quantify the effectiveness of the advanced indexing schemes using GPGPU benchmarks. Our quantitative evaluation demonstrates that the advanced cache indexing schemes are promising in that they significantly outperform the conventional cache indexing scheme. In addition, for a subset of cache-sensitive benchmarks, the adaptive indexing scheme substantially outperforms the static indexing scheme by effectively identifying and utilizing high-quality indexing bits based on runtime information. Finally, our evaluation shows that the effectiveness of advanced cache indexing is sensitive to the choice of warp scheduler, motivating further research on coordinated cache indexing and warp scheduling techniques.