136 research outputs found

    Get Out of the Valley: Power-Efficient Address Mapping for GPUs

    GPU memory systems adopt a multi-dimensional hardware structure to provide the bandwidth necessary to support hundreds to thousands of concurrent threads. On the software side, GPU-compute workloads also use multi-dimensional structures to organize the threads. We observe that these structures can combine unfavorably and create significant resource imbalance in the memory subsystem, causing low performance and poor power-efficiency. The key issue is that which memory address bits exhibit high variability is highly application-dependent. To solve this problem, we first provide an entropy analysis approach tailored for the highly concurrent memory request behavior in GPU-compute workloads. Our window-based entropy metric captures the information content of each address bit of the memory requests that are likely to co-exist in the memory system at runtime. Using this metric, we find that GPU-compute workloads exhibit entropy valleys distributed throughout the lower-order address bits. This indicates that efficient GPU address mapping schemes need to harvest entropy from broad address-bit ranges and concentrate the entropy into the bits used for channel and bank selection in the memory subsystem. This insight leads us to propose the Page Address Entropy (PAE) mapping scheme, which concentrates the entropy of the row, channel and bank bits of the input address into the bank and channel bits of the output address. PAE maps straightforwardly to hardware and can be implemented with a tree of XOR gates. PAE improves performance by 1.31× and power-efficiency by 1.25× compared to state-of-the-art permutation-based address mapping.
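
    The two ideas in this abstract, per-bit entropy measured over windows of requests and an XOR tree that folds several input bits into one output bit, can be illustrated with a rough Python sketch. This is an assumption-laden simplification, not the paper's metric or the PAE bit selection: the function names, the fixed non-overlapping windows, and the example bit positions (6 and 12) are all hypothetical.

```python
import math
from typing import Sequence


def bit_entropy(addresses: Sequence[int], bit: int) -> float:
    """Shannon entropy (in bits) of one address bit over a group of requests."""
    if not addresses:
        return 0.0
    ones = sum((a >> bit) & 1 for a in addresses)
    p = ones / len(addresses)
    if p == 0.0 or p == 1.0:
        return 0.0  # the bit never changes, so it carries no information
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def windowed_bit_entropy(addresses: Sequence[int], bit: int, window: int) -> float:
    """Average per-window entropy of one bit (assumed: non-overlapping windows)."""
    windows = [addresses[i:i + window] for i in range(0, len(addresses), window)]
    if not windows:
        return 0.0
    return sum(bit_entropy(w, bit) for w in windows) / len(windows)


def xor_fold(address: int, source_bits: Sequence[int]) -> int:
    """XOR the selected input bits into one output bit (a software XOR tree)."""
    out = 0
    for b in source_bits:
        out ^= (address >> b) & 1
    return out


# Hypothetical access pattern: eight requests with a 4 KiB stride.
addresses = [i * 4096 for i in range(8)]
low = windowed_bit_entropy(addresses, bit=6, window=4)    # bit 6 never toggles
high = windowed_bit_entropy(addresses, bit=12, window=4)  # bit 12 alternates
# Folding bit 12 into bit 6 yields a selector bit that does vary.
selectors = [xor_fold(a, [6, 12]) for a in addresses]
```

    In this toy pattern a selector taken from bit 6 alone would send every request to the same place, while the XOR of bits 6 and 12 alternates between the two values.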

    FHPM: Fine-grained Huge Page Management For Virtualization

    As more data-intensive tasks with large footprints are deployed in virtual machines (VMs), huge pages are widely used to reduce the growing address translation overhead. However, once the huge page mapping is established, all the base page regions in the huge page share a single extended page table (EPT) entry, so the hypervisor loses awareness of accesses to individual base page regions. None of the state-of-the-art solutions can obtain access information at base page granularity for huge pages. We observe that this can lead to incorrect decisions by the hypervisor, such as incorrect data placement in a tiered memory system and unshared base page regions when sharing pages. This paper proposes FHPM, a fine-grained huge page management scheme for virtualization that requires no hardware or guest OS modification. FHPM can identify access information at base page granularity, and dynamically promote and demote pages. A key insight of FHPM is to redirect the EPT huge page directory entries (PDEs) to new companion pages so that the MMU can track access information within huge pages. FHPM can then promote and demote pages according to the current hot page pressure to balance address translation overhead and memory usage. At the same time, FHPM proposes a VM-friendly page splitting and collapsing mechanism to avoid extra VM-exits. In combination, FHPM minimizes the monitoring and management overhead and ensures that the hypervisor sees fine-grained VM memory accesses and can make the proper decision. We apply FHPM to improve tiered memory management (FHPM-TMM) and to promote page sharing (FHPM-Share). FHPM-TMM achieves a performance improvement of up to 33% and 61% over pure huge page and pure base page management, respectively. FHPM-Share can save 41% more memory than Ingens, a state-of-the-art page sharing solution, with comparable performance.

    Optimizing plant density and nitrogen application to manipulate tiller growth and increase grain yield and nitrogen-use efficiency in winter wheat

    The growth of wheat tillers and plant nitrogen-use efficiency (NUE) gradually deteriorate in response to high plant density and over-application of N. Therefore, a 2-year field study was conducted with three plant densities (75 × 10⁴ plants ha⁻¹, D1; 300 × 10⁴ plants ha⁻¹, D2; 525 × 10⁴ plants ha⁻¹, D3) and three N application rates (120 kg N ha⁻¹, N1; 240 kg N ha⁻¹, N2; 360 kg N ha⁻¹, N3) to determine how to optimize plant density and N application to regulate tiller growth and to assess the contribution of such measures to enhancing grain yield (GY) and NUE. The results indicated that an increase in plant density significantly increased the number of superior tillers and the number of spikes per m² (SN), resulting in a higher GY and higher partial factor productivity of applied N (PFPN). However, there was no significant difference in GY and PFPN between plant densities D2 and D3. Increasing the N application rate significantly increased the vascular bundle number (NVB) and area (AVB); however, excess N application (N3) did not significantly improve these parameters. N application significantly increased GY, whereas PFPN decreased significantly as the N application rate increased. The two-year results suggested that increasing the plant density (from 75 × 10⁴ plants ha⁻¹ to 336 × 10⁴ plants ha⁻¹) in conjunction with the application of 290 kg N ha⁻¹ will maximize GY and also increase PFPN (39.7 kg kg⁻¹), compared with the application of 360 kg N ha⁻¹. Therefore, an appropriate combination of increased planting density with reduced N application could regulate tiller number and favor the superior tiller group, producing wheat populations with enhanced yield and NUE.

    Vapour-phase-transport rearrangement technique for the synthesis of new zeolites

    M.S., M.M., R.E.M., J.Č. and M.O. acknowledge OP VVV “Excellent Research Teams” project No. CZ.02.1.01/0.0/0.0/15_003/0000417 – CUCAM. M.S. and M.O. thank the Primus Research Program of the Charles University (project number PRIMUS/17/SCI/22 “Soluble zeolites”). R.E.M. also thanks the ERC (Advanced Grant 787073 “ADOR”). A.M. acknowledges The Centre for High-resolution Electron Microscopy (CħEM), supported by SPST of ShanghaiTech University under contract No. EM02161943, and the National Natural Science Foundation of China, through projects NSFC-21850410448 and NSFC-21835002. Z.L. acknowledges the support from the National Key Research and Development Program of China (2016YFA0300102) and the National Natural Science Foundation of China (11675179, 11434009). J.Č. acknowledges the support of the Czech Science Foundation to the project EXPRO (19-27551X).

    Owing to the significant difference in the numbers of simulated and experimentally feasible zeolite structures, several alternative strategies have been developed for zeolite synthesis. Despite their rationality and originality, most of these techniques are based on trial and error, which makes it difficult to predict the structure of new materials. The Assembly-Disassembly-Organization-Reassembly (ADOR) method, which overcomes this limitation, has been successfully applied only to a limited number of structures with relatively stable crystalline layers (UTL, UOV, *CTH). Here, we report a straightforward vapour-phase-transport strategy for the transformation of IWW zeolite, which has low-density silica layers connected by labile Ge-rich units, into a material with a new topology. In situ XRD and XANES studies on the mechanism of IWW rearrangement reveal an unusual structural distortion-reconstruction of the framework throughout the process. Our findings therefore provide a step forward towards engineering nanoporous materials and increasing the number of zeolites available for future applications.

    Publisher PDF. Peer reviewed.