45 research outputs found

    FHPM: Fine-grained Huge Page Management For Virtualization

    Full text link
    As more data-intensive tasks with large footprints are deployed in virtual machines (VMs), huge pages are widely used to eliminate the increasing address translation overhead. However, once a huge page mapping is established, all the base page regions in the huge page share a single extended page table (EPT) entry, so the hypervisor loses awareness of accesses to individual base page regions. None of the state-of-the-art solutions can obtain access information at base page granularity for huge pages. We observe that this can lead to incorrect decisions by the hypervisor, such as incorrect data placement in a tiered memory system and unshared base page regions when sharing pages. This paper proposes FHPM, a fine-grained huge page management system for virtualization that requires no hardware or guest OS modification. FHPM can identify access information at base page granularity and dynamically promote and demote pages. A key insight of FHPM is to redirect the EPT huge page directory entries (PDEs) to new companion pages so that the MMU can track access information within huge pages. FHPM then promotes and demotes pages according to the current hot page pressure to balance address translation overhead and memory usage. At the same time, FHPM proposes a VM-friendly page splitting and collapsing mechanism to avoid extra VM-exits. In combination, FHPM minimizes monitoring and management overhead and ensures that the hypervisor obtains fine-grained VM memory access information to make proper decisions. We apply FHPM to improve tiered memory management (FHPM-TMM) and to promote page sharing (FHPM-Share). FHPM-TMM achieves a performance improvement of up to 33% and 61% over pure huge page and pure base page management, respectively. FHPM-Share saves 41% more memory than Ingens, a state-of-the-art page sharing solution, with comparable performance.
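    The promotion/demotion decision described above can be sketched as follows. This is a minimal illustrative model, not FHPM's actual implementation: the function names, the 60%/10% thresholds, and the representation of companion-page accessed bits are all assumptions made for illustration.

```python
# Hypothetical sketch of FHPM's idea: once a huge page's directory entry is
# redirected to a "companion" table of base-page entries, the MMU sets an
# accessed bit per base-page region, and the hypervisor can act on the
# observed hot fraction. Thresholds here are illustrative, not FHPM's values.

BASE_PAGES_PER_HUGE = 512  # one 2 MiB huge page = 512 x 4 KiB base pages

def hot_fraction(companion_accessed_bits):
    """Fraction of base-page regions whose accessed bit was set."""
    return sum(companion_accessed_bits) / BASE_PAGES_PER_HUGE

def decide(companion_accessed_bits, promote_thresh=0.6, demote_thresh=0.1):
    """Keep a huge mapping only when most base-page regions are hot."""
    f = hot_fraction(companion_accessed_bits)
    if f >= promote_thresh:
        return "promote"    # collapse back to a single huge-page mapping
    if f <= demote_thresh:
        return "demote"     # split; cold regions can be reclaimed or shared
    return "keep-split"     # keep monitoring at base-page granularity
```

    In a real system the thresholds would presumably adapt to the current hot page pressure, as the abstract describes.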

    A neural network model for cache and memory prediction of neural networks

    No full text
    Neural networks have been widely applied to various research and production fields. However, most recent research focuses on the establishment and selection of a specific neural network model. Less attention is paid to their system overhead despite their massive computing and storage resource demands. This research focuses on a relatively new direction that models the system-level memory and cache demand of neural networks. We utilize a neural network to learn and predict the hit ratio curve and memory footprint of neural networks, with their hyper-parameters as input. The prediction result is used to drive cache partitioning and memory partitioning to optimize co-execution of multiple neural networks. To demonstrate the effectiveness of our approach, we model four common networks: the BP neural network, convolutional neural network, recurrent neural network, and autoencoder. We investigate the influence of each model's hyper-parameters on its last-level cache and memory demand. We resort to the BP algorithm as the learning tool to predict the last-level cache hit ratio curve and memory usage. Our experimental results show that cache and memory allocation schemes guided by our predictions optimize for a wide range of performance targets.
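    For context on what is being predicted: a hit ratio curve maps cache size to hit ratio. The sketch below computes one exactly from a reference trace using LRU stack distances (Mattson's stack algorithm); the paper's contribution is to *predict* such curves with a neural network from hyper-parameters instead, avoiding this per-trace measurement cost. The code is a generic illustration, not the paper's method.

```python
# Compute an LRU hit ratio curve from a trace of cache line addresses.
# hits[s] counts references that would hit in a fully associative LRU cache
# of s lines; a reference with stack distance d hits for any size > d.

def hit_ratio_curve(trace, max_size):
    stack = []                           # LRU stack, most recent first
    hits = [0] * (max_size + 1)
    for addr in trace:
        if addr in stack:
            d = stack.index(addr)        # 0-based stack distance
            stack.pop(d)
            for size in range(d + 1, max_size + 1):
                hits[size] += 1
        stack.insert(0, addr)
    n = len(trace)
    return [h / n for h in hits]         # hit ratio per cache size 0..max_size
```

    A partitioning policy can then pick, for each co-running network, the smallest cache share past the knee of its curve.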

    vTMM: Tiered Memory Management for Virtual Machines

    No full text
    The memory demand of virtual machines (VMs) is increasing, while the traditional DRAM-only memory system has limited capacity and high power consumption. A tiered memory system can effectively expand memory capacity and increase cost efficiency. Virtualization introduces new challenges for memory tiering, specifically enforcing performance isolation, minimizing context switching, and providing resource overcommit. However, none of the state-of-the-art designs consider virtualization, and thus none address these challenges; we observe that a VM with tiered memory incurs up to a 2× slowdown compared to a DRAM-only VM. This paper proposes vTMM, a tiered memory management system specifically designed for virtualization. vTMM automatically determines page hotness and migrates pages between fast and slow memory to achieve better performance. A key insight in vTMM is to leverage the unique system characteristics of virtualization to meet the above challenges. Specifically, vTMM tracks memory accesses with page-modification logging (PML) and a multi-level queue design. Next, vTMM quantifies the page "temperature" and makes a fine-grained page classification with bucket-sorting. vTMM performs page migration with PML while providing resource overcommit by transparently resizing VM memory through the two-dimensional page tables. In combination, the above techniques minimize overhead, ensure performance isolation, and provide dynamic memory partitioning to improve overall system performance.
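    The bucket-sorting classification step might look roughly like the sketch below: quantize each page's temperature score into a small number of buckets, then fill the fast tier from the hottest bucket down. This is an illustrative assumption about the mechanism; vTMM's actual bucket count, temperature encoding, and data structures are not specified here. Bucket-sorting keeps classification O(n) rather than O(n log n) for a full sort.

```python
# Sketch: classify pages by quantized "temperature" using bucket sort,
# then select the hottest pages up to the fast-tier capacity.
# num_buckets and max_temp are illustrative parameters.

def classify(page_temps, fast_tier_pages, num_buckets=16, max_temp=255):
    buckets = [[] for _ in range(num_buckets)]
    for page, temp in page_temps.items():
        b = min(temp * num_buckets // (max_temp + 1), num_buckets - 1)
        buckets[b].append(page)
    hot = []
    for bucket in reversed(buckets):     # hottest buckets first
        for page in bucket:
            if len(hot) < fast_tier_pages:
                hot.append(page)
    return set(hot)                      # pages to place in fast memory
```

    Pages outside the returned set would be candidates for migration to the slow tier.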

    Software-Based Flat Nested Page Table in Sunway Architecture

    No full text
    The nested page table (NPT) model is an effective, hardware-assisted memory virtualization solution. However, the current Sunway processor lacks hardware support for NPT. Fortunately, the privileged programmable interface of the Sunway architecture can be used to emulate the necessary hardware support in software. Hardware mode is a CPU privilege level unique to Sunway, and this interface runs in hardware mode at the highest CPU privilege level. In this paper, we propose the software-based flat nested page table (swFNPT) model for Sunway. In the programmable interface, we implement in software the hardware functions required by the nested page table model, such as nested page table walking. The new design makes up for the deficiency in hardware support through software optimization. In particular, a flat (one-level) nested page table is used to improve the efficiency of page walks. We use multiple benchmarks to test the performance of swFNPT. Experiments on a Sunway 1621 server show promising performance: the average memory virtualization overhead for SPEC CPU 2006 is about 3%, and the average overhead for SPEC CPU 2017 benchmarks with large working sets is about 4%. The STREAM results show that the memory bandwidth loss of swFNPT is less than 3%. This paper therefore provides a valuable reference for the future development of hardware-assisted virtualization on Sunway servers.
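    Why a flat nested table helps can be seen from the standard worst-case cost model for a two-dimensional page walk: with an n-level guest page table and an m-level nested table, each of the n guest table references plus the final data access needs its own m-step nested translation, for (n+1)(m+1)-1 memory accesses in total. The sketch below works this out; the formula is the well-known general model, not something specific to swFNPT.

```python
# Worst-case memory accesses for a two-dimensional page walk:
# n guest page table levels, m nested page table levels.
def walk_cost(guest_levels, nested_levels):
    return (guest_levels + 1) * (nested_levels + 1) - 1

# 4-level guest x 4-level nested: the familiar 24-access worst case.
# 4-level guest x flat (1-level) nested, as in swFNPT: 9 accesses.
```

    Flattening the nested dimension from four levels to one cuts the worst case from 24 accesses to 9, which is where the page walk efficiency gain comes from.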

    Accelerating Address Translation for Virtualization by Leveraging Hardware Mode

    No full text
    The overhead of memory virtualization remains nontrivial. Traditional shadow paging (TSP) resorts to a shadow page table (SPT) to achieve native page walk speed, but page table updates require hypervisor interventions. Alternatively, nested paging enables low-overhead page table updates but uses the hardware MMU to perform a long-latency two-dimensional page walk. This paper proposes new memory virtualization solutions based on hardware (machine) mode, the highest CPU privilege level in some architectures such as Sunway and RISC-V. A programming interface running in hardware mode enables software implementation of hardware support functions. We first propose Software-based Nested Paging (SNP), which extends the software MMU to perform a two-dimensional page walk in hardware mode. Second, we present Swift Shadow Paging (SSP), which accomplishes page table synchronization by intercepting TLB flushes in hardware mode. Finally, we propose Accelerated Shadow Paging (ASP), which combines SSP and SNP. ASP handles last-level SPT page faults by walking the two-dimensional page tables in hardware mode, which eliminates most hypervisor interventions. This paper systematically compares multiple memory virtualization models by analyzing their designs and evaluating their performance both on a real system and on a simulator. The experiments show that the virtualization overhead of ASP is less than 4.5% for all workloads.
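    The ASP fault path described above can be sketched as a simple dispatch: last-level SPT faults are resolved entirely in hardware mode by a software two-dimensional walk (the SNP machinery), while the remaining structural faults still exit to the hypervisor. This is an assumed, purely illustrative control-flow model; the level numbering, function names, and return values are not taken from the paper.

```python
# Illustrative ASP-style fault dispatch. walk_2d stands in for the SNP
# software walk run in hardware mode; exit_to_hypervisor stands in for a
# full VM Exit. Level 1 denotes the last (leaf) SPT level.

def handle_spt_fault(fault_level, walk_2d, exit_to_hypervisor):
    if fault_level == 1:                  # common case: missing leaf entry
        hpa = walk_2d()                   # resolved in hardware mode, no exit
        return ("filled-in-hw-mode", hpa)
    return exit_to_hypervisor()           # rare higher-level faults
```

    Since leaf-entry faults dominate in practice, handling only them in hardware mode already removes most hypervisor interventions.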

    Swift shadow paging (SSP): No write-protection but following TLB flushing

    No full text
    Virtualization is a key technique for supporting cloud services, and memory virtualization is a major component of virtualization technology. Common memory virtualization mechanisms include shadow paging and hardware-assisted paging. The shadow paging model needs to synchronize shadow/guest page tables whenever there is a guest page table update. In traditional shadow paging (TSP), guest page table pages are write-protected so that updates can be intercepted by the hypervisor to ensure synchronization. Frequent page table updates cause a large number of VM Exits. Researchers have developed hardware-assisted paging to eliminate this overhead; however, address translation then needs to walk a two-dimensional page table, which significantly increases the overhead of page walks. This paper proposes SSP, a Swift Shadow Paging model that leverages the privileged hardware mode. In this design, the write protection mechanism is no longer needed. Rather, SSP accomplishes lazy page table synchronization by intercepting TLB flushes, which must be initiated by the guest OS when there is a page table update. The hardware mode, such as RISC-V's machine mode and Sunway's hardware mode, with the highest privilege, opens a new door for communication between the host OS and a guest OS. In addition, by using a shadow page table base address buffer, SSP eliminates the VM Exits generated by guest process context switching. SSP inherits the advantage of TSP in that it remains a software-only solution, and it does not incur the excessive page walk overhead of hardware-assisted paging. We implement SSP on a Sunway machine. Our evaluation demonstrates SSP's advantage for multiple workloads: compared with TSP, SSP reduces VM Exits caused by memory virtualization by 23%-56%, and the virtualization overhead of SSP is less than 5.5% for all workloads.
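    The lazy-synchronization idea can be modeled in a few lines: the guest may update its page tables freely with no trap, and the shadow table is brought back in sync only when the guest issues a TLB flush, which it must do anyway for an update to take effect. The class below is a toy model under that assumption; real shadow paging synchronizes page table trees, not a flat dictionary.

```python
# Toy model of SSP's lazy synchronization: many guest page table updates,
# one re-sync per intercepted TLB flush, instead of one trap per update
# as under write-protection-based TSP.

class ShadowPageTable:
    def __init__(self, guest_pt):
        self.guest_pt = guest_pt          # guest-managed mappings (shared ref)
        self.shadow = dict(guest_pt)      # hypervisor's synchronized copy
        self.syncs = 0

    def on_tlb_flush(self):
        """Intercepted in hardware mode: re-sync shadow from guest tables."""
        self.shadow = dict(self.guest_pt)
        self.syncs += 1

    def translate(self, gva):
        return self.shadow.get(gva)       # hardware walks only the shadow copy
```

    A batch of guest updates followed by a single flush costs one synchronization, which is where the VM Exit reduction comes from.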

    Working set size estimation with hugepages in virtualization

    No full text
    With the rapid increase of the data set sizes of cloud and big data applications, conventional 4KB pages can cause high pressure on hardware address translation. The pressure becomes more prominent in a virtualized system, which adds an additional layer of address translation. Virtual-to-physical address translation relies on a hardware Translation Lookaside Buffer (TLB) to cache address mappings. However, even modern hardware offers a very limited number of TLB entries, and TLB misses can cause significant performance degradation. Using 2MB or 1GB hugepages can improve TLB coverage and reduce the TLB miss penalty. Therefore, recent operating systems, such as Linux, have started to adopt hugepages. However, using hugepages brings new challenges, among which is working set size prediction. In a virtualized system, working set size (WSS) estimation, which predicts the actual memory demand of a virtual machine, is often applied to guide virtual machine memory management and memory allocation. We find that traditional WSS estimation methods for regular pages cannot be simply ported to a system adopting hugepages. We estimate the working set size of a virtual machine by constructing a miss ratio curve (MRC), which relates the page miss ratio to the virtual machine's memory allocation. Using hugepages increases the overhead of tracking page accesses for MRC construction and also demands much higher precision in representing the miss ratios, as a hugepage miss incurs a much higher penalty than a regular page miss. In this paper, we propose an accurate WSS estimation method for a virtual execution environment with hugepages. We design and implement a low-overhead dynamic memory tracking mechanism that utilizes a hot set to filter frequent short-reuse accesses. Our approach outputs the hugepage miss ratio at high precision. The experimental results show that our method can predict WSS accurately with an average overhead of 1.5%.
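    The hot-set filtering idea can be sketched as a small LRU set in front of the expensive tracking path: a page re-accessed while still in the hot set is a frequent short reuse and is filtered out, while everything else is forwarded to MRC construction. The set size and data structures below are illustrative assumptions, not the paper's parameters.

```python
# Sketch: filter frequent short-reuse accesses with a small LRU "hot set"
# so that only the remaining accesses pay the cost of MRC tracking.

from collections import OrderedDict

def filter_trace(accesses, hot_set_size=2):
    hot = OrderedDict()                  # small LRU of recently seen pages
    tracked = []                         # accesses forwarded to MRC tracking
    for page in accesses:
        if page in hot:
            hot.move_to_end(page)        # short reuse: filtered, refresh LRU
            continue
        tracked.append(page)
        hot[page] = True
        if len(hot) > hot_set_size:
            hot.popitem(last=False)      # evict least-recently-seen entry
    return tracked
```

    Since short reuses dominate most traces but contribute little to the shape of the miss ratio curve, this keeps tracking overhead low without losing much precision.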

    HUB: Hugepage ballooning in kernel-based virtual machines

    No full text
    Modern applications running on cloud data centers often consume a large amount of memory, and their memory demands can vary during execution. Dynamic memory allocation is a necessity for high memory utilization. For a large-dataset application, using hugepages instead of regular 4KB pages can efficiently reduce memory access and memory management overhead and improve overall performance. Virtualization, which is widely applied in data centers for server consolidation, brings new challenges to managing memory dynamically and effectively, especially for hugepages. In a virtualized system, ballooning is a popular mechanism used to dynamically adjust memory allocations for co-located virtual machines. We observe that the current Linux Kernel-Based Virtual Machine (KVM) does not support hugepage ballooning: an application that can benefit from hugepages often loses its performance advantage when the guest OS experiences memory ballooning. This paper presents the design and implementation of HUB, a HUgepage Ballooning mechanism in KVM that can dispatch memory at the granularity of hugepages. The experimental results show that our approach significantly reduces TLB misses and improves overall performance for applications with large memory demands.
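    Ballooning at hugepage granularity means inflate and deflate requests operate in whole 2 MiB units, so reclaimed guest memory stays hugepage-aligned and existing hugepage mappings need not be split. The sketch below is an assumed model of that policy; the rounding rule and names are illustrative, not HUB's actual interface.

```python
# Sketch: a balloon that inflates only in whole 2 MiB hugepages.
# Requests are rounded down so the balloon never reclaims a partial
# hugepage (which would force splitting a guest hugepage mapping).

HUGEPAGE = 2 * 1024 * 1024  # 2 MiB

def balloon_target(request_bytes):
    """Round a reclaim request down to a whole number of hugepages."""
    return (request_bytes // HUGEPAGE) * HUGEPAGE

def inflate(balloon_pages, request_bytes):
    n = balloon_target(request_bytes) // HUGEPAGE
    balloon_pages.extend("hugepage" for _ in range(n))  # pin n guest hugepages
    return n * HUGEPAGE                                 # bytes returned to host
```

    Rounding down rather than up is a deliberate (assumed) choice here: under-reclaiming slightly is cheaper than breaking a hugepage into 4KB pages.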

    Huge Page Friendly Virtualized Memory Management

    No full text
    With the rapid increase of memory consumption by applications running on cloud data centers, we need more efficient memory management in a virtualized environment. Exploiting huge pages becomes more critical for a virtual machine's performance when it runs programs with large working set sizes. Such programs are more sensitive to memory allocation, which requires us to quickly adjust the virtual machine's memory to accommodate memory phase changes. It would be much more efficient if we could adjust virtual machines' memory at the granularity of huge pages. However, existing virtual machine memory reallocation techniques, such as ballooning, do not support huge pages. In addition, in order to drive effective memory reallocation, we need to predict the actual memory demand of a virtual machine. We find that traditional memory demand estimation methods designed for regular pages cannot be simply ported to a system adopting huge pages. Another challenge is how to adjust virtual machine memory in a timely and effective manner as memory demand changes periodically. This paper proposes a dynamic huge page based memory balancing system (HPMBS) for efficient memory management in a virtualized environment. We first rebuild the ballooning mechanism in order to dispatch memory at the granularity of huge pages. We then design and implement a huge page working set size estimation mechanism that can accurately estimate a virtual machine's memory demand in huge page environments. Combining these two mechanisms, we finally use an algorithm based on dynamic programming to achieve dynamic memory balancing. Experiments show that our system saves memory and improves overall system performance with low overhead.
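    The dynamic-programming balancing step can be sketched as a classic resource-allocation DP: given each VM's predicted miss count as a function of allocated huge pages (from the WSS estimator), split a fixed huge page budget across VMs to minimize total predicted misses. This is a generic formulation consistent with the description above; the cost tables, objective, and names are illustrative assumptions, not HPMBS's exact algorithm.

```python
# Sketch: allocate a huge page budget across VMs by dynamic programming.
# miss_tables[v][p] = predicted misses of VM v when given p huge pages.

def balance(miss_tables, total_pages):
    n = len(miss_tables)
    INF = float("inf")
    best = [[INF] * (total_pages + 1) for _ in range(n + 1)]
    choice = [[0] * (total_pages + 1) for _ in range(n + 1)]
    best[0] = [0] * (total_pages + 1)     # zero VMs: zero misses
    for v in range(1, n + 1):
        for budget in range(total_pages + 1):
            for p in range(min(budget, len(miss_tables[v - 1]) - 1) + 1):
                cost = best[v - 1][budget - p] + miss_tables[v - 1][p]
                if cost < best[v][budget]:
                    best[v][budget] = cost
                    choice[v][budget] = p
    alloc, budget = [], total_pages       # recover per-VM allocations
    for v in range(n, 0, -1):
        alloc.append(choice[v][budget])
        budget -= choice[v][budget]
    return list(reversed(alloc)), best[n][total_pages]
```

    The huge page ballooning mechanism would then move memory between VMs to realize the computed allocation.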

    Current Status of Herbal Medicines in Chronic Liver Disease Therapy: The Biological Effects, Molecular Targets and Future Prospects

    No full text
    Chronic liver dysfunction or injury is a serious health problem worldwide. Chronic liver disease involves a wide range of liver pathologies that include fatty liver, hepatitis, fibrosis, cirrhosis, and hepatocellular carcinoma. The efficiency of current synthetic agents in treating chronic liver disease is not satisfactory, and they have undesirable side effects. Consequently, numerous medicinal herbs and phytochemicals have been investigated as complementary and alternative treatments for chronic liver diseases. Since some herbal products have already been used for the management of liver diseases in some countries or regions, a systematic review of these herbal medicines for chronic liver disease is urgently needed. Herein, we conducted a review describing the potential roles, pharmacological studies, and molecular mechanisms of several commonly used medicinal herbs and phytochemicals in the treatment of chronic liver diseases. Their potential toxicity and side effects are also discussed. Several herbal formulae and their biological effects in chronic liver disease treatment, as well as the underlying molecular mechanisms, are also summarized in this paper. This review article is a comprehensive and systematic analysis of our current knowledge of conventional medicinal herbs and phytochemicals in treating chronic liver diseases and of the potential pitfalls that need to be addressed in future studies.