17 research outputs found
Heap Reference Analysis Using Access Graphs
Despite significant progress in the theory and practice of program analysis,
analysing properties of heap data has not reached the same level of maturity as
the analysis of static and stack data. The spatial and temporal structure of
stack and static data is well understood while that of heap data seems
arbitrary and is unbounded. We devise bounded representations which summarize
properties of the heap data. This summarization is based on the structure of
the program which manipulates the heap. The resulting summary representations
are certain kinds of graphs called access graphs. The boundedness of these
representations and the monotonicity of the operations to manipulate them make
it possible to compute them through data flow analysis.
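To illustrate why boundedness matters, the core idea can be sketched in a few lines of Python (a hypothetical encoding, not the paper's formal definition): nodes are field names tagged with program points, so the node set is bounded by program size, and the confluence operation is a monotone union.

```python
# Sketch of an access graph (hypothetical representation): nodes are
# (field, program_point) pairs, so there are only finitely many of them,
# and merging two graphs is a monotone union, which is what makes a
# data-flow fixed-point computation possible.

class AccessGraph:
    def __init__(self, nodes=frozenset(), edges=frozenset()):
        self.nodes = frozenset(nodes)   # (field, program_point) pairs
        self.edges = frozenset(edges)   # pairs of such nodes

    def union(self, other):
        """Monotone confluence operation: never loses information."""
        return AccessGraph(self.nodes | other.nodes,
                           self.edges | other.edges)

    def __le__(self, other):            # partial order for fixed-point tests
        return self.nodes <= other.nodes and self.edges <= other.edges

# A loop like `while ...: x = x.next` at (assumed) program point 7
# summarises the unbounded access path x.next.next... with one self-edge:
n = ("next", 7)
g = AccessGraph({n}, {(n, n)})
assert g.union(g) <= g and g <= g.union(g)   # union is idempotent
```

The self-edge is the key trick: an unbounded family of access paths collapses into a cycle in a finite graph, at the cost of some precision.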
An important application which benefits from heap reference analysis is
garbage collection, where currently liveness is conservatively approximated by
reachability from program variables. As a consequence, current garbage
collectors leave a lot of garbage uncollected, a fact which has been confirmed
by several empirical studies. We propose the first ever end-to-end static
analysis to distinguish live objects from reachable objects. We use this
information to make dead objects unreachable by modifying the program. This
application is interesting because it requires discovering data flow
information representing complex semantics. In particular, we discover four
properties of heap data: liveness, aliasing, availability, and anticipability.
Together, they cover all combinations of directions of analysis (i.e. forward
and backward) and confluence of information (i.e. union and intersection). Our
analysis can also be used to plug memory leaks in C/C++ programs.
Comment: Accepted for publication in ACM TOPLAS. This version incorporates the referees' comments.
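The transformation this analysis enables, severing references to objects that are reachable but dead, can be illustrated with a small Python sketch (the paper targets Java; the `Node` class and the `weakref` probe here are our own illustration):

```python
import gc
import weakref

class Node:
    def __init__(self, payload=None):
        self.payload = payload

# `n.payload` stays reachable for as long as `n` lives, so a
# reachability-based collector cannot reclaim it. If a liveness analysis
# proves the payload is never used again, the program can be rewritten
# to sever the reference and make the dead object collectable:
big = Node("large structure")
n = Node(big)
probe = weakref.ref(big)    # observe whether `big` gets reclaimed

n.payload = None            # the inserted "make dead objects unreachable" store
del big
gc.collect()
assert probe() is None      # the dead-but-previously-reachable object is gone
```

Without the `n.payload = None` store, the probe would still find the object: reachability over-approximates liveness exactly as the abstract describes.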
TLB pre-loading for Java applications
The increasing memory requirements of today's applications place growing stress on the memory system. This pressure extends to the available caches, and specifically to the TLB. TLB misses are responsible for a considerable share of total memory latency: on average, 10% of execution time is wasted on miss penalties. Java applications are in no better position; their attractive features increase the memory footprint, and the TLB miss rate of Java applications tends to be a multiple of that of non-Java applications. This high miss rate costs the application valuable execution time. Our experiments show that, on average, the miss penalty can constitute about 24% of execution time. Several hardware modifications have been suggested to reduce TLB misses for general applications; however, to the best of our knowledge, there have been no similar efforts for Java applications. Here we propose a software-based prediction model that relies on information available to the virtual machine. The model uses the write-barrier operation to predict TLB misses with an average accuracy of 41%.
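A rough sketch of the idea, in Python rather than inside a virtual machine, and with invented parameters (a small LRU set stands in for the hardware TLB; the page size and entry count are assumptions, not the paper's model):

```python
from collections import OrderedDict

PAGE = 4096    # assumed page size in bytes

class TLBPredictor:
    """Toy model: a small LRU set of recently touched pages stands in
    for the hardware TLB. The write barrier feeds it the address of
    every store, and a miss prediction is the cue to pre-load."""
    def __init__(self, entries=64):
        self.entries = entries
        self.recent = OrderedDict()     # page -> None, kept in LRU order

    def write_barrier(self, address):
        page = address // PAGE
        hit = page in self.recent
        if hit:
            self.recent.move_to_end(page)
        else:
            self.recent[page] = None
            if len(self.recent) > self.entries:
                self.recent.popitem(last=False)   # evict least-recent page
        return not hit    # True = predicted TLB miss, pre-load here

pred = TLBPredictor(entries=2)
assert pred.write_barrier(0)          # cold page: predicted miss
assert not pred.write_barrier(100)    # same page again: predicted hit
```

The point of riding on the write barrier is that the virtual machine already executes it on every reference store, so the predictor adds bookkeeping to an existing hook rather than a new instrumentation pass.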
Pointer Analysis in the Presence of Dynamic Class Loading; CU-CS-966-03
Topology-Aware Parallelism for NUMA Copying Collectors
NUMA-aware parallel algorithms in runtime systems attempt to improve locality by allocating memory from local NUMA nodes. Researchers have suggested that the garbage collector should profile memory access patterns or use object-locality heuristics to determine the target NUMA node before moving an object. However, these solutions are costly when applied to every live object in the reference graph. Our earlier research suggests that connected objects, represented by rooted sub-graphs, provide abundant locality and are well suited to NUMA architectures. In this paper, we utilize the intrinsic locality of rooted sub-graphs to improve parallel copying-collector performance. Our new topology-aware parallel copying collector preserves rooted sub-graph integrity by moving the connected objects as a unit to the target NUMA node. In addition, it distributes and assigns the copying tasks to appropriate (i.e. NUMA-node-local) GC threads. For load balancing, our solution enforces locality on the work-stealing mechanism by stealing from local NUMA nodes only. We evaluated our approach on SPECjbb2013, DaCapo 9.12 and Neo4j. Results show an improvement in GC performance of up to 2.5x speedup and 37% better application performance.
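The central mechanism, copying a rooted sub-graph as one unit into node-local memory, can be sketched as follows (a toy model in which plain dictionaries stand in for per-node heaps; all names are our own):

```python
from collections import deque

def copy_rooted_subgraph(root, edges, target_node, space):
    """Move a root and everything reachable from it, as one unit, into
    the space of `target_node` (a dict standing in for a NUMA node's
    local heap). Keeping the sub-graph together preserves its locality,
    so later traversals stay on one node's memory."""
    worklist, seen = deque([root]), {root}
    while worklist:
        obj = worklist.popleft()
        space[target_node].append(obj)       # "copy" to the node-local heap
        for child in edges.get(obj, ()):
            if child not in seen:
                seen.add(child)
                worklist.append(child)

space = {0: [], 1: []}
edges = {"r1": ["a", "b"], "b": ["c"], "r2": ["x"]}
copy_rooted_subgraph("r1", edges, 0, space)   # r1's whole sub-graph -> node 0
copy_rooted_subgraph("r2", edges, 1, space)   # r2's whole sub-graph -> node 1
assert set(space[0]) == {"r1", "a", "b", "c"}
assert set(space[1]) == {"r2", "x"}
```

In the real collector each sub-graph would be handed to a GC thread pinned to the target node, and that thread's work-stealing deque would only be stolen from by threads on the same node, as the abstract describes.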
Intelligent Memory Management Heuristics
Automatic memory management is crucial in the implementation of runtime systems, even though it induces a significant computational overhead. In this thesis I explore the use of statistical properties of the directed graph describing the set of live data to decide between garbage collection and heap expansion, within a memory management algorithm that combines dynamic-array-represented heaps with a mark-and-sweep garbage collector, in order to enhance its performance. The sampling method that predicts the density and distribution of useful data is implemented as a partial marking algorithm. The algorithm randomly marks the nodes of the directed graph representing the live data at different depths, with a variable probability factor p. Using the information gathered by the partial marking algorithm in the current step, together with the knowledge gathered in previous iterations, the proposed empirical formula predicts with reasonable accuracy the density of live nodes on the heap, so as to decide between garbage collection and heap expansion. The resulting heuristics are tested empirically and shown to improve overall execution performance significantly in the context of the Jinni Prolog compiler's runtime system.
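A minimal sketch of the partial marking step, assuming a traversal that descends into each child only with probability p (the thesis's empirical prediction formula, which also folds in earlier iterations, is not reproduced here):

```python
import random

def partial_mark(roots, edges, p, seed=0):
    """Partial marking sketch: traverse from the roots, but follow each
    edge only with probability p, so only a sample of the live graph is
    visited. The caller can scale the returned mark count to estimate
    live-node density (the scaling formula itself is not shown here)."""
    rng = random.Random(seed)
    marked = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in marked:
            continue
        marked.add(node)
        for child in edges.get(node, ()):
            if rng.random() < p:
                stack.append(child)
    return len(marked)

edges = {i: [i + 1] for i in range(999)}      # a 1000-node live chain
full = partial_mark([0], edges, p=1.0)        # p=1 degenerates to full marking
partial = partial_mark([0], edges, p=0.5)
assert full == 1000
assert 0 < partial < full     # partial marking touches a strict subset
```

With p = 1 this is ordinary full marking; lowering p trades estimation accuracy for a proportionally cheaper traversal, which is the trade-off the heuristic exploits.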
Garbage Collection (Muistin siivous)
This thesis introduces the concept of garbage in computer science, the central concepts and basic methods of garbage collection together with their variants, and modern high-performance algorithms. The focus, however, is on the achievements, research topics and research tools of memory-management research in the 2000s, which the new CBRC garbage-collection algorithm presented in the thesis builds upon. In addition, the thesis surveys the programmer's responsibilities under automatic memory management, as well as the garbage-collection-aware facilities available in certain programming languages and environments (Java, .Net, C++).
Keywords: garbage collection, memory cleanup, memory management, algorithms, programming languages
CR categories: D.3.4, D.4.2, D.3
Optimizing Memory Management in Virtual Machines (가상머신의 메모리 관리 최적화)
Doctoral dissertation, Department of Electrical and Computer Engineering, Seoul National University Graduate School, February 2014. Advisor: Soo-Mook Moon.
Memory management is one of the key components of a virtual machine and affects the overall performance of the virtual machine itself. Modern programming languages for virtual machines, such as Java, use dynamic memory allocation, and objects are allocated on the heap at a high rate. These objects are reclaimed later, when they are no longer used, to secure free room in the heap for future allocations. Many virtual machines adopt garbage collection to reclaim dead objects in the heap; alternatively, the heap can be expanded to make room for more objects. The overall performance of memory management is therefore determined by the object allocation technique, the garbage collector, and the heap management technique. In this dissertation, three optimization techniques are proposed to improve the overall performance of memory management in virtual machines. First, a lazy-worst-fit object allocator is suggested to allocate small objects with little overhead in a virtual machine that has a garbage collector. Second, a biased allocator is proposed to improve the performance of the garbage collector itself by reducing its extra overhead. Finally, an ahead-of-time heap expansion technique is suggested to improve user responsiveness, as well as the overall performance of memory management, by suppressing invocations of the garbage collector. The proposed optimizations are evaluated on various devices, including desktop, embedded and mobile, with different virtual machines, including a Java virtual machine for the Java runtime and the Dalvik virtual machine for the Android platform. The lazy-worst-fit allocator outperforms the other allocators, including first-fit, and exhibits fragmentation as low as that of the first-fit allocator, which is known to have the lowest fragmentation. The biased allocator reduces the pause time caused by garbage collections by 4.1% on average.
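The lazy-worst-fit idea described above, bump allocation from the current free block with the worst-fit search deferred until that block is exhausted, can be sketched as follows (a toy model with abstract size units; the interface is our own, not the dissertation's):

```python
class LazyWorstFit:
    """Sketch of lazy worst-fit allocation: bump-allocate from the
    current free block, and only search the free list (for the largest,
    i.e. worst, fit) when the current block cannot serve the request."""
    def __init__(self, free_blocks):
        self.free = list(free_blocks)    # remaining free-block sizes
        self.current = 0                 # space left in the active block

    def alloc(self, size):
        if size > self.current:          # lazy: search only on overflow
            if not self.free:
                return False             # would trigger GC or heap growth
            largest = max(range(len(self.free)), key=self.free.__getitem__)
            self.current = self.free.pop(largest)   # worst fit
            if size > self.current:
                return False
        self.current -= size             # pointer-bump fast path
        return True

h = LazyWorstFit([16, 64, 32])
assert h.alloc(40)        # searches once, picks the 64-unit block
assert h.alloc(20)        # served from the same block without searching
assert not h.alloc(100)   # no block is large enough
```

The fast path is a single compare and subtract, which is why this style suits small, frequent object allocations; choosing the largest block keeps the bump region productive for as long as possible before the next search.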
Ahead-of-time heap expansion reduces both the number of garbage collections and their total pause time; GC pause time is reduced by up to 31% in the default applications of the Android platform.
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 The need for optimizing memory management
1.2 Outline of the Dissertation
Chapter 2 Background
2.1 Virtual Machine
2.2 Memory management in virtual machines
Chapter 3 Lazy Worst Fit Allocator
3.1 Introduction
3.2 Allocation with fits
3.3 Lazy fits
3.3.1 Lazy worst fit
3.4 Experimental results
3.4.1 LWF implementation in the LaTTe Java virtual machine
3.4.2 Experimental environment
3.4.3 Performance of LWF
3.4.4 Fragmentation of LWF
3.5 Summary
Chapter 4 Biased Allocator
4.1 Introduction
4.2 Motivation
4.3 Biased allocator
4.3.1 When to choose an allocator
4.3.2 How to choose an allocator
4.4 Analyses and implementation
4.5 Evaluation
4.5.1 Total pause time of garbage collections
4.5.2 Effect of each analysis
4.5.3 Pause time of each garbage collection
4.6 Summary
Chapter 5 Ahead-of-time Heap Management
5.1 Introduction
5.2 Motivation
5.3 Android
5.3.1 Garbage Collection
5.3.2 Heap expansion heuristic
5.4 Ahead-of-time heap expansion
5.4.1 Spatial heap expansion
5.4.2 Temporal heap expansion
5.4.3 Launch-time heap expansion
5.5 Evaluation
5.5.1 Spatial heap expansion
5.5.2 Comparison of spatial heap expansion
5.5.3 Temporal heap expansion
5.5.4 Launch-time heap expansion
5.6 Summary
Chapter 6 Conclusion
Bibliography
Abstract (in Korean)
Acknowledgements