14 research outputs found

    The economics of garbage collection

    Get PDF
    This paper argues that economic theory can improve our understanding of memory management. We introduce the allocation curve, as an analogue of the demand curve from microeconomics. An allocation curve for a program characterises how the amount of garbage collection activity required during its execution varies in relation to the heap size associated with that program. The standard treatment of microeconomic demand curves (shifts and elasticity) can be applied directly and intuitively to our new allocation curves. As an application of this new theory, we show how allocation elasticity can be used to control the heap growth rate for variable sized heaps in Jikes RVM

    Subheap-Augmented Garbage Collection

    Get PDF
    Automated memory management avoids the tedium and danger of manual techniques. However, as no programmer input is required, no widely available interface exists to permit principled control over sometimes unacceptable performance costs. This dissertation explores the idea that performance-oriented languages should give programmers greater control over where and when the garbage collector (GC) expends effort. We describe an interface and implementation to expose heap partitioning and collection decisions without compromising type safety. We show that our interface allows the programmer to encode a form of reference counting using Hayes\u27 notion of key objects. Preliminary experimental data suggests that our proposed mechanism can avoid high overheads suffered by tracing collectors in some scenarios, especially with tight heaps. However, for other applications, the costs of applying subheaps---in human effort and runtime overheads---remain daunting

    Crystal gazer : profile-driven write-rationing garbage collection for hybrid memories

    Get PDF
    Non-volatile memories (NVM) offer greater capacity than DRAM but suffer from high latency and low write endurance. Hybrid memories combine DRAM and NVM to form scalable memory systems with the promise of high capacity, low energy consumption, and high endurance. Automatically managing hybrid NVM-DRAM memories to achieve their promise without changing user applications or their programming models remains an open question. This paper uses garbage collection in managed languages to exploit NVM capacity while preventing NVM wear out in hybrid memories with no changes to the programming model. We introduce profile-driven write-rationing garbage collection. Allocation sites that produce frequently written objects are predicted based on previous program executions. Objects are initially allocated in a DRAM nursery space. The collector copies surviving nursery objects from highly written sites to a mature DRAM space and read-mostly objects to a mature NVM space.Write-intensity prediction for 15 Java benchmarks accurately places objects in the correct space, eliminating expensive object monitoring from prior write-rationing garbage collectors. Furthermore, our technique exposes a Pareto tradeoff between DRAM usage and NVM lifetime, unlike prior work. Experimental results on NUMA hardware that emulates hybrid NVM-DRAM memory demonstrates that profile-driven write-rationing garbage collection reduces the number of writes to NVM compared to prior work to extend its lifetime, maximizes the use of NVM for its capacity, and achieves good performance

    Ramasse-miettes générationnel et incémental gérant les cycles et les gros objets en utilisant des frames délimités

    Get PDF
    Ces derniĂšres annĂ©es, des recherches ont Ă©tĂ© menĂ©es sur plusieurs techniques reliĂ©es Ă  la collection des dĂ©chets. Plusieurs dĂ©couvertes centrales pour le ramassage de miettes par copie ont Ă©tĂ© rĂ©alisĂ©es. Cependant, des amĂ©liorations sont encore possibles. Dans ce mĂ©moire, nous introduisons des nouvelles techniques et de nouveaux algorithmes pour amĂ©liorer le ramassage de miettes. En particulier, nous introduisons une technique utilisant des cadres dĂ©limitĂ©s pour marquer et retracer les pointeurs racines. Cette technique permet un calcul efficace de l'ensemble des racines. Elle rĂ©utilise des concepts de deux techniques existantes, card marking et remembered sets, et utilise une configuration bidirectionelle des objets pour amĂ©liorer ces concepts en stabilisant le surplus de mĂ©moire utilisĂ©e et en rĂ©duisant la charge de travail lors du parcours des pointeurs. Nous prĂ©sentons aussi un algorithme pour marquer rĂ©cursivement les objets rejoignables sans utiliser de pile (Ă©liminant le gaspillage de mĂ©moire habituel). Nous adaptons cet algorithme pour implĂ©menter un ramasse-miettes copiant en profondeur et amĂ©liorer la localitĂ© du heap. Nous amĂ©liorons l'algorithme de collection des miettes older-first et sa version gĂ©nĂ©rationnelle en ajoutant une phase de marquage garantissant la collection de toutes les miettes, incluant les structures cycliques rĂ©parties sur plusieurs fenĂȘtres. Finalement, nous introduisons une technique pour gĂ©rer les gros objets. Pour tester nos idĂ©es, nous avons conçu et implĂ©mentĂ©, dans la machine virtuelle libre Java SableVM, un cadre de dĂ©veloppement portable et extensible pour la collection des miettes. Dans ce cadre, nous avons implĂ©mentĂ© des algorithmes de collection semi-space, older-first et generational. Nos expĂ©rimentations montrent que la technique du cadre dĂ©limitĂ© procure des performances compĂ©titives pour plusieurs benchmarks. Elles montrent aussi que, pour la plupart des benchmarks, notre algorithme de parcours en profondeur amĂ©liore la localitĂ© et augmente ainsi la performance. Nos mesures de la performance gĂ©nĂ©rale montrent que, utilisant nos techniques, un ramasse-miettes peut dĂ©livrer une performance compĂ©titive et surpasser celle des ramasses-miettes existants pour plusieurs benchmarks. ______________________________________________________________________________ MOTS-CLÉS DE L’AUTEUR : Ramasse-Miettes, Machine Virtuelle, Java, SableVM

    A Measurement-Based Study of Memory Usage and Garbage Collection in a Lisp System

    Get PDF
    Coordinated Science Laboratory was formerly known as Control Systems LaboratoryNASA / NAG 1-613National Science Foundation / MIP-860489

    Cache performance of chronological garbage collection

    Get PDF
    This thesis presents cache performance analysis of the Chronological Garbage Collection Algorithm used in LVM system. LVM is a new Logic Virtual Machine for Prolog. It adopts one stack policy for all dynamic memory requirements and cooperates with an efficient garbage collection algorithm, the Chronological Garbage Collection, to recycle space, not as a deliberate garbage collection operation, but as a natural activity of the LVM engine to gather useful objects. This algorithm combines the advantages of the traditional copying, mark-compact, generational, and incremental garbage collection schemes. In order to determine the improvement of cache performance under our garbage- collection algorithm, we developed a simulator to do trace-driven cache simulation. Direct-mapped cache and set-associative cache with different cache sizes, write policies, block sizes and set associativities are simulated and measured. A comparison of LVM and SICStus 3.1 for the same benchmarks was performed. From the simulation results, we found important factors influencing the performance of the CGC algorithm. Meanwhile, the results from the cache simulator fully support the experimental results gathered from the LVM system: the cost of CGC Is almost paid by the improved cache performance. Further, we found that the memory reference patterns of our benchmarks share the same properties: most writes are for allocation and most reads are to recently written objects. In addition, the results also showed that the write-miss policy can have a dramatic effect on the cache performance of the benchmarks and a write-validate policy gives the best performance. The comparison shows that when the input size of benchmarks is small, SICStus is about 3-8 times faster than LVM. This is an acceptable range of performance ratio for comparing a binary-code engine against a byte-code emulator. When we increase the input sizes, some benchmarks maintain this performance ratio, whereas others greatly narrow the performance gap and at certain breakthrough points perform better than their counterparts under SICStus

    A study of thread-local garbage collection for multi-core systems

    Get PDF
    With multi-processor systems in widespread use, and programmers increasingly writing programs that exploit multiple processors, scalability of application performance is more of an issue. Increasing the number of processors available to an application by a factor does not necessarily boost that application's performance by that factor. More processors can actually harm performance. One cause of poor scalability is memory bandwidth becoming saturated as processors contend with each other for memory bus use. More multi-core systems have a non-uniform memory architecture and placement of threads and the data they use is important in tackling this problem. Garbage collection is a memory load and store intensive activity, and whilst well known techniques such as concurrent and parallel garbage collection aim to increase performance with multi-core systems, they do not address the memory bottleneck problem. One garbage collection technique that can address this problem is thread-local heap garbage collection. Smaller, more frequent, garbage collection cycles are performed so that intensive memory activity is distributed. This thesis evaluates a novel thread-local heap garbage collector for Java, that is designed to improve the effectiveness of this thread-independent garbage collection

    Garbage collection optimization for non uniform memory access architectures

    Get PDF
    Cache-coherent non uniform memory access (ccNUMA) architecture is a standard design pattern for contemporary multicore processors, and future generations of architectures are likely to be NUMA. NUMA architectures create new challenges for managed runtime systems. Memory-intensive applications use the system’s distributed memory banks to allocate data, and the automatic memory manager collects garbage left in these memory banks. The garbage collector may need to access remote memory banks, which entails access latency overhead and potential bandwidth saturation for the interconnection between memory banks. This dissertation makes ïŹve signiïŹcant contributions to garbage collection on NUMA systems, with a case study implementation using the Hotspot Java Virtual Machine. It empirically studies data locality for a Stop-The-World garbage collector when tracing connected objects in NUMA heaps. First, it identiïŹes a locality richness which exists naturally in connected objects that contain a root object and its reachable set— ‘rooted sub-graphs’. Second, this dissertation leverages the locality characteristic of rooted sub-graphs to develop a new NUMA-aware garbage collection mechanism. A garbage collector thread processes a local root and its reachable set, which is likely to have a large number of objects in the same NUMA node. Third, a garbage collector thread steals references from sibling threads that run on the same NUMA node to improve data locality. This research evaluates the new NUMA-aware garbage collector using seven benchmarks of an established real-world DaCapo benchmark suite. In addition, evaluation involves a widely used SPECjbb benchmark and Neo4J graph database Java benchmark, as well as an artiïŹcial benchmark. The results of the NUMA-aware garbage collector on a multi-hop NUMA architecture show an average of 15% performance improvement. Furthermore, this performance gain is shown to be as a result of an improved NUMA memory access in a ccNUMA system. Fourth, the existing Hotspot JVM adaptive policy for conïŹguring the number of garbage collection threads is shown to be suboptimal for current NUMA machines. The policy uses outdated assumptions and it generates a constant thread count. In fact, the Hotspot JVM still uses this policy in the production version. This research shows that the optimal number of garbage collection threads is application-speciïŹc and conïŹguring the optimal number of garbage collection threads yields better collection throughput than the default policy. Fifth, this dissertation designs and implements a runtime technique, which involves heuristics from dynamic collection behavior to calculate an optimal number of garbage collector threads for each collection cycle. The results show an average of 21% improvements to the garbage collection performance for DaCapo benchmarks

    Hybrid eager and lazy evaluation for efficient compilation of Haskell

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002.Includes bibliographical references (p. 208-220).This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.The advantage of a non-strict, purely functional language such as Haskell lies in its clean equational semantics. However, lazy implementations of Haskell fall short: they cannot express tail recursion gracefully without annotation. We describe resource-bounded hybrid evaluation, a mixture of strict and lazy evaluation, and its realization in Eager Haskell. From the programmer's perspective, Eager Haskell is simply another implementation of Haskell with the same clean equational semantics. Iteration can be expressed using tail recursion, without the need to resort to program annotations. Under hybrid evaluation, computations are ordinarily executed in program order just as in a strict functional language. When particular stack, heap, or time bounds are exceeded, suspensions are generated for all outstanding computations. These suspensions are re-started in a demand-driven fashion from the root. The Eager Haskell compiler translates Ac, the compiler's intermediate representation, to efficient C code. We use an equational semantics for Ac to develop simple correctness proofs for program transformations, and connect actions in the run-time system to steps in the hybrid evaluation strategy.(cont.) The focus of compilation is efficiency in the common case of straight-line execution; the handling of non-strictness and suspension are left to the run-time system. Several additional contributions have resulted from the implementation of hybrid evaluation. Eager Haskell is the first eager compiler to use a call stack. Our generational garbage collector uses this stack as an additional predictor of object lifetime. Objects above a stack watermark are assumed to be likely to die; we avoid promoting them. Those below are likely to remain untouched and therefore are good candidates for promotion. To avoid eagerly evaluating error checks, they are compiled into special bottom thunks, which are treated specially by the run-time system. The compiler identifies error handling code using a mixture of strictness and type information. This information is also used to avoid inlining error handlers, and to enable aggressive program transformation in the presence of error handling.by Jan-Willem Maessen.Ph.D
    corecore