Search CORE

14 research outputs found

The economics of garbage collection

Author: Brown G.
Jones R.
Lujan M.
Singer J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010
Field of study

This paper argues that economic theory can improve our understanding of memory management. We introduce the allocation curve, as an analogue of the demand curve from microeconomics. An allocation curve for a program characterises how the amount of garbage collection activity required during its execution varies in relation to the heap size associated with that program. The standard treatment of microeconomic demand curves (shifts and elasticity) can be applied directly and intuitively to our new allocation curves. As an application of this new theory, we show how allocation elasticity can be used to control the heap growth rate for variable sized heaps in Jikes RVM

Crossref

Kent Academic Repository

Enlighten

Subheap-Augmented Garbage Collection

Author: Karel Benjamin
Publication venue: ScholarlyCommons
Publication date: 01/01/2019
Field of study

Automated memory management avoids the tedium and danger of manual techniques. However, as no programmer input is required, no widely available interface exists to permit principled control over sometimes unacceptable performance costs. This dissertation explores the idea that performance-oriented languages should give programmers greater control over where and when the garbage collector (GC) expends effort. We describe an interface and implementation to expose heap partitioning and collection decisions without compromising type safety. We show that our interface allows the programmer to encode a form of reference counting using Hayes\u27 notion of key objects. Preliminary experimental data suggests that our proposed mechanism can avoid high overheads suffered by tracing collectors in some scenarios, especially with tight heaps. However, for other applications, the costs of applying subheaps---in human effort and runtime overheads---remain daunting

ScholarlyCommons@Penn

Crystal gazer : profile-driven write-rationing garbage collection for hybrid memories

Author: Akram Shoaib
Eeckhout Lieven
McKinley Kathryn
Sartor Jennifer
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

Non-volatile memories (NVM) offer greater capacity than DRAM but suffer from high latency and low write endurance. Hybrid memories combine DRAM and NVM to form scalable memory systems with the promise of high capacity, low energy consumption, and high endurance. Automatically managing hybrid NVM-DRAM memories to achieve their promise without changing user applications or their programming models remains an open question. This paper uses garbage collection in managed languages to exploit NVM capacity while preventing NVM wear out in hybrid memories with no changes to the programming model. We introduce profile-driven write-rationing garbage collection. Allocation sites that produce frequently written objects are predicted based on previous program executions. Objects are initially allocated in a DRAM nursery space. The collector copies surviving nursery objects from highly written sites to a mature DRAM space and read-mostly objects to a mature NVM space.Write-intensity prediction for 15 Java benchmarks accurately places objects in the correct space, eliminating expensive object monitoring from prior write-rationing garbage collectors. Furthermore, our technique exposes a Pareto tradeoff between DRAM usage and NVM lifetime, unlike prior work. Experimental results on NUMA hardware that emulates hybrid NVM-DRAM memory demonstrates that profile-driven write-rationing garbage collection reduces the number of writes to NVM compared to prior work to extend its lifetime, maximizes the use of NVM for its capacity, and achieves good performance

Ghent University Academic Bibliography

Ramasse-miettes générationnel et incémental gérant les cycles et les gros objets en utilisant des frames délimités

Author: Adam Sébastien
Publication venue
Publication date: 01/01/2008
Field of study

Ces dernières années, des recherches ont été menées sur plusieurs techniques reliées à la collection des déchets. Plusieurs découvertes centrales pour le ramassage de miettes par copie ont été réalisées. Cependant, des améliorations sont encore possibles. Dans ce mémoire, nous introduisons des nouvelles techniques et de nouveaux algorithmes pour améliorer le ramassage de miettes. En particulier, nous introduisons une technique utilisant des cadres délimités pour marquer et retracer les pointeurs racines. Cette technique permet un calcul efficace de l'ensemble des racines. Elle réutilise des concepts de deux techniques existantes, card marking et remembered sets, et utilise une configuration bidirectionelle des objets pour améliorer ces concepts en stabilisant le surplus de mémoire utilisée et en réduisant la charge de travail lors du parcours des pointeurs. Nous présentons aussi un algorithme pour marquer récursivement les objets rejoignables sans utiliser de pile (éliminant le gaspillage de mémoire habituel). Nous adaptons cet algorithme pour implémenter un ramasse-miettes copiant en profondeur et améliorer la localité du heap. Nous améliorons l'algorithme de collection des miettes older-first et sa version générationnelle en ajoutant une phase de marquage garantissant la collection de toutes les miettes, incluant les structures cycliques réparties sur plusieurs fenêtres. Finalement, nous introduisons une technique pour gérer les gros objets. Pour tester nos idées, nous avons conçu et implémenté, dans la machine virtuelle libre Java SableVM, un cadre de développement portable et extensible pour la collection des miettes. Dans ce cadre, nous avons implémenté des algorithmes de collection semi-space, older-first et generational. Nos expérimentations montrent que la technique du cadre délimité procure des performances compétitives pour plusieurs benchmarks. Elles montrent aussi que, pour la plupart des benchmarks, notre algorithme de parcours en profondeur améliore la localité et augmente ainsi la performance. Nos mesures de la performance générale montrent que, utilisant nos techniques, un ramasse-miettes peut délivrer une performance compétitive et surpasser celle des ramasses-miettes existants pour plusieurs benchmarks. ______________________________________________________________________________ MOTS-CLÉS DE L’AUTEUR : Ramasse-Miettes, Machine Virtuelle, Java, SableVM

Archipel - Université du Québec à Montréal

A Measurement-Based Study of Memory Usage and Garbage Collection in a Lisp System

Author: Iyer Ravi K.
Llames Rene Lim
Publication venue: Coordinated Science Laboratory, University of Illinois at Urbana-Champaign
Publication date: 01/02/1990
Field of study

Coordinated Science Laboratory was formerly known as Control Systems LaboratoryNASA / NAG 1-613National Science Foundation / MIP-860489

Illinois Digital Environment for Access to Learning and Scholarship Repository

Exploiting managed language semantics to optimize for hardware heterogeneity

Author: Akram Shoaib
Publication venue
Publication date: 01/01/2019
Field of study

Ghent University Academic Bibliography

Cache performance of chronological garbage collection

Author: Ding Yuping
Publication venue
Publication date: 01/01/1999
Field of study

This thesis presents cache performance analysis of the Chronological Garbage Collection Algorithm used in LVM system. LVM is a new Logic Virtual Machine for Prolog. It adopts one stack policy for all dynamic memory requirements and cooperates with an efficient garbage collection algorithm, the Chronological Garbage Collection, to recycle space, not as a deliberate garbage collection operation, but as a natural activity of the LVM engine to gather useful objects. This algorithm combines the advantages of the traditional copying, mark-compact, generational, and incremental garbage collection schemes. In order to determine the improvement of cache performance under our garbage- collection algorithm, we developed a simulator to do trace-driven cache simulation. Direct-mapped cache and set-associative cache with different cache sizes, write policies, block sizes and set associativities are simulated and measured. A comparison of LVM and SICStus 3.1 for the same benchmarks was performed. From the simulation results, we found important factors influencing the performance of the CGC algorithm. Meanwhile, the results from the cache simulator fully support the experimental results gathered from the LVM system: the cost of CGC Is almost paid by the improved cache performance. Further, we found that the memory reference patterns of our benchmarks share the same properties: most writes are for allocation and most reads are to recently written objects. In addition, the results also showed that the write-miss policy can have a dramatic effect on the cache performance of the benchmarks and a write-validate policy gives the best performance. The comparison shows that when the input size of benchmarks is small, SICStus is about 3-8 times faster than LVM. This is an acceptable range of performance ratio for comparing a binary-code engine against a byte-code emulator. When we increase the input sizes, some benchmarks maintain this performance ratio, whereas others greatly narrow the performance gap and at certain breakthrough points perform better than their counterparts under SICStus

Lakehead University Knowledge Commons

A study of thread-local garbage collection for multi-core systems

Author: Mole Matthew Robert
Publication venue
Publication date: 01/02/2015
Field of study

With multi-processor systems in widespread use, and programmers increasingly writing programs that exploit multiple processors, scalability of application performance is more of an issue. Increasing the number of processors available to an application by a factor does not necessarily boost that application's performance by that factor. More processors can actually harm performance. One cause of poor scalability is memory bandwidth becoming saturated as processors contend with each other for memory bus use. More multi-core systems have a non-uniform memory architecture and placement of threads and the data they use is important in tackling this problem. Garbage collection is a memory load and store intensive activity, and whilst well known techniques such as concurrent and parallel garbage collection aim to increase performance with multi-core systems, they do not address the memory bottleneck problem. One garbage collection technique that can address this problem is thread-local heap garbage collection. Smaller, more frequent, garbage collection cycles are performed so that intensive memory activity is distributed. This thesis evaluates a novel thread-local heap garbage collector for Java, that is designed to improve the effectiveness of this thread-independent garbage collection

Kent Academic Repository

Garbage collection optimization for non uniform memory access architectures

Author: Alnowaiser Khaled Abdulrahman
Publication venue
Publication date: 01/01/2016
Field of study

Cache-coherent non uniform memory access (ccNUMA) architecture is a standard design pattern for contemporary multicore processors, and future generations of architectures are likely to be NUMA. NUMA architectures create new challenges for managed runtime systems. Memory-intensive applications use the system’s distributed memory banks to allocate data, and the automatic memory manager collects garbage left in these memory banks. The garbage collector may need to access remote memory banks, which entails access latency overhead and potential bandwidth saturation for the interconnection between memory banks. This dissertation makes ﬁve signiﬁcant contributions to garbage collection on NUMA systems, with a case study implementation using the Hotspot Java Virtual Machine. It empirically studies data locality for a Stop-The-World garbage collector when tracing connected objects in NUMA heaps. First, it identiﬁes a locality richness which exists naturally in connected objects that contain a root object and its reachable set— ‘rooted sub-graphs’. Second, this dissertation leverages the locality characteristic of rooted sub-graphs to develop a new NUMA-aware garbage collection mechanism. A garbage collector thread processes a local root and its reachable set, which is likely to have a large number of objects in the same NUMA node. Third, a garbage collector thread steals references from sibling threads that run on the same NUMA node to improve data locality. This research evaluates the new NUMA-aware garbage collector using seven benchmarks of an established real-world DaCapo benchmark suite. In addition, evaluation involves a widely used SPECjbb benchmark and Neo4J graph database Java benchmark, as well as an artiﬁcial benchmark. The results of the NUMA-aware garbage collector on a multi-hop NUMA architecture show an average of 15% performance improvement. Furthermore, this performance gain is shown to be as a result of an improved NUMA memory access in a ccNUMA system. Fourth, the existing Hotspot JVM adaptive policy for conﬁguring the number of garbage collection threads is shown to be suboptimal for current NUMA machines. The policy uses outdated assumptions and it generates a constant thread count. In fact, the Hotspot JVM still uses this policy in the production version. This research shows that the optimal number of garbage collection threads is application-speciﬁc and conﬁguring the optimal number of garbage collection threads yields better collection throughput than the default policy. Fifth, this dissertation designs and implements a runtime technique, which involves heuristics from dynamic collection behavior to calculate an optimal number of garbage collector threads for each collection cycle. The results show an average of 21% improvements to the garbage collection performance for DaCapo benchmarks

Glasgow Theses Service

Hybrid eager and lazy evaluation for efficient compilation of Haskell

Author: Maessen Jan-Willem
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2002
Field of study

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002.Includes bibliographical references (p. 208-220).This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.The advantage of a non-strict, purely functional language such as Haskell lies in its clean equational semantics. However, lazy implementations of Haskell fall short: they cannot express tail recursion gracefully without annotation. We describe resource-bounded hybrid evaluation, a mixture of strict and lazy evaluation, and its realization in Eager Haskell. From the programmer's perspective, Eager Haskell is simply another implementation of Haskell with the same clean equational semantics. Iteration can be expressed using tail recursion, without the need to resort to program annotations. Under hybrid evaluation, computations are ordinarily executed in program order just as in a strict functional language. When particular stack, heap, or time bounds are exceeded, suspensions are generated for all outstanding computations. These suspensions are re-started in a demand-driven fashion from the root. The Eager Haskell compiler translates Ac, the compiler's intermediate representation, to efficient C code. We use an equational semantics for Ac to develop simple correctness proofs for program transformations, and connect actions in the run-time system to steps in the hybrid evaluation strategy.(cont.) The focus of compilation is efficiency in the common case of straight-line execution; the handling of non-strictness and suspension are left to the run-time system. Several additional contributions have resulted from the implementation of hybrid evaluation. Eager Haskell is the first eager compiler to use a call stack. Our generational garbage collector uses this stack as an additional predictor of object lifetime. Objects above a stack watermark are assumed to be likely to die; we avoid promoting them. Those below are likely to remain untouched and therefore are good candidates for promotion. To avoid eagerly evaluating error checks, they are compiled into special bottom thunks, which are treated specially by the run-time system. The compiler identifies error handling code using a mixture of strictness and type information. This information is also used to avoid inlining error handlers, and to enable aggressive program transformation in the presence of error handling.by Jan-Willem Maessen.Ph.D

DSpace@MIT