4 research outputs found
Oil and Water? High Performance Garbage Collection in Java with MMTk
Increasingly popular languages such as Java and C # require efficient garbage collection. This paper presents the design, implementation, and evaluation of MMTk, a Memory Management Toolkit for and in Java. MMTk is an efficient, composable, extensible, and portable framework for building garbage collectors. MMTk uses design patterns and compiler cooperation to combine modularity and efficiency. The resulting system is more robust, easier to maintain, and has fewer defects than monolithic collectors. Experimental comparisons with monolithic Java and C implementations reveal MMTk has significant performance advantages as well. Performance critical system software typically uses monolithic C at the expense of flexibility. Our results refute common wisdom that only this approach attains efficiency, and suggest that performance critical software can embrace modular design and high-level languages.
Integrating Generations with Advanced Reference Counting Garbage Collectors
We study an incorporation of generations into a modern reference counting collector. We start with the two on-the-y collectors suggested by Levanoni and Petrank: a reference counting collector and a tracing (mark and sweep) collector. We then propose three designs for combining them so that the reference counting collector collects the young generation or the old generation or both. Our designs maintain the good properties of the Levanoni-Petrank collector. In particular, it is adequate for multithreaded environment and a multiprocessor platform, and it has an ecient write barrier with no synchronization operations. To the best of our knowledge, the use of generations with reference counting has not been tried before
High Performance Reference Counting and Conservative Garbage Collection
Garbage collection is an integral part of modern programming languages. It automatically
reclaims memory occupied by objects that are no longer in use. Garbage
collection began in 1960 with two algorithmic branches — tracing and reference counting.
Tracing identifies live objects by performing a transitive closure over the object
graph starting with the stacks, registers, and global variables as roots. Objects not
reached by the trace are implicitly dead, so the collector reclaims them. In contrast,
reference counting explicitly identifies dead objects by counting the number of incoming
references to each object. When an object’s count goes to zero, it is unreachable
and the collector may reclaim it.
Garbage collectors require knowledge of every reference to each object, whether
the reference is from another object or from within the runtime. The runtime provides
this knowledge either by continuously keeping track of every change to each reference
or by periodically enumerating all references. The collector implementation faces two
broad choices — exact and conservative. In exact garbage collection, the compiler and
runtime system precisely identify all references held within the runtime including
those held within stacks, registers, and objects. To exactly identify references, the
runtime must introspect these references during execution, which requires support
from the compiler and significant engineering effort. On the contrary, conservative
garbage collection does not require introspection of these references, but instead
treats each value ambiguously as a potential reference.
Highly engineered, high performance systems conventionally use tracing and
exact garbage collection. However, other well-established but less performant systems
use either reference counting or conservative garbage collection. Reference counting has
some advantages over tracing such as: a) it is easier implement, b) it reclaims memory
immediately, and c) it has a local scope of operation. Conservative garbage collection
is easier to implement compared to exact garbage collection because it does not
require compiler cooperation. Because of these advantages, both reference counting
and conservative garbage collection are widely used in practice. Because both suffer
significant performance overheads, they are generally not used in performance critical
settings. This dissertation carefully examines reference counting and conservative
garbage collection to understand their behavior and improve their performance.
My thesis is that reference counting and conservative garbage collection can perform
as well or better than the best performing garbage collectors.
The key contributions of my thesis are: 1) An in-depth analysis of the key design
choices for reference counting. 2) Novel optimizations guided by that analysis that
significantly improve reference counting performance and make it competitive with
a well tuned tracing garbage collector. 3) A new collector, RCImmix, that replaces
the traditional free-list heap organization of reference counting with a line and block heap structure, which improves locality, and adds copying to mitigate fragmentation.
The result is a collector that outperforms a highly tuned production generational
collector. 4) A conservative garbage collector based on RCImmix that matches the
performance of a highly tuned production generational collector.
Reference counting and conservative garbage collection have lived under the
shadow of tracing and exact garbage collection for a long time. My thesis focuses
on bringing these somewhat neglected branches of garbage collection back to life
in a high performance setting and leads to two very surprising results: 1) a new
garbage collector based on reference counting that outperforms a highly tuned production
generational tracing collector, and 2) a variant that delivers high performance
conservative garbage collection