A fast analysis for thread-local garbage collection with dynamic class loading
Long-running, heavily multi-threaded Java server applications make stringent demands of garbage collector (GC) performance. Synchronisation of all application threads before garbage collection is a significant bottleneck for JVMs that use native threads. We present a new static analysis and a novel GC framework designed to address this issue by allowing independent collection of thread-local heaps. In contrast to previous work, our solution safely classifies objects even in the presence of dynamic class loading and requires neither write barriers that may do unbounded work, nor synchronisation, nor locks during thread-local collections. Our analysis is sufficiently fast to permit its integration into a high-performance, production-quality virtual machine.
Garbage Collection for General Graphs
Garbage collection is moving from being a utility to a requirement of every modern programming language. With multi-core and distributed systems, most recently written programs are heavily multi-threaded and distributed; such programs are called concurrent programs. Concurrent programming is characterized by multiple independent processes or threads, communication between them, and uncertainty in the order of concurrent operations. This uncertainty makes manual memory management of concurrent programs cumbersome and difficult. A popular alternative to garbage collection in concurrent programs is to use smart pointers. Smart pointers can collect all garbage only if the developer identifies the cycles being created in the reference graph, so they do not guarantee protection from memory leaks unless cycles can be detected as processes or threads create them. General garbage collectors, on the other hand, can avoid memory leaks, dangling pointers, and double deletion problems in any programming environment without help from the programmer. Concurrent programming is used in both shared memory and distributed memory systems. State-of-the-art shared memory systems use a single concurrent garbage collector thread that processes the reference graph. Distributed memory systems have very few complete garbage collection algorithms, and those that exist use global barriers, are centralized, and do not scale well. This thesis focuses on designing garbage collection algorithms for shared memory and distributed memory systems that satisfy the following properties: concurrent, parallel, scalable, localized (decentralized), low pause time, high promptness, no global synchronization, safe, complete, and linear running time.
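The point about smart pointers and cycles can be illustrated with a small sketch. It is not taken from the thesis: it relies on CPython's reference counting as a stand-in for smart pointers, and the names `Node` and `pair_survives` are hypothetical. With the cycle collector switched off, a strong two-object cycle outlives its last external reference, whereas a cycle whose back edge the developer has marked as weak is reclaimed at once.

```python
import gc
import weakref


class Node:
    """A heap object that may point at a peer (hypothetical example type)."""

    def __init__(self, name):
        self.name = name
        self.peer = None


def pair_survives(use_weak_back_edge):
    """Return True if node 'a' is still alive after all external references are dropped.

    Assumption: this runs on CPython, where reference counting frees an object
    as soon as its count reaches zero and a separate collector handles cycles.
    """
    gc.disable()  # leave only reference counting active, like plain smart pointers
    try:
        a, b = Node("a"), Node("b")
        a.peer = b
        # The back edge closes the cycle; a weak reference does not add to a's count.
        b.peer = weakref.ref(a) if use_weak_back_edge else a
        probe = weakref.ref(a)  # lets us observe a's fate without keeping it alive
        del a, b
        return probe() is not None
    finally:
        gc.enable()
        gc.collect()  # clean up the leaked cycle so the demo leaves no garbage behind


leaked_with_strong_cycle = pair_survives(use_weak_back_edge=False)
leaked_with_weak_back_edge = pair_survives(use_weak_back_edge=True)
```

In this sketch the weak back edge plays the role of the developer-identified cycle: without that annotation, counting alone never sees the pair's counts reach zero.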
Using Class-Level Static Properties to Predict Object Lifetimes
Today, most modern programming languages, such as C# or Java, use an automatic memory management system, also known as a Garbage Collector (GC). Over the course of program execution, new objects are allocated in memory, and some older objects become unreachable (die). For the program to keep running, the memory of dead objects must be freed; this task is performed periodically by the GC.
Research has shown that most objects die young and as a result, generational collectors have become very popular over the years. Yet, these algorithms are not good at handling long-lived objects. Typically, long-lived objects would first be allocated in the nursery space and be promoted (copied) to an older generation after surviving a garbage collection, hence wasting precious time.
By allocating long-lived and immortal objects directly into infrequently or never collected regions, pretenuring can reduce garbage collection costs significantly. The current state-of-the-art methodology for predicting object lifetimes involves off-line profiling combined with a simple, heuristic classification. Profiling is slow (it can take days), requires gathering gigabytes of data whose analysis can take hours, and needs to be repeated for every previously unseen program.
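As a rough illustration of why pretenuring saves work, the sketch below counts how many objects a toy generational collector would copy out of the nursery, with and without a per-class lifetime prediction. It is a simplified model under stated assumptions, not the thesis's predictor or virtual machine: the function `promotion_copies`, the class names, and the allocation trace are all hypothetical.

```python
def promotion_copies(allocations, predicted_long_lived=frozenset()):
    """Count objects copied from the nursery to the older generation.

    allocations: list of (class_name, survives_nursery_gc) pairs.
    predicted_long_lived: class names allocated directly into the older space.
    """
    copies = 0
    for class_name, survives_nursery_gc in allocations:
        if class_name in predicted_long_lived:
            continue  # pretenured: never placed in the nursery, so never copied
        if survives_nursery_gc:
            copies += 1  # survivor is promoted (copied) to the older generation
    return copies


# Hypothetical allocation trace: caches and configuration objects live long,
# temporaries die before the first nursery collection.
trace = [
    ("Cache", True),
    ("Temp", False),
    ("Cache", True),
    ("Temp", False),
    ("Config", True),
]

copies_without_prediction = promotion_copies(trace)
copies_with_prediction = promotion_copies(trace, {"Cache", "Config"})
```

The model ignores the cost of a wrong prediction, such as a short-lived object pretenured into a rarely collected region, which a real system would have to weigh.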
This thesis explores the space of lifetime predictions and shows how object lifetimes can be predicted accurately and quickly using simple program characteristics gathered within minutes. Following an innovative methodology introduced in this thesis, object lifetime predictions are fed into a specifically modified Java virtual machine. Performance tests show gains in GC times of as much as 77% for the “SPEC jvm98” benchmarks, against a generational copying collector.
HW-SW co-design techniques for modern programming languages
Modern programming languages raise the level of abstraction, hide the details of computer systems from programmers, and provide many convenient features. Such strong abstraction from the details of computer systems with runtime support of many convenient features increases the productivity of programmers.
Such benefits, however, come with performance overheads. First, many modern programming languages use a dynamic type system, which incurs the overheads of profiling program execution and generating specialized code in the middle of execution. Second, such specialized code constantly adds the overhead of dynamic type checks. Third, most modern programming languages use automatic memory management, which incurs memory overheads due to metadata and delayed reclamation as well as execution time overheads due to garbage collection operations.
This thesis makes three contributions to address the overheads of modern programming languages. First, it describes the enhancements to the compilers of dynamic scripting languages necessary to enable sharing of compilation results across executions. These compilers have been developed with little consideration for reusing optimization efforts across executions, since reuse is considered difficult due to the dynamic nature of the languages. As a first step toward enabling the reuse of compilation results of dynamic scripting languages, it focuses on inline caching (IC), which is one of the fundamental optimization techniques for dynamic type systems. Second, it describes a HW-SW co-design technique to further improve IC operations. While the first proposal focuses on expensive IC miss handling during JavaScript initialization, the second proposal accelerates IC hit operations to improve the overall performance. Lastly, it describes how to exploit common sharing patterns of programs to reduce the overheads of reference counting for garbage collection. It minimizes atomic operations in reference counting by biasing each object to a specific thread.
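The idea of biasing an object's reference count to one thread can be sketched as follows. This is a much simplified illustration under stated assumptions, not the thesis's design: the class `BiasedCount` and its fields are hypothetical, a lock stands in for an atomic read-modify-write instruction, and the protocol for merging the two counters and freeing the object is omitted.

```python
import threading


class BiasedCount:
    """Toy reference count split into an owner-only part and a shared part."""

    def __init__(self):
        self.owner = threading.get_ident()  # the creating thread becomes the owner
        self.biased = 1       # updated only by the owner, with plain operations
        self.shared = 0       # updated by all other threads
        self.atomic_ops = 0   # how many "atomic" updates were needed
        self._lock = threading.Lock()  # stand-in for an atomic instruction

    def _update(self, delta):
        if threading.get_ident() == self.owner:
            self.biased += delta  # fast path: no synchronisation needed
        else:
            with self._lock:      # slow path: models an atomic update
                self.shared += delta
                self.atomic_ops += 1

    def retain(self):
        self._update(+1)

    def release(self):
        self._update(-1)

    def total(self):
        return self.biased + self.shared


obj = BiasedCount()
for _ in range(1000):  # the owner thread touches the object frequently
    obj.retain()
    obj.release()
atomic_ops_after_owner_traffic = obj.atomic_ops

other = threading.Thread(target=obj.retain)  # a second thread shares the object once
other.start()
other.join()
```

In this sketch the object may be reclaimed only when the sum of both counters reaches zero; the common case where a single thread does almost all updates pays for no atomic operations.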
High Performance Reference Counting and Conservative Garbage Collection
Garbage collection is an integral part of modern programming languages. It automatically reclaims memory occupied by objects that are no longer in use. Garbage collection began in 1960 with two algorithmic branches: tracing and reference counting. Tracing identifies live objects by performing a transitive closure over the object graph starting with the stacks, registers, and global variables as roots. Objects not reached by the trace are implicitly dead, so the collector reclaims them. In contrast, reference counting explicitly identifies dead objects by counting the number of incoming references to each object. When an object’s count goes to zero, it is unreachable and the collector may reclaim it.
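The two branches can be compared on a toy object graph. The sketch below is illustrative only and assumes a heap modelled as a dictionary from object name to the names it references; the functions `trace_live` and `refcount_dead` are hypothetical. Tracing computes what is live and treats the rest as dead, while reference counting finds dead objects directly from zero counts.

```python
def trace_live(heap, roots):
    """Tracing: transitive closure over the object graph, starting from the roots."""
    live, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj not in live:
            live.add(obj)
            stack.extend(heap[obj])  # follow every outgoing reference
    return live


def refcount_dead(heap, roots):
    """Reference counting: an object is dead once its incoming count reaches zero."""
    counts = {obj: 0 for obj in heap}
    for root in roots:
        counts[root] += 1
    for fields in heap.values():
        for target in fields:
            counts[target] += 1

    dead = set()
    work = [obj for obj, count in counts.items() if count == 0]
    while work:
        obj = work.pop()
        dead.add(obj)
        for target in heap[obj]:  # freeing obj removes its outgoing references
            counts[target] -= 1
            if counts[target] == 0:
                work.append(target)
    return dead


# Hypothetical heap: A is a root and references B; C references D; nothing references C.
heap = {"A": ["B"], "B": [], "C": ["D"], "D": []}
roots = ["A"]

live_by_tracing = trace_live(heap, roots)
dead_by_tracing = set(heap) - live_by_tracing
dead_by_counting = refcount_dead(heap, roots)
```

On this acyclic graph both approaches agree. They differ once unreachable objects reference each other in a cycle, since none of those counts ever drops to zero.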
Garbage collectors require knowledge of every reference to each object, whether the reference is from another object or from within the runtime. The runtime provides this knowledge either by continuously keeping track of every change to each reference or by periodically enumerating all references. The collector implementation faces two broad choices: exact and conservative. In exact garbage collection, the compiler and runtime system precisely identify all references held within the runtime, including those held within stacks, registers, and objects. To exactly identify references, the runtime must introspect these references during execution, which requires support from the compiler and significant engineering effort. In contrast, conservative garbage collection does not require introspection of these references, but instead treats each value ambiguously as a potential reference.
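The difference between exact and conservative root identification can be sketched on a toy stack of machine words. This is a minimal illustration under stated assumptions: the addresses, the stack contents, and the stack map are made up, and a real conservative collector applies further filtering to ambiguous values.

```python
# Hypothetical start addresses of the objects currently in the heap.
HEAP_ADDRESSES = {0x1000, 0x1040, 0x1080}


def exact_roots(stack, stack_map):
    """Exact: a compiler-provided map says which stack slots hold references."""
    return {stack[slot] for slot in stack_map}


def conservative_roots(stack, heap_addresses):
    """Conservative: any word that looks like an object address is kept as a root."""
    return {word for word in stack if word in heap_addresses}


# Slot 0 holds a real reference, slot 1 an ordinary integer, and slot 2 an
# integer whose value happens to coincide with an object address.
stack = [0x1000, 42, 0x1080]
stack_map = [0]

roots_exact = exact_roots(stack, stack_map)
roots_conservative = conservative_roots(stack, HEAP_ADDRESSES)
falsely_retained = roots_conservative - roots_exact
```

The conservative scan needs no stack map, which is why it needs no compiler cooperation, at the price of occasionally retaining an object that only an integer appears to reference.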
Highly engineered, high performance systems conventionally use tracing and exact garbage collection. However, other well-established but less performant systems use either reference counting or conservative garbage collection. Reference counting has some advantages over tracing: a) it is easier to implement, b) it reclaims memory immediately, and c) it has a local scope of operation. Conservative garbage collection is easier to implement than exact garbage collection because it does not require compiler cooperation. Because of these advantages, both reference counting and conservative garbage collection are widely used in practice. Because both suffer significant performance overheads, they are generally not used in performance critical settings. This dissertation carefully examines reference counting and conservative garbage collection to understand their behavior and improve their performance. My thesis is that reference counting and conservative garbage collection can perform as well as or better than the best performing garbage collectors.
The key contributions of my thesis are: 1) An in-depth analysis of the key design choices for reference counting. 2) Novel optimizations guided by that analysis that significantly improve reference counting performance and make it competitive with a well-tuned tracing garbage collector. 3) A new collector, RCImmix, that replaces the traditional free-list heap organization of reference counting with a line and block heap structure, which improves locality, and adds copying to mitigate fragmentation. The result is a collector that outperforms a highly tuned production generational collector. 4) A conservative garbage collector based on RCImmix that matches the performance of a highly tuned production generational collector.
Reference counting and conservative garbage collection have lived under the shadow of tracing and exact garbage collection for a long time. My thesis focuses on bringing these somewhat neglected branches of garbage collection back to life in a high performance setting and leads to two very surprising results: 1) a new garbage collector based on reference counting that outperforms a highly tuned production generational tracing collector, and 2) a variant that delivers high performance conservative garbage collection.