9 research outputs found
Down for the Count? Getting Reference Counting Back in the Ring
Reference counting and tracing are the two fundamental approaches that have underpinned garbage collection since 1960. However, despite some compelling advantages, reference counting is almost completely ignored in implementations of high performance systems today. In this paper we take a detailed look at reference counting to understand its behavior and to improve its performance. We identify key design choices for reference counting and analyze how the behavior of a wide range of benchmarks might affect design decisions. As far as we are aware, this is the first such quantitative study of reference counting. We use insights gleaned from this analysis to introduce a number of optimizations that significantly improve the performance of reference counting. We find that an existing modern implementation of reference counting has an average 30% overhead compared to tracing, and that in combination, our optimizations are able to completely eliminate that overhead. This brings the performance of reference counting on par with that of a well tuned mark-sweep collector. We keep our in-depth analysis of reference counting as general as possible so that it may be useful to other garbage collector implementers. Our finding that reference counting can be made directly competitive with well tuned mark-sweep should shake the community's prejudices about reference counting and perhaps open new opportunities for exploiting reference counting's strengths, such as localization and immediacy of reclamation
Garbage Collection for General Graphs
Garbage collection is moving from being a utility to a requirement of every modern programming language. With multi-core and distributed systems, most programs written recently are heavily multi-threaded and distributed. Distributed and multi-threaded programs are called concurrent programs. Manual memory management is cumbersome and difficult in concurrent programs. Concurrent programming is characterized by multiple independent processes/threads, communication between processes/threads, and uncertainty in the order of concurrent operations. The uncertainty in the order of operations makes manual memory management of concurrent programs difficult. A popular alternative to garbage collection in concurrent programs is to use smart pointers. Smart pointers can collect all garbage only if developer identifies cycles being created in the reference graph. Smart pointer usage does not guarantee protection from memory leaks unless cycle can be detected as process/thread create them. General garbage collectors, on the other hand, can avoid memory leaks, dangling pointers, and double deletion problems in any programming environment without help from the programmer. Concurrent programming is used in shared memory and distributed memory systems. State of the art shared memory systems use a single concurrent garbage collector thread that processes the reference graph. Distributed memory systems have very few complete garbage collection algorithms and those that exist use global barriers, are centralized and do not scale well. This thesis focuses on designing garbage collection algorithms for shared memory and distributed memory systems that satisfy the following properties: concurrent, parallel, scalable, localized (decentralized), low pause time, high promptness, no global synchronization, safe, complete, and operates in linear time
Micro Virtual Machines: A Solid Foundation for Managed Language Implementation
Today new programming languages proliferate, but many of them
suffer from
poor performance and inscrutable semantics. We assert that the
root of
many of the performance and semantic problems of today's
languages is
that language implementation is extremely difficult. This
thesis
addresses the fundamental challenges of efficiently developing
high-level
managed languages.
Modern high-level languages provide abstractions over execution,
memory
management and concurrency. It requires enormous intellectual
capability
and engineering effort to properly manage these concerns.
Lacking such
resources, developers usually choose naive implementation
approaches
in the early stages of language design, a strategy which too
often has
long-term consequences, hindering the future development of the
language. Existing language development platforms have failed
to
provide the right level of abstraction, and forced implementers
to
reinvent low-level mechanisms in order to obtain performance.
My thesis is that the introduction of micro virtual machines will
allow
the development of higher-quality, high-performance managed
languages.
The first contribution of this thesis is the design of Mu, with
the
specification of Mu as the main outcome. Mu is
the first micro virtual machine, a robust, performant, and
light-weight
abstraction over just three concerns: execution, concurrency and
garbage
collection. Such a foundation attacks three of the most
fundamental and
challenging issues that face existing language designs and
implementations, leaving the language implementers free to focus
on the
higher levels of their language design.
The second contribution is an in-depth analysis of on-stack
replacement
and its efficient implementation. This low-level mechanism
underpins
run-time feedback-directed optimisation, which is key to the
efficient
implementation of dynamic languages.
The third contribution is demonstrating the viability of Mu
through
RPython, a real-world non-trivial language implementation. We
also did
some preliminary research of GHC as a Mu client.
We have created the Mu specification and its reference
implementation,
both of which are open-source. We show that that Mu's on-stack
replacement API can gracefully support dynamic languages such as
JavaScript, and it is implementable on concrete hardware. Our
RPython
client has been able to translate and execute non-trivial
RPython
programs, and can run the RPySOM interpreter and the core of the
PyPy
interpreter.
With micro virtual machines providing a low-level substrate,
language
developers now have the option to build their next language on a
micro
virtual machine. We believe that the quality of programming
languages
will be improved as a result
High Performance Reference Counting and Conservative Garbage Collection
Garbage collection is an integral part of modern programming languages. It automatically
reclaims memory occupied by objects that are no longer in use. Garbage
collection began in 1960 with two algorithmic branches — tracing and reference counting.
Tracing identifies live objects by performing a transitive closure over the object
graph starting with the stacks, registers, and global variables as roots. Objects not
reached by the trace are implicitly dead, so the collector reclaims them. In contrast,
reference counting explicitly identifies dead objects by counting the number of incoming
references to each object. When an object’s count goes to zero, it is unreachable
and the collector may reclaim it.
Garbage collectors require knowledge of every reference to each object, whether
the reference is from another object or from within the runtime. The runtime provides
this knowledge either by continuously keeping track of every change to each reference
or by periodically enumerating all references. The collector implementation faces two
broad choices — exact and conservative. In exact garbage collection, the compiler and
runtime system precisely identify all references held within the runtime including
those held within stacks, registers, and objects. To exactly identify references, the
runtime must introspect these references during execution, which requires support
from the compiler and significant engineering effort. On the contrary, conservative
garbage collection does not require introspection of these references, but instead
treats each value ambiguously as a potential reference.
Highly engineered, high performance systems conventionally use tracing and
exact garbage collection. However, other well-established but less performant systems
use either reference counting or conservative garbage collection. Reference counting has
some advantages over tracing such as: a) it is easier implement, b) it reclaims memory
immediately, and c) it has a local scope of operation. Conservative garbage collection
is easier to implement compared to exact garbage collection because it does not
require compiler cooperation. Because of these advantages, both reference counting
and conservative garbage collection are widely used in practice. Because both suffer
significant performance overheads, they are generally not used in performance critical
settings. This dissertation carefully examines reference counting and conservative
garbage collection to understand their behavior and improve their performance.
My thesis is that reference counting and conservative garbage collection can perform
as well or better than the best performing garbage collectors.
The key contributions of my thesis are: 1) An in-depth analysis of the key design
choices for reference counting. 2) Novel optimizations guided by that analysis that
significantly improve reference counting performance and make it competitive with
a well tuned tracing garbage collector. 3) A new collector, RCImmix, that replaces
the traditional free-list heap organization of reference counting with a line and block heap structure, which improves locality, and adds copying to mitigate fragmentation.
The result is a collector that outperforms a highly tuned production generational
collector. 4) A conservative garbage collector based on RCImmix that matches the
performance of a highly tuned production generational collector.
Reference counting and conservative garbage collection have lived under the
shadow of tracing and exact garbage collection for a long time. My thesis focuses
on bringing these somewhat neglected branches of garbage collection back to life
in a high performance setting and leads to two very surprising results: 1) a new
garbage collector based on reference counting that outperforms a highly tuned production
generational tracing collector, and 2) a variant that delivers high performance
conservative garbage collection
Cautiously Optimistic Program Analyses for Secure and Reliable Software
Modern computer systems still have various security and reliability vulnerabilities. Well-known dynamic analyses solutions can mitigate them using runtime monitors that serve as lifeguards. But the additional work in enforcing these security and safety properties incurs exorbitant performance costs, and such tools are rarely used in practice. Our work addresses this problem by constructing a novel technique- Cautiously Optimistic Program Analysis (COPA).
COPA is optimistic- it infers likely program invariants from dynamic observations, and assumes them in its static reasoning to precisely identify and elide wasteful runtime monitors. The resulting system is fast, but also ensures soundness by recovering to a conservatively optimized analysis when a likely invariant rarely fails at runtime. COPA is also cautious- by carefully restricting optimizations to only safe elisions, the recovery is greatly simplified. It avoids unbounded rollbacks upon recovery, thereby enabling analysis for live production software.
We demonstrate the effectiveness of Cautiously Optimistic Program Analyses in three areas:
Information-Flow Tracking (IFT) can help prevent security breaches and information leaks. But they are rarely used in practice due to their high performance overhead (>500% for web/email servers). COPA dramatically reduces this cost by eliding wasteful IFT monitors to make it practical (9% overhead, 4x speedup).
Automatic Garbage Collection (GC) in managed languages (e.g. Java) simplifies programming tasks while ensuring memory safety. However, there is no correct GC for weakly-typed languages (e.g. C/C++), and manual memory management is prone to errors that have been exploited in high profile attacks. We develop the first sound GC for C/C++, and use COPA to optimize its performance (16% overhead).
Sequential Consistency (SC) provides intuitive semantics to concurrent programs that simplifies reasoning for their correctness. However, ensuring SC behavior on commodity hardware remains expensive. We use COPA to ensure SC for Java at the language-level efficiently, and significantly reduce its cost (from 24% down to 5% on x86).
COPA provides a way to realize strong software security, reliability and semantic guarantees at practical costs.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/170027/1/subarno_1.pd