14,810 research outputs found
Energy-Efficient Algorithms
We initiate the systematic study of the energy complexity of algorithms (in
addition to time and space complexity) based on Landauer's Principle in
physics, which gives a lower bound on the amount of energy a system must
dissipate if it destroys information. We propose energy-aware variations of
three standard models of computation: circuit RAM, word RAM, and
transdichotomous RAM. On top of these models, we build familiar high-level
primitives such as control logic, memory allocation, and garbage collection
with zero energy complexity and only constant-factor overheads in space and
time complexity, enabling simple expression of energy-efficient algorithms. We
analyze several classic algorithms in our models and develop low-energy
variations: comparison sort, insertion sort, counting sort, breadth-first
search, Bellman-Ford, Floyd-Warshall, matrix all-pairs shortest paths, AVL
trees, binary heaps, and dynamic arrays. We explore the time/space/energy
trade-off and develop several general techniques for analyzing algorithms and
reducing their energy complexity. These results lay a theoretical foundation
for a new field of semi-reversible computing and provide a new framework for
the investigation of algorithms.Comment: 40 pages, 8 pdf figures, full version of work published in ITCS 201
Pruning, Pushdown Exception-Flow Analysis
Statically reasoning in the presence of exceptions and about the effects of
exceptions is challenging: exception-flows are mutually determined by
traditional control-flow and points-to analyses. We tackle the challenge of
analyzing exception-flows from two angles. First, from the angle of pruning
control-flows (both normal and exceptional), we derive a pushdown framework for
an object-oriented language with full-featured exceptions. Unlike traditional
analyses, it allows precise matching of throwers to catchers. Second, from the
angle of pruning points-to information, we generalize abstract garbage
collection to object-oriented programs and enhance it with liveness analysis.
We then seamlessly weave the techniques into enhanced reachability computation,
yielding highly precise exception-flow analysis, without becoming intractable,
even for large applications. We evaluate our pruned, pushdown exception-flow
analysis, comparing it with an established analysis on large scale standard
Java benchmarks. The results show that our analysis significantly improves
analysis precision over traditional analysis within a reasonable analysis time.Comment: 14th IEEE International Working Conference on Source Code Analysis
and Manipulatio
Beltway: Getting Around Garbage Collection Gridlock
We present the design and implementation of a new garbage collection framework that significantly generalizes existing copying collectors. The Beltway framework exploits and separates object age and incrementality. It groups objects in one or more increments on queues called belts, collects belts independently, and collects increments on a belt in first-in-first-out order. We show that Beltway configurations, selected by command line options, act and perform the same as semi-space, generational, and older-first collectors, and encompass all previous copying collectors of which we are aware. The increasing reliance on garbage collected languages such as Java requires that the collector perform well. We show that the generality of Beltway enables us to design and implement new collectors that are robust to variations in heap size and improve total execution time over the best generational copying collectors of which we are aware by up to 40%, and on average by 5 to 10%, for small to moderate heap sizes. New garbage collection algorithms are rare, and yet we define not just one, but a new family of collectors that subsumes previous work. This generality enables us to explore a larger design space and build better collectors
Prioritized Garbage Collection: Explicit GC Support for Software Caches
Programmers routinely trade space for time to increase performance, often in
the form of caching or memoization. In managed languages like Java or
JavaScript, however, this space-time tradeoff is complex. Using more space
translates into higher garbage collection costs, especially at the limit of
available memory. Existing runtime systems provide limited support for
space-sensitive algorithms, forcing programmers into difficult and often
brittle choices about provisioning.
This paper presents prioritized garbage collection, a cooperative programming
language and runtime solution to this problem. Prioritized GC provides an
interface similar to soft references, called priority references, which
identify objects that the collector can reclaim eagerly if necessary. The key
difference is an API for defining the policy that governs when priority
references are cleared and in what order. Application code specifies a priority
value for each reference and a target memory bound. The collector reclaims
references, lowest priority first, until the total memory footprint of the
cache fits within the bound. We use this API to implement a space-aware
least-recently-used (LRU) cache, called a Sache, that is a drop-in replacement
for existing caches, such as Google's Guava library. The garbage collector
automatically grows and shrinks the Sache in response to available memory and
workload with minimal provisioning information from the programmer. Using a
Sache, it is almost impossible for an application to experience a memory leak,
memory pressure, or an out-of-memory crash caused by software caching.Comment: to appear in OOPSLA 201
Towards co-designed optimizations in parallel frameworks: A MapReduce case study
The explosion of Big Data was followed by the proliferation of numerous
complex parallel software stacks whose aim is to tackle the challenges of data
deluge. A drawback of a such multi-layered hierarchical deployment is the
inability to maintain and delegate vital semantic information between layers in
the stack. Software abstractions increase the semantic distance between an
application and its generated code. However, parallel software frameworks
contain inherent semantic information that general purpose compilers are not
designed to exploit.
This paper presents a case study demonstrating how the specific semantic
information of the MapReduce paradigm can be exploited on multicore
architectures. MR4J has been implemented in Java and evaluated against
hand-optimized C and C++ equivalents. The initial observed results led to the
design of a semantically aware optimizer that runs automatically without
requiring modification to application code.
The optimizer is able to speedup the execution time of MR4J by up to 2.0x.
The introduced optimization not only improves the performance of the generated
code, during the map phase, but also reduces the pressure on the garbage
collector. This demonstrates how semantic information can be harnessed without
sacrificing sound software engineering practices when using parallel software
frameworks.Comment: 8 page
Relating goal scheduling, precedence, and memory management in and-parallel execution of logic programs
The interactions among three important issues involved in the implementation of logic programs in parallel (goal scheduling, precedence, and memory management) are discussed. A simplified, parallel memory management model and an efficient, load-balancing goal scheduling strategy are presented. It is shown how, for systems which support "don't know" non-determinism, special care has to be taken during goal scheduling if the space recovery characteristics
of sequential systems are to be preserved. A solution based on selecting only "newer" goals for execution is described, and an algorithm is proposed for efficiently maintaining and determining precedence relationships and variable ages across parallel goals. It is argued that the proposed schemes and algorithms make it possible to extend the storage performance of sequential systems to parallel execution without the considerable overhead previously associated with it. The results are applicable to a wide class of parallel and coroutining systems, and they represent an efficient alternative to "all heap" or "spaghetti stack" allocation models
- …