2,427 research outputs found
Prioritized Garbage Collection: Explicit GC Support for Software Caches
Programmers routinely trade space for time to increase performance, often in
the form of caching or memoization. In managed languages like Java or
JavaScript, however, this space-time tradeoff is complex. Using more space
translates into higher garbage collection costs, especially at the limit of
available memory. Existing runtime systems provide limited support for
space-sensitive algorithms, forcing programmers into difficult and often
brittle choices about provisioning.
This paper presents prioritized garbage collection, a cooperative programming
language and runtime solution to this problem. Prioritized GC provides an
interface similar to soft references, called priority references, which
identify objects that the collector can reclaim eagerly if necessary. The key
difference is an API for defining the policy that governs when priority
references are cleared and in what order. Application code specifies a priority
value for each reference and a target memory bound. The collector reclaims
references, lowest priority first, until the total memory footprint of the
cache fits within the bound. We use this API to implement a space-aware
least-recently-used (LRU) cache, called a Sache, that is a drop-in replacement
for existing caches, such as Google's Guava library. The garbage collector
automatically grows and shrinks the Sache in response to available memory and
workload with minimal provisioning information from the programmer. Using a
Sache, it is almost impossible for an application to experience a memory leak,
memory pressure, or an out-of-memory crash caused by software caching.Comment: to appear in OOPSLA 201
Timed Consistent Network Updates
Network updates such as policy and routing changes occur frequently in
Software Defined Networks (SDN). Updates should be performed consistently,
preventing temporary disruptions, and should require as little overhead as
possible. Scalability is increasingly becoming an essential requirement in SDN.
In this paper we propose to use time-triggered network updates to achieve
consistent updates. Our proposed solution requires lower overhead than existing
update approaches, without compromising the consistency during the update. We
demonstrate that accurate time enables far more scalable consistent updates in
SDN than previously available. In addition, it provides the SDN programmer with
fine-grained control over the tradeoff between consistency and scalability.Comment: This technical report is an extended version of the paper "Timed
Consistent Network Updates", which was accepted to the ACM SIGCOMM Symposium
on SDN Research (SOSR) '15, Santa Clara, CA, US, June 201
TLB pre-loading for Java applications
The increasing memory requirement for today\u27s applications is causing more stress for the memory system. This side effect puts pressure into available caches, and specifically the TLB cache. TLB misses are responsible for a considerable ratio of the total memory latency, since an average of 10% of execution time is wasted on miss penalties. Java applications are not in a better position. Their attractive features increase the memory footprint. Generally, Java applications TLB miss rate tends to be multiples of miss rate for non-java applications. The high miss rate will cause the application to loose valuable execution time. Our experiments show that on average, miss penalty can constitute about 24% of execution time. Several hardware modifications were suggested to reduce TLB misses for general applications. However, to the best of our knowledge, there have been no similar efforts for Java applications. Here we propose a software-based prediction model that relies on information available to the virtual machine. The model uses the write barrier operation to predict TLB misses with an average 41% accuracy rate
Real-Time Memory Management: Life and Times
As high integrity real-time systems become increasingly large and complex, forcing a static model of memory usage becomes untenable. The challenge is to provide a dynamic memory model that guarantees tight and bounded time and space requirements without overburdening the developer with memory concerns. This paper provides an analysis of memory management approaches in order to characterise the tradeoffs across three semantic domains: space, time and a characterisation of memory usage information such as the lifetime of objects. A unified approach to distinguishing the merits of each memory model highlights the relationship across these three domains, thereby identifying the class of applications that benefit from targeting a particular model. Crucially, an initial investigation of this relationship identifies the direction future research must take in order to address the requirements of the next generation of complex embedded systems. Some initial suggestions are made in this regard and the memory model proposed in the Real-Time Specification for Java is evaluated in this context
Exploiting the Weak Generational Hypothesis for Write Reduction and Object Recycling
Programming languages with automatic memory management are continuing to grow in popularity due to ease of programming. However, these languages tend to allocate objects excessively, leading to inefficient use of memory and large garbage collection and allocation overheads.
The weak generational hypothesis notes that objects tend to die young in languages with automatic dynamic memory management. Much work has been done to optimize allocation and garbage collection algorithms based on this observation. Previous work has largely focused on developing efficient software algorithms for allocation and collection. However, much less work has studied architectural solutions. In this work, we propose and evaluate architectural support for assisting allocation and garbage collection.
We first study the effects of languages with automatic memory management on the memory system. As objects often die young, it is likely many objects die while in the processor\u27s caches. Writes of dead data back to main memory are unnecessary, as the data will never be used again. To study this, we develop and present architecture support to identify dead objects while they remain resident in cache and eliminate any unnecessary writes. We show that many writes out of the caches are unnecessary, and can be avoided using our hardware additions.
Next, we study the effects of using dead data in cache to assist with allocation and garbage collection. Logic is developed and presented to allow for reuse of cache space found dead to satisfy future allocation requests. We show that dead cache space can be recycled at a high rate, reducing pressure on the allocator and reducing cache miss rates. However, a full implementation of our initial approach is shown to be unscalable. We propose and study limitations to our approach, trading object coverage for scalability.
Third, we present a new approach for identifying objects that die young based on a limitation of our previous approach. We show this approach has much lower storage and logic requirements and is scalable, while only slightly decreasing overall object coverage
NMFLUX: Improving Degradation Behavior of Server Applications through Dynamic Nursery Resizing
Currently, most generational collectors are tuned to either deliver peak performance when the heap is plentiful, but yield unacceptable performance when the heap is tight or maintain good degradation behavior when the heap is tight, but deliver sub-optimal performance when the heap is plentiful. In this paper, we present NMFLUX (continuously varying the Nursery/Mature ratio), a framework that switches between using a fixed-nursery generational collector and a variable-nursery collector to achieve the best of both worlds; i.e. our framework delivers optimal performance under normal workload, and graceful performance degradation under heavy workload. We use this framework to create two generational garbage collectors and evaluate their performances in both desktop and server settings. The experimental results show that our proposed collectors can significantly improve the throughput degradation behavior of large servers while maintaining similar peak performance to the optimally configured fixed-ratio collector
Distilling the Real Cost of Production Garbage Collectors
Abridged abstract: despite the long history of garbage collection (GC) and
its prevalence in modern programming languages, there is surprisingly little
clarity about its true cost. Without understanding their cost, crucial
tradeoffs made by garbage collectors (GCs) go unnoticed. This can lead to
misguided design constraints and evaluation criteria used by GC researchers and
users, hindering the development of high-performance, low-cost GCs. In this
paper, we develop a methodology that allows us to empirically estimate the cost
of GC for any given set of metrics. By distilling out the explicitly
identifiable GC cost, we estimate the intrinsic application execution cost
using different GCs. The minimum distilled cost forms a baseline. Subtracting
this baseline from the total execution costs, we can then place an empirical
lower bound on the absolute costs of different GCs. Using this methodology, we
study five production GCs in OpenJDK 17, a high-performance Java runtime. We
measure the cost of these collectors, and expose their respective key
performance tradeoffs. We find that with a modestly sized heap, production GCs
incur substantial overheads across a diverse suite of modern benchmarks,
spending at least 7-82% more wall-clock time and 6-92% more CPU cycles relative
to the baseline cost. We show that these costs can be masked by concurrency and
generous provisioning of memory/compute. In addition, we find that newer
low-pause GCs are significantly more expensive than older GCs, and,
surprisingly, sometimes deliver worse application latency than stop-the-world
GCs. Our findings reaffirm that GC is by no means a solved problem and that a
low-cost, low-latency GC remains elusive. We recommend adopting the
distillation methodology together with a wider range of cost metrics for future
GC evaluations.Comment: Camera-ready versio
- …