Distilling the Real Cost of Production Garbage Collectors
Abridged abstract: despite the long history of garbage collection (GC) and
its prevalence in modern programming languages, there is surprisingly little
clarity about its true cost. Without understanding this cost, crucial
tradeoffs made by garbage collectors (GCs) go unnoticed. This can lead to
misguided design constraints and evaluation criteria used by GC researchers and
users, hindering the development of high-performance, low-cost GCs. In this
paper, we develop a methodology that allows us to empirically estimate the cost
of GC for any given set of metrics. By distilling out the explicitly
identifiable GC cost, we estimate the intrinsic application execution cost
using different GCs. The minimum distilled cost forms a baseline. Subtracting
this baseline from the total execution costs, we can then place an empirical
lower bound on the absolute costs of different GCs. Using this methodology, we
study five production GCs in OpenJDK 17, a high-performance Java runtime. We
measure the cost of these collectors, and expose their respective key
performance tradeoffs. We find that with a modestly sized heap, production GCs
incur substantial overheads across a diverse suite of modern benchmarks,
spending at least 7-82% more wall-clock time and 6-92% more CPU cycles relative
to the baseline cost. We show that these costs can be masked by concurrency and
generous provisioning of memory/compute. In addition, we find that newer
low-pause GCs are significantly more expensive than older GCs, and,
surprisingly, sometimes deliver worse application latency than stop-the-world
GCs. Our findings reaffirm that GC is by no means a solved problem and that a
low-cost, low-latency GC remains elusive. We recommend adopting the
distillation methodology together with a wider range of cost metrics for future
GC evaluations.
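To make the distillation arithmetic concrete, here is a minimal sketch with made-up figures (the class and all numbers are hypothetical; the paper derives its costs from measured wall-clock time and CPU cycles):

```java
import java.util.Map;

// A minimal sketch of the distillation arithmetic described above.
// All figures are invented for illustration only.
public class Distillation {
    public static void main(String[] args) {
        // Hypothetical total execution costs (e.g., CPU cycles) per GC.
        Map<String, Double> total = Map.of(
                "Serial", 110.0, "Parallel", 106.0, "G1", 112.0, "ZGC", 130.0);
        // Hypothetical costs after distilling out the explicitly
        // identifiable GC work (pauses, barriers, GC threads, ...).
        Map<String, Double> distilled = Map.of(
                "Serial", 102.0, "Parallel", 100.0, "G1", 104.0, "ZGC", 108.0);

        // The minimum distilled cost approximates the intrinsic
        // application execution cost and serves as the baseline.
        double baseline = distilled.values().stream()
                .mapToDouble(Double::doubleValue).min().orElseThrow();

        // Subtracting the baseline from each total cost yields an
        // empirical lower bound on that collector's absolute cost.
        total.forEach((gc, cost) -> System.out.printf(
                "%-8s overhead >= %.0f%%%n", gc, 100 * (cost - baseline) / baseline));
    }
}
```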
Remote Opportunities: A Rethinking and Retooling
Introducing technology as a sustainable means of creating, connecting, and collaborating reveals the need to carefully consider subtle aspects of deployment strategies and support in remote regions. To comprehensively address both the cultural and technical issues for educational infrastructure, we consider two elements to be key: (1) a staged deployment approach, involving both educators and community members, coupled with (2) uniquely designed collaborative Integrated Development Environments (IDEs) to aid constructivism. This paper presents our current experience with these elements in the context of a pilot project for Aboriginal communities on the west coast of British Columbia. These local communities have been working alongside our group on a staged deployment of programs throughout southern Vancouver Island. In our next phase we will extend this to more remote regions in the north island and coastal areas. By building on a philosophy of Community-Driven Initiatives for Technology (C-DIT), we hope to secure community involvement in the development and testing of the necessary tool support. These tools specifically target IDEs for the development of programming skills, and support our long-term goal of helping secondary and post-secondary students appreciate both the process and the art of programming.
Towards Zero-Overhead Disambiguation of Deep Priority Conflicts
**Context** Context-free grammars are widely used for language prototyping
and implementation. They allow formalizing the syntax of domain-specific or
general-purpose programming languages concisely and declaratively. However, the
natural and concise way of writing a context-free grammar is often ambiguous.
Therefore, grammar formalisms support extensions in the form of *declarative
disambiguation rules* to specify operator precedence and associativity, solving
ambiguities that are caused by the subset of the grammar that corresponds to
expressions.
**Inquiry** Implementing support for declarative disambiguation within a
parser typically comes with one or more of the following limitations in
practice: poor parsing performance, or a lack of modularity (i.e.,
disallowing the composition of grammar fragments of potentially different
languages). The latter subject is generally addressed by scannerless
generalized parsers. We aim to equip scannerless generalized parsers with novel
disambiguation methods that are inherently performant, without compromising the
concerns of modularity and language composition.
**Approach** In this paper, we present a novel low-overhead implementation
technique for disambiguating deep associativity and priority conflicts in
scannerless generalized parsers, using lightweight data dependencies.
**Knowledge** Ambiguities with respect to operator precedence and
associativity arise from combining the various operators of a language. While
*shallow conflicts* can be resolved efficiently by one-level tree patterns,
*deep conflicts* require more elaborate techniques, because they can occur
arbitrarily nested in a tree. Current state-of-the-art approaches to solving
deep priority conflicts come with a severe performance overhead.
**Grounding** We evaluated our new approach against state-of-the-art
declarative disambiguation mechanisms. By parsing a corpus of popular
open-source repositories written in Java and OCaml, we found that our approach
yields speedups of up to 1.73x over a grammar-rewriting technique when parsing
programs with deep priority conflicts, with a modest overhead of 1-2% when
parsing programs without deep conflicts.
**Importance** A recent empirical study shows that deep priority conflicts
are indeed widespread in real-world programs. The study shows that in a corpus
of popular OCaml projects on GitHub, up to 17% of the source files contain
deep priority conflicts. However, no solution in the literature
addresses efficient disambiguation of deep priority conflicts with support for
modular and composable syntax definitions.
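To make the shallow/deep distinction concrete, here is a hedged Java sketch; it illustrates the problem, not the paper's data-dependent solution, and Node, priority, and the operator table are all hypothetical:

```java
import java.util.List;

// One-level tree patterns suffice for *shallow* priority conflicts:
// parsing "1 + 2 * 3" as (1 + 2) * 3 places the lower-priority "+"
// directly under "*", which a local parent/child check detects.
record Node(String op, List<Node> children) {}

class ShallowFilter {
    // Hypothetical priority table: higher numbers bind tighter.
    static int priority(String op) {
        return switch (op) {
            case "*" -> 2;
            case "+" -> 1;
            default -> 3;  // literals, identifiers, parenthesised terms
        };
    }

    // Reject any node with a direct child of strictly lower priority
    // (associativity and parentheses omitted for brevity).
    static boolean shallowConflict(Node parent) {
        return parent.children().stream()
                .anyMatch(c -> priority(c.op()) < priority(parent.op()));
    }
    // *Deep* conflicts resist such local checks. In OCaml-like syntax,
    // whether "a + if b then c else d + e" may keep the "if" inside the
    // left operand of the final "+" depends on material arbitrarily far
    // from the "if" node itself, so no fixed-depth pattern decides it.
    // Closing that gap cheaply is what the paper's lightweight data
    // dependencies are for.
}
```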
First Class Copy & Paste
The Subtext project seeks to make programming fundamentally easier by altering the nature of programming languages and tools. This paper defines an operational semantics for an essential subset of the Subtext language. It also presents a fresh approach to the problems of mutable state, I/O, and concurrency. Inclusions reify copy & paste edits into persistent relationships that propagate changes from their source into their destination. Inclusions form the basis of a programming language in which there is no distinction between a program's representation and its execution. Like spreadsheets, programs are live executions within a persistent runtime, and programming is direct manipulation of these executions via a graphical user interface. There is no need to encode programs into source text. Mutation of state is effected by the computation of hypothetical recursive variants of the state, which can then be lifted into new versions of the state. Transactional concurrency is based upon queued single-threaded execution. Speculative execution of queued hypotheticals provides concurrency as a semantically transparent implementation optimization.
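A minimal sketch of the inclusion idea, assuming a hypothetical Cell abstraction (Subtext programs are structured and edited graphically; this only illustrates how a reified copy & paste propagates later edits):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a paste is not a snapshot but a persistent link
// ("inclusion") back to its source, so later edits flow downstream.
class Cell {
    private String content;
    private final List<Cell> inclusions = new ArrayList<>(); // paste sites

    Cell(String content) { this.content = content; }

    // Reify copy & paste as a persistent relationship.
    Cell paste() {
        Cell copy = new Cell(content);
        inclusions.add(copy);
        return copy;
    }

    // An edit to the source propagates into every inclusion (Subtext
    // merges such edits with local changes; here we simply overwrite).
    void edit(String newContent) {
        content = newContent;
        inclusions.forEach(c -> c.edit(newContent));
    }

    String content() { return content; }
}
```

After `Cell dest = source.paste()`, a later `source.edit(...)` updates `dest.content()` as well, which is the propagation behaviour the abstract describes.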
Locking Discipline Inference and Checking
Concurrency is a requirement for much modern software, but the implementation of multithreaded algorithms comes at the risk of errors such as data races. Programmers can prevent data races by documenting and obeying a locking discipline, which indicates which locks must be held in order to access which data. This paper introduces a formal semantics for locking specifications that gives a guarantee of race freedom. The paper also provides two implementations of the formal semantics for the Java language: one based on abstract interpretation and one based on type theory. To the best of our knowledge, these are the first tools that can soundly infer and check a locking discipline for Java. Our experiments compare the implementations with one another and with annotations written by programmers.
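For a flavour of what such a locking discipline looks like in Java source, here is a small example of ours (not from the paper) using the JCIP-style @GuardedBy annotation; the annotation documents which lock guards which data, and a checker of the kind described verifies that every access obeys it:

```java
import javax.annotation.concurrent.GuardedBy;

class Counter {
    private final Object lock = new Object();

    @GuardedBy("lock")        // discipline: hold `lock` to access `count`
    private long count;

    void increment() {
        synchronized (lock) { // lock held: the access is race-free
            count++;
        }
    }

    long get() {
        synchronized (lock) {
            return count;
        }
    }
}
```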
Safer typing of complex API usage through Java generics
When several incompatible implementations of a single API are in use in a Java program, the danger exists that instances from different implementations may inadvertently be mixed, leading to errors. In this paper we show how to use generics to prevent such mixing. The core idea of the approach is to add a type parameter to the interfaces of the API, and tie the classes that make up an implementation to a unique choice of type parameter. In this way methods of the API can only be invoked with arguments that belong to the same implementation. We show that the presence of a type parameter in the interfaces does not violate the principle of interface-based programming: clients can still completely abstract over the choice of implementation. In addition, we demonstrate how code can be reused between different implementations, how implementations can be defined as extensions of other implementations, and how different implementations may be mixed in a controlled and safe manner. To explore the feasibility of the approach, gauge its usability, and identify any issues that may crop up in practical usage, we have refactored a fairly large existing API-based application suite, and we report on the experience gained in the process.
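A minimal sketch of the core idea, with hypothetical names: a "brand" type parameter on the API interface ties each implementation to a unique marker type, so the compiler rejects cross-implementation mixing while clients stay implementation-agnostic:

```java
// B is the "brand" of an implementation; only same-brand nodes may link.
interface Node<B> {
    void connectTo(Node<B> other);
}

// Implementation 1, branded with the marker type DomBrand.
final class DomBrand {}
class DomNode implements Node<DomBrand> {
    public void connectTo(Node<DomBrand> other) { /* ... */ }
}

// Implementation 2, branded with its own marker, SimpleBrand.
final class SimpleBrand {}
class SimpleNode implements Node<SimpleBrand> {
    public void connectTo(Node<SimpleBrand> other) { /* ... */ }
}

class Client {
    // Clients abstract over the implementation by staying generic in B.
    static <B> void link(Node<B> a, Node<B> b) { a.connectTo(b); }

    public static void main(String[] args) {
        link(new DomNode(), new DomNode());      // fine: same brand
        // link(new DomNode(), new SimpleNode()); // rejected by the compiler
    }
}
```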
Sound Atomicity Inference for Data-Centric Synchronization
Data-Centric Concurrency Control (DCCC) shifts the reasoning about
concurrency restrictions from control structures to data declaration. It is a
high-level declarative approach that abstracts away from the actual concurrency
control mechanism(s) in use. Despite its advantages, the practical use of DCCC
is hindered by the fact that it may require many annotations and/or multiple
implementations of the same method to cope with differently qualified
parameters. Moreover, the existing DCCC solutions do not address the use of
interfaces, precluding their use in most object-oriented programs. To overcome
these limitations, in this paper we present AtomiS, a new DCCC model based on a
rigorously defined type-sound programming language. Programming with AtomiS
requires only (atomic)-qualifying types of parameters and return values in
interface definitions, and of fields in class definitions. From this atomicity
specification, a static analysis infers the atomicity constraints that are
local to each method, considering valid only the method variants that are
consistent with the specification, and performs code generation for all valid
variants of each method. The generated code is then the target for automatic
injection of concurrency control primitives by means of any desired automatic
technique, together with its associated atomicity and deadlock-freedom
guarantees, which can be plugged into the model's pipeline. We present the
foundations for the AtomiS
analysis and synthesis, with formal guarantees that the generated program is
well-typed and that it corresponds behaviourally to the original one. The
proofs are mechanised in Coq. We also provide a Java implementation that
showcases the applicability of AtomiS in real-life programs.
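The specification style the abstract describes might look roughly as follows; the @Atomic annotation and all identifiers here are made up, since the paper's concrete (atomic) qualifier syntax is not shown above:

```java
// Hypothetical rendering of an AtomiS-style atomicity specification.
// Only types in interfaces and fields are qualified; no locks or other
// concurrency control appear anywhere in user code.
@interface Atomic {}

interface Account {
    void deposit(@Atomic Balance b, long amount); // qualified parameter
    @Atomic Balance balance();                    // qualified return value
}

class Balance {
    @Atomic Balance linked;  // qualified field in a class definition
    long cents;
}
// From such a specification, the AtomiS static analysis infers the
// atomicity constraints local to each method, keeps only the method
// variants consistent with the specification, and generates code into
// which concurrency control primitives are injected automatically.
```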
Predictive Monitoring against Pattern Regular Languages
In this paper, we focus on the problem of dynamically analysing concurrent
software against high-level temporal specifications. Existing techniques for
runtime monitoring against such specifications are primarily designed for
sequential software and remain inadequate in the presence of concurrency --
violations may be observed only in intricate thread interleavings, requiring
many re-runs of the underlying software. Motivated by this, we study the problem of
predictive runtime monitoring, inspired by the analogous problem of predictive
data race detection studied extensively recently. The predictive runtime
monitoring question asks, given an execution, whether it can be soundly
reordered to expose violations of a specification.
In this paper, we focus on specifications that are given in regular
languages. Our notion of reorderings is trace equivalence, where an execution
is considered a reordering of another if it can be obtained from the latter by
successively commuting adjacent independent actions. We first show that the
problem of predictive monitoring admits a super-linear lower bound of n^α,
where n is the number of events in the execution and α is a parameter
describing the degree of commutativity. As a result, predictive runtime
monitoring even in this setting is unlikely to be efficiently solvable.
To address this, we identify a subclass of regular languages, called pattern
languages (and their extension, generalized pattern languages). Pattern
languages can naturally express specific orderings of some number of (labelled)
events, and are inspired by the popular empirical 'small bug depth'
hypothesis. More importantly, we show that for pattern (and generalized
pattern) languages, the predictive monitoring problem can be solved using a
constant-space streaming linear-time algorithm.
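To illustrate what a pattern language expresses (but not the paper's predictive machinery, which additionally reasons about sound reorderings under commutativity), here is a hedged constant-space streaming matcher of ours for the observed order of labelled events; all names are hypothetical:

```java
// A pattern language, in the abstract's sense, prescribes that certain
// labelled events occur in a specific order. This monitor checks only
// the *observed* order, as a subsequence of the stream, in constant
// space (a single index) and linear time.
class PatternMonitor {
    private final String[] pattern; // event labels, in the required order
    private int next = 0;           // constant state: index into pattern

    PatternMonitor(String... pattern) { this.pattern = pattern; }

    // Feed one event; returns true once every pattern event has been
    // seen in order.
    boolean observe(String label) {
        if (next < pattern.length && pattern[next].equals(label)) next++;
        return next == pattern.length;
    }

    public static void main(String[] args) {
        PatternMonitor m = new PatternMonitor("fork(t)", "read(x)", "join(t)");
        String[] trace = {"write(y)", "fork(t)", "write(x)", "read(x)", "join(t)"};
        for (String e : trace)
            if (m.observe(e)) System.out.println("pattern matched at " + e);
    }
}
```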