Sound Static Deadlock Analysis for C/Pthreads (Extended Version)
We present a static deadlock analysis approach for C/pthreads. The design of
our method has been guided by the requirement to analyse real-world code. Our
approach is sound (i.e., misses no deadlocks) for programs that have defined
behaviour according to the C standard, and precise enough to prove
deadlock-freedom for a large number of programs. The method consists of a
pipeline of several analyses that build on a new context- and thread-sensitive
abstract interpretation framework. We further present a lightweight dependency
analysis to identify statements relevant to deadlock analysis and thus speed up
the overall analysis. In our experimental evaluation, we succeeded in proving
deadlock-freedom for 262 programs from the Debian GNU/Linux distribution,
totalling 2.6 MLOC, in less than 11 hours.
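For illustration, a minimal (hypothetical, not drawn from the evaluated Debian packages) C/Pthreads program of the kind such an analysis targets: two threads acquire the same two mutexes in opposite orders, so a deadlock is possible and a sound analysis must report it.

    #include <pthread.h>
    #include <stdio.h>

    /* Two locks acquired in opposite orders by two threads: thread_a holds
       m1 and waits for m2 while thread_b holds m2 and waits for m1, a
       potential deadlock that a sound static analysis must report. */
    static pthread_mutex_t m1 = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t m2 = PTHREAD_MUTEX_INITIALIZER;
    static int counter;

    static void *thread_a(void *arg) {
        (void)arg;
        pthread_mutex_lock(&m1);
        pthread_mutex_lock(&m2);   /* acquisition order: m1 -> m2 */
        counter++;
        pthread_mutex_unlock(&m2);
        pthread_mutex_unlock(&m1);
        return NULL;
    }

    static void *thread_b(void *arg) {
        (void)arg;
        pthread_mutex_lock(&m2);
        pthread_mutex_lock(&m1);   /* acquisition order: m2 -> m1 (cycle) */
        counter--;
        pthread_mutex_unlock(&m1);
        pthread_mutex_unlock(&m2);
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, thread_a, NULL);
        pthread_create(&b, NULL, thread_b, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("counter = %d\n", counter);
        return 0;
    }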
Compositional Verification of Compiler Optimisations on Relaxed Memory
This paper is about verifying program transformations on an axiomatic
relaxed memory model of the kind used in C/C++ and Java. Relaxed models
present particular challenges for verifying program transformations, because
they generate many additional modes of interaction between code and context.
For a block of code being transformed, we define a denotation from its behaviour
in a set of representative contexts. Our denotation summarises interactions of the
code block with the rest of the program, both through local and global variables
and through subtle synchronisation effects due to relaxed memory. We can then
prove that a transformation does not introduce new program behaviours by comparing
the denotations of the code block before and after the transformation. Our approach
is compositional: by examining only representative contexts, we verify transformations
for any context. It is also fully abstract, meaning any valid transformation can be
verified. We cover several tricky aspects of C/C++-style memory models, including
release-acquire operations, sequentially consistent fences, and non-atomics.
We also define a variant of our denotation that is finite at the cost of losing full
abstraction. Based on this variant, we have implemented a prototype verification
tool.
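As a hedged illustration of the interactions such a denotation must capture (this is not the paper's formal construction), consider the C11 message-passing idiom below: the release/acquire pair orders the non-atomic write to data before the consumer's read, so a transformation that hoists that write past the release store would introduce a new behaviour (the consumer reading 0) and must be rejected.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    /* Message passing with release/acquire atomics (C11). The release
       store to `flag` orders the preceding non-atomic write to `data`;
       the acquire load in the consumer orders the subsequent read.
       Reordering `data = 42` past the release store would introduce the
       new outcome r == 0, so that transformation is invalid. */
    static int data;          /* non-atomic payload */
    static atomic_int flag;   /* synchronisation flag */

    static void *producer(void *arg) {
        (void)arg;
        data = 42;                                              /* non-atomic write */
        atomic_store_explicit(&flag, 1, memory_order_release);  /* publish */
        return NULL;
    }

    static void *consumer(void *arg) {
        (void)arg;
        while (atomic_load_explicit(&flag, memory_order_acquire) == 0)
            ;                                                   /* wait for publication */
        int r = data;         /* guaranteed to read 42 */
        printf("r = %d\n", r);
        return NULL;
    }

    int main(void) {
        pthread_t p, c;
        pthread_create(&p, NULL, producer, NULL);
        pthread_create(&c, NULL, consumer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }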
Polarimetric Properties of Event Horizon Telescope Targets from ALMA
We present the results from a full polarization study carried out with the Atacama Large Millimeter/submillimeter Array (ALMA) during the first Very Long Baseline Interferometry (VLBI) campaign, which was conducted in 2017 April in the λ3 mm and λ1.3 mm bands, in concert with the Global mm-VLBI Array (GMVA) and the Event Horizon Telescope (EHT), respectively. We determine the polarization and Faraday properties of all VLBI targets, including Sgr A*, M87, and a dozen radio-loud active galactic nuclei (AGNs), in the two bands at several epochs in a time window of 10 days. We detect high linear polarization fractions (2%–15%) and large rotation measures (RM > 10^3.3–10^5.5 rad m^−2), confirming the trends of previous AGN studies at millimeter wavelengths. We find that blazars are more strongly polarized than other AGNs in the sample, while exhibiting (on average) order-of-magnitude lower RM values, consistent with the AGN viewing angle unification scheme. For Sgr A* we report a mean RM of (−4.2 ± 0.3) × 10^5 rad m^−2 at 1.3 mm, consistent with measurements over the past decade and, for the first time, an RM of (−2.1 ± 0.1) × 10^5 rad m^−2 at 3 mm, suggesting that about half of the Faraday rotation at 1.3 mm may occur between the 3 mm photosphere and the 1.3 mm source. We also report the first unambiguous measurement of RM toward the M87 nucleus at millimeter wavelengths, which undergoes significant changes in magnitude and sign reversals on a one-year timescale, spanning the range from −1.2 to 0.3 × 10^5 rad m^−2 at 3 mm and from −4.1 to 1.5 × 10^5 rad m^−2 at 1.3 mm. Given this time variability, we argue that, unlike the case of Sgr A*, the RM in M87 does not provide an accurate estimate of the mass accretion rate onto the black hole. We put forward a two-component model, comprised of a variable compact region and a static extended region, that can simultaneously explain the polarimetric properties observed by both the EHT (on horizon scales) and ALMA (which observes the combined emission from both components). These measurements provide critical constraints for the calibration, analysis, and interpretation of simultaneously obtained VLBI data with the EHT and GMVA.
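For context (these standard relations are not stated in the abstract), the rotation measure is the coefficient of the λ² dependence of the polarization angle and, for a simple external Faraday screen, probes the electron density and line-of-sight magnetic field along the path:

    % standard Faraday rotation relations (background, not from the abstract)
    \chi(\lambda) = \chi_0 + \mathrm{RM}\,\lambda^2,
    \qquad
    \mathrm{RM} \simeq 0.812 \int n_e\, B_{\parallel}\, \mathrm{d}l \ \ \mathrm{rad\,m^{-2}},

with n_e in cm^−3, B_∥ in μG, and dl in pc.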
Lock correctness
Locks are a frequently used synchronisation mechanism in shared memory concurrent
programs. They are used to enforce atomicity of certain code portions, avoid
undefined behaviour due to data races, and hide weak memory effects of the underlying
hardware architectures (i.e., they provide the illusion of interleaved execution).
To provide these guarantees, the correct interplay of a number of subsystems is
required. We distinguish between the application level, the transformation level,
and the hardware level.
On the application level, the programmer is required to correctly use the locks.
This amounts to avoiding data races, deadlocks, and other errors in using the
locking primitives, such as unlocking a lock that is not currently held.
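As a hedged side note (not part of the thesis), POSIX error-checking mutexes turn the last kind of misuse into a reported error code rather than undefined behaviour; a static analysis aims to find such errors before the program runs at all.

    #include <errno.h>
    #include <pthread.h>
    #include <stdio.h>

    /* Unlocking a mutex the calling thread does not hold is undefined
       behaviour for a default mutex; with PTHREAD_MUTEX_ERRORCHECK the
       misuse is reported as EPERM instead. */
    int main(void) {
        pthread_mutex_t m;
        pthread_mutexattr_t attr;

        pthread_mutexattr_init(&attr);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
        pthread_mutex_init(&m, &attr);

        int rc = pthread_mutex_unlock(&m);   /* lock not held by this thread */
        if (rc == EPERM)
            fprintf(stderr, "misuse detected: unlock of a lock not held\n");

        pthread_mutex_destroy(&m);
        pthread_mutexattr_destroy(&attr);
        return 0;
    }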
On the transformation level, the compiler needs to correctly optimise the
program and correctly map its operations to machine code. This requires knowing,
for example, when it is safe to move a code statement in a thread past a lock
operation such that the resulting thread is a refinement of the original thread.
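A hedged sketch of the question being asked here (not the thesis's formal refinement relation): moving an access into a critical section only removes interleavings, so the transformed thread still refines the original, whereas moving an access out of a critical section can expose new races and behaviours.

    #include <pthread.h>

    /* "Roach motel" intuition for moving statements past lock operations.
       Moving the bookkeeping write to `logged` INTO the critical section
       (original -> transformed) only removes interleavings, so the
       transformed thread refines the original. The reverse move, pulling
       `shared++` OUT of the critical section, could race with other
       threads and is not a refinement in general. */
    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static int shared;
    static int logged;

    void original(void) {
        logged = 1;               /* bookkeeping write, outside the lock */
        pthread_mutex_lock(&m);
        shared++;
        pthread_mutex_unlock(&m);
    }

    void transformed(void) {
        pthread_mutex_lock(&m);
        logged = 1;               /* moved past the lock acquisition */
        shared++;
        pthread_mutex_unlock(&m);
    }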
On the hardware level, the lock operations themselves need to be implemented
correctly, by using low-level primitives such as memory fences and read-modify-write
operations. This requires knowing the relaxations of memory ordering that
could occur on the target hardware, and the effect of the primitives that can be
used to restore consistency (such as memory fences).
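As a hedged sketch of what such an implementation looks like, below is a minimal test-and-set spinlock built from a C11 read-modify-write; the acquire/release orderings are what ultimately compile down to the required fences on weakly ordered hardware.

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Minimal test-and-set spinlock from a C11 read-modify-write. The
       acquire ordering on lock and release ordering on unlock prevent the
       compiler and hardware from letting critical-section accesses leak
       out of the lock; on weakly ordered targets they compile down to the
       appropriate fences or fenced RMW instructions. */
    typedef struct { atomic_bool held; } spinlock_t;

    static inline void spin_lock(spinlock_t *l) {
        while (atomic_exchange_explicit(&l->held, true, memory_order_acquire))
            ;   /* spin until the lock is observed free and taken */
    }

    static inline void spin_unlock(spinlock_t *l) {
        atomic_store_explicit(&l->held, false, memory_order_release);
    }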
In this thesis, we address an aspect of each of the three levels of correctness
mentioned above. On the application level, we provide a sound static approach for
deadlock analysis of C/Pthreads programs. The approach is based on a context- and
thread-sensitive abstract interpretation framework, and uses a lightweight
dependency analysis to identify statements relevant to deadlock analysis. To
quantify scalability, we have applied our approach to a large number of concurrent
programs from the Debian GNU/Linux distribution.
On the transformation level, we provide a new theory of refinement between
threads, which is phrased in terms of state transitions between lock operations.
We show that the theory is more precise than existing approaches, and that its
application in a compiler testing setting leads to large performance gains compared
to a previous approach.
On the hardware level, we provide a toolchain to test the memory model of
GPUs and the behaviour of code running on them. We automatically generate
short concurrent code snippets that, when run on hardware, reveal interesting
properties about the underlying memory model. These code snippets include
idioms that typically appear in implementations of synchronisation operations. We
further manually test several GPU locking primitives. Our testing has revealed
surprising hardware behaviours and bugs in lock implementations.
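A hedged illustration of the kind of snippet such a toolchain generates (simplified here to C11 threads rather than GPU code): the classic store-buffering litmus test, whose "weak" outcome r0 == 0 && r1 == 0 is allowed by the memory model and commonly observable on weakly ordered hardware.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    /* Store-buffering (SB) litmus test with relaxed atomics. The weak
       outcome r0 == 0 && r1 == 0 is permitted by the C11 memory model;
       whether and how often it shows up on a given device is the kind of
       evidence memory-model testing collects. */
    static atomic_int x, y;
    static int r0, r1;

    static void *t0(void *arg) {
        (void)arg;
        atomic_store_explicit(&x, 1, memory_order_relaxed);
        r0 = atomic_load_explicit(&y, memory_order_relaxed);
        return NULL;
    }

    static void *t1(void *arg) {
        (void)arg;
        atomic_store_explicit(&y, 1, memory_order_relaxed);
        r1 = atomic_load_explicit(&x, memory_order_relaxed);
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, t0, NULL);
        pthread_create(&b, NULL, t1, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("r0 = %d, r1 = %d\n", r0, r1);  /* 0,0 is weak but allowed */
        return 0;
    }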
Sound static deadlock analysis for C/Pthreads
We present a static deadlock analysis for C/Pthreads. The design of our method has been guided by the requirement to analyse real-world code. Our approach is sound (i.e., misses no deadlocks) for programs that have defined behaviour according to the C standard and the Pthreads specification, and is precise enough to prove deadlock-freedom for a large number of such programs. The method consists of a pipeline of several analyses that build on a new context- and thread-sensitive abstract interpretation framework. We further present a lightweight dependency analysis to identify statements relevant to deadlock analysis and thus speed up the overall analysis. In our experimental evaluation, we succeeded in proving deadlock-freedom for 292 programs from the Debian GNU/Linux distribution, totalling 2.3 MLOC, in 4 hours.
Don't Sit on the Fence − A Static Analysis Approach to Automatic Fence Insertion
Modern architectures rely on memory fences to prevent undesired weakenings of memory consistency. As the fences’ semantics may be subtle, the automation of their placement is highly desirable. But precise methods for restoring consistency do not scale to deployed systems code. We choose to trade some precision for genuine scalability: our technique is suitable for large code bases. We implement it in our new musketeer tool, and report experiments on more than 700 executables from packages found in Debian GNU/Linux 7.1, including memcached with about 10,000 LoC.
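As a hedged sketch of what fence insertion restores (not musketeer's actual output), placing a sequentially consistent fence between each thread's store and subsequent load rules out the weak store-buffering outcome; on x86 such a fence typically compiles to mfence, and on Power/ARM to a full barrier.

    #include <stdatomic.h>

    /* Store buffering with a seq_cst fence between each thread's store and
       load. The fences rule out the weak outcome r0 == 0 && r1 == 0. An
       automatic fence-insertion tool has to find such placements without
       inserting fences where they are not needed. */
    static atomic_int x, y;

    int thread0(void) {
        atomic_store_explicit(&x, 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);   /* inserted fence */
        return atomic_load_explicit(&y, memory_order_relaxed);
    }

    int thread1(void) {
        atomic_store_explicit(&y, 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);   /* inserted fence */
        return atomic_load_explicit(&x, memory_order_relaxed);
    }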