Verification and Enforcement of Safe Schedules for Concurrent Programs
Automated software verification can prove the correctness of a
program with respect to a given specification and can be a valuable
aid in the difficult task of ensuring the quality of large
software systems. However, the automated verification of concurrent
software can be particularly challenging due to the vast complexity
that non-deterministic scheduling causes.
This thesis is concerned with techniques that reduce the complexity
of concurrent programs in order to ease the verification task. We
approach this problem from two orthogonal directions: state space
reduction and reduction of non-determinism in executions of
concurrent programs.
Following the former direction, we present an algorithm for dynamic
partial-order reduction, a state space reduction technique that
avoids the verification of redundant executions. Our algorithm,
EPOR, eagerly creates schedules for program fragments. In
comparison to other dynamic partial-order reduction algorithms, it
avoids redundant race and dependency checks. Our experiments show
that EPOR runs considerably faster than a state-of-the-art
algorithm, which in several cases makes it possible to analyze
programs with a higher number of threads within a given timeout.
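The pruning idea behind (dynamic) partial-order reduction can be shown with a minimal sketch: interleavings that order all pairwise *dependent* events the same way belong to the same Mazurkiewicz trace, so only one representative per trace needs to be verified. The events and the `dependent` relation below are illustrative assumptions, not the EPOR algorithm itself.

```python
from itertools import combinations

# Each event: (thread_id, op, var); op is "r" (read) or "w" (write).
T1 = [("t1", "w", "x")]
T2 = [("t2", "w", "y"), ("t2", "r", "x")]

def dependent(e1, e2):
    # Two events conflict iff they touch the same variable and one writes.
    return e1[2] == e2[2] and "w" in (e1[1], e2[1])

def interleavings(a, b):
    # Enumerate all schedules preserving each thread's program order.
    if not a:
        yield tuple(b)
    elif not b:
        yield tuple(a)
    else:
        for rest in interleavings(a[1:], b):
            yield (a[0],) + rest
        for rest in interleavings(a, b[1:]):
            yield (b[0],) + rest

def trace_class(schedule):
    # The relative order of dependent pairs determines the trace.
    return frozenset((e1, e2) for e1, e2 in combinations(schedule, 2)
                     if dependent(e1, e2))

all_schedules = list(interleavings(T1, T2))
classes = {trace_class(s) for s in all_schedules}
print(len(all_schedules), len(classes))  # 3 interleavings, 2 trace classes
```

A naive model checker explores all three interleavings; a partial-order-reduced one explores one representative per class, and a *dynamic* variant such as EPOR discovers the dependency relation while executing rather than requiring it up front.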
In the latter direction, we present a formal framework for using
incomplete verification results to extract safe schedulers. As
incomplete verification results do not need to prove the correctness
of all possible executions of a program, their complexity can be
significantly lower than that of complete verification results.
Hence, they can be obtained faster. We constrain the scheduling of programs but
not their inputs in order to preserve their full functionality. In
our framework, executions under the scheduling constraints of an
incomplete verification result are safe, deadlock-free, and fair. We
instantiate our framework with the Impact model checking algorithm
and find in our evaluation that it can be used to model check
programs that are intractable for monolithic model checkers,
synthesize synchronization via assume statements, and
guarantee fair executions.
In order to safely execute a program within the set of executions
covered by an incomplete verification, scheduling needs to be
constrained. We discuss how to extract and encode schedules from
incomplete verification results, for both finite and infinite
executions, and how to efficiently enforce scheduling constraints,
both by reducing the time to look up permission to execute
the next event and by executing independent events
concurrently (by applying partial-order reduction).
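A minimal sketch of runtime schedule enforcement, under illustrative assumptions (the class name, the event representation, and the use of a condition variable are ours, not the thesis's implementation): each thread asks permission before its next event and blocks until the extracted schedule allows it; once the constraint list is exhausted, events run unconstrained.

```python
import threading

class ScheduleEnforcer:
    """Sketch: block each thread until the extracted schedule permits
    its next event; after the constraints are exhausted, run freely."""

    def __init__(self, order):
        self.order = order                  # e.g. ["t1", "t2", "t1"]
        self.pos = 0                        # next permitted position
        self.cv = threading.Condition()

    def run(self, tid, action):
        with self.cv:
            # Permission lookup: wait until it is this thread's turn.
            while self.pos < len(self.order) and self.order[self.pos] != tid:
                self.cv.wait()
            action()                        # execute the scheduled event
            if self.pos < len(self.order):
                self.pos += 1
            self.cv.notify_all()

log = []
enf = ScheduleEnforcer(["t1", "t2", "t1"])

def worker(tid, events):
    for e in events:
        enf.run(tid, lambda e=e: log.append((tid, e)))

t1 = threading.Thread(target=worker, args=("t1", ["a", "b"]))
t2 = threading.Thread(target=worker, args=("t2", ["c"]))
t2.start(); t1.start()
t1.join(); t2.join()
print(log)  # [('t1', 'a'), ('t2', 'c'), ('t1', 'b')]
```

This sketch serializes every event for clarity; the optimization discussed above would exempt events proven independent from the lookup entirely and let them run concurrently.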
A drawback of enforcing scheduling constraints is a potential
overhead in the execution time. However, in several cases,
constrained executions turned out to be even faster than
unconstrained executions. Our experimental results show that
iteratively relaxing a schedule can significantly reduce this
overhead. Hence, it is possible to adjust the incurred execution
time overhead in order to find a sweet spot with respect to the
amount of effort for creating schedules (i.e., the duration of
verification). Interestingly, we found cases in which a much earlier
reduction of execution time overhead is obtained by choosing
favorable scheduling constraints, which suggests that execution time
performance does not simply rely on the number of scheduling
constraints but to a large extent also on their structure.
Extracting Safe Thread Schedules from Incomplete Model Checking Results
Model checkers frequently fail to completely verify a concurrent program, even if partial-order reduction is applied. The verification engineer is left in doubt whether the program is safe and the effort toward verifying the program is wasted. We present a technique that uses the results of such incomplete verification attempts to construct a (fair) scheduler that allows the safe execution of the partially verified concurrent program. This scheduler restricts the execution to schedules that have been proven safe (and prevents executions that were found to be erroneous). We evaluate the performance of our technique and show how it can be improved using partial-order reduction. While constraining the scheduler results in a considerable performance penalty in general, we show that in some cases our approach—somewhat surprisingly—even leads to faster executions.
Optimizing work stealing algorithms with scheduling constraints
The fork-join paradigm for expressing concurrency has gained popularity in conjunction with work-stealing schedulers. Random work-stealing schedulers have been shown to effectively perform dynamic load balancing, yielding provably-efficient schedules and space bounds on shared-memory architectures with uniform memory models. However, the advent of hierarchical, non-uniform multicore systems and large-scale distributed-memory architectures has reduced the efficacy of these scheduling policies. Furthermore, random work-stealing schedulers do not exploit persistence within iterative, scientific applications.
In this thesis, we prove several properties of work-stealing schedulers that enable online tracing of the tasks with very low overhead. We then describe new scheduling policies that use online schedule introspection to understand scheduler placement and thus improve performance on NUMA and distributed-memory architectures. Finally, by incorporating an inclusive data effect system into fork-join programs with schedule placement knowledge, we show how we can transform a fork-join program to significantly improve locality.
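The core data structure behind work stealing can be sketched as follows; the class and its demo are illustrative assumptions, not the thesis's scheduler. The owner pushes and pops tasks at the bottom of its deque (LIFO, which favors cache locality), while idle thieves steal from the top, where the oldest, and in fork-join programs typically the largest, tasks sit.

```python
from collections import deque

class WorkStealingDeque:
    """Illustrative single-threaded sketch of a work-stealing deque."""

    def __init__(self):
        self._tasks = deque()

    def push(self, task):
        # Owner only: add a newly forked task at the bottom.
        self._tasks.append(task)

    def pop(self):
        # Owner only: take the newest task (LIFO, good locality).
        return self._tasks.pop() if self._tasks else None

    def steal(self):
        # Thieves: take the oldest task, usually the biggest subtree.
        return self._tasks.popleft() if self._tasks else None

d = WorkStealingDeque()
for task in ["root", "left-subtree", "leaf"]:
    d.push(task)
print(d.pop(), d.steal())  # leaf root
```

A locality-aware policy of the kind described above would additionally bias victim selection toward topologically nearby workers and reuse placement information across iterations; a real concurrent implementation also needs a lock-free protocol (e.g. a Chase-Lev deque) to resolve owner/thief races.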
Inter-workgroup barrier synchronisation on graphics processing units
GPUs are parallel devices that are able to run thousands of
independent threads concurrently. Traditional GPU programs are
data-parallel, requiring little to no communication,
i.e. synchronisation, between threads. However, classical concurrency
in the context of CPUs often exploits synchronisation idioms that are
not supported on GPUs. By studying such idioms on GPUs, with an aim to
facilitate them in a portable way, a wider and more generic space of
GPU applications can be made possible.
While the breadth of this thesis extends to many aspects of GPU
systems, the common thread throughout is the global barrier: an
execution barrier that synchronises all threads executing a GPU
application. The idea of such a barrier might seem straightforward;
however, this investigation reveals many challenges and insights. In
particular, this thesis includes the following studies:
Execution models: while a general global barrier can deadlock due to
starvation on GPUs, it is shown that the scheduling guarantees of
current GPUs can be used to dynamically create an execution
environment that allows for a safe and portable global barrier
across a subset of the GPU threads.
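The behaviour of such a global barrier can be mimicked on the CPU, where the OS scheduler plays the role of the occupancy-bound execution environment: every participant is guaranteed to make progress, so waiting for all of them cannot starve. The sense-reversing design and all names below are illustrative assumptions, not the thesis's GPU implementation.

```python
import threading

class GlobalBarrier:
    """Sense-reversing barrier sketch across 'workgroups' (here plain
    threads). Safe only if all participants are co-resident and
    guaranteed to make progress, mirroring occupancy-bound execution."""

    def __init__(self, n):
        self.n = n
        self.count = 0
        self.sense = False
        self.cv = threading.Condition()

    def wait(self):
        with self.cv:
            my_sense = not self.sense
            self.count += 1
            if self.count == self.n:     # last arrival releases everyone
                self.count = 0
                self.sense = my_sense
                self.cv.notify_all()
            else:
                while self.sense != my_sense:
                    self.cv.wait()

N = 4
bar = GlobalBarrier(N)
log = []
lock = threading.Lock()

def workgroup(i):
    with lock: log.append(("phase1", i))
    bar.wait()                           # no workgroup enters phase 2 early
    with lock: log.append(("phase2", i))

threads = [threading.Thread(target=workgroup, args=(i,)) for i in range(N)]
for t in threads: t.start()
for t in threads: t.join()
print(all(p == "phase1" for p, _ in log[:N]))  # True
```

On a GPU, deadlock looms if more workgroups participate than are co-resident: a non-scheduled workgroup can never arrive, so the occupancy of the barrier must first be discovered dynamically, as the execution-models study above describes.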
Application optimisations: a set of GPU optimisations tailored for
graph applications is examined, including one optimisation enabled
by the global barrier. It is shown that these optimisations
can provide substantial performance improvements, e.g. the barrier
optimisation achieves over a 10X speedup on AMD and Intel GPUs. The
performance portability of these optimisations is investigated, as
their utility varies across input, application, and architecture.
Multitasking: because many GPUs do not support preemption,
long-running GPU compute tasks (e.g. applications that use the
global barrier) may block other GPU functions, including graphics. A
simple cooperative multitasking scheme is proposed that allows
graphics tasks to meet their deadlines with reasonable overheads.
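Without preemption, cooperative multitasking relies on the long-running task checking for a yield request at safe points. The event-based protocol below is a hypothetical CPU analogue of such a scheme, not the proposed GPU mechanism; all names are ours.

```python
import threading

yield_req = threading.Event()   # graphics asks the compute task to yield
yielded   = threading.Event()   # compute acknowledges at a safe point
resume    = threading.Event()   # graphics hands the device back
log = []

def compute(chunks):
    for i in range(chunks):
        log.append(("compute", i))      # one preemption-free chunk of work
        if yield_req.is_set():          # safe point: check for a request
            yielded.set()               # give up the device...
            resume.wait()               # ...until the graphics task is done

t = threading.Thread(target=compute, args=(4,))
yield_req.set()                         # request arrives during chunk 0
t.start()
yielded.wait()                          # wait for the cooperative yield
log.append(("graphics", 0))             # deadline-driven task runs
yield_req.clear(); resume.set()         # return the device to compute
t.join()
print(log)
```

The scheme's overhead is the polling at safe points, and its deadline guarantee is only as tight as the longest preemption-free chunk, which is the trade-off the proposed scheme balances.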
Dynamic Network Topologies
Demand for effective network defense capabilities continues to increase as cyber attacks occur more frequently and gain more prominence in the media. Current security practices stop after data encryption and network address filtering. Security at the lowest level of network infrastructure allows for greater control of how traffic flows around the network. This research details two methods for extending security practices to the physical layer of a network by modifying the network infrastructure. The first method adapts the Advanced Encryption Standard while the second method uses a Steiner tree. After the network connections are updated, the traffic is re-routed using an approximation algorithm to solve the resulting multicommodity flow problem. The results show that modifying the network connections provides additional security to the information. Additionally, this research extends previous research by addressing enterprise-size networks: networks with 5 to 1,000 nodes and 1 to 5 interfaces are tested. While the final configuration depends greatly on the starting network infrastructure, the fast execution time enables administrators to make infrastructure adjustments in response to active cyber attacks.
Computer Aided Verification
The open access two-volume set LNCS 11561 and 11562 constitutes the refereed proceedings of the 31st International Conference on Computer Aided Verification, CAV 2019, held in New York City, USA, in July 2019. The 52 full papers presented together with 13 tool papers and 2 case studies, were carefully reviewed and selected from 258 submissions. The papers were organized in the following topical sections: Part I: automata and timed systems; security and hyperproperties; synthesis; model checking; cyber-physical systems and machine learning; probabilistic systems; runtime techniques; dynamical, hybrid, and reactive systems; Part II: logics, decision procedures, and solvers; numerical programs; verification; distributed systems and networks; verification and invariants; and concurrency.