831 research outputs found
GPUVerify: A Verifier for GPU Kernels
We present a technique for verifying race- and divergence-freedom of GPU kernels that are written in mainstream ker-nel programming languages such as OpenCL and CUDA. Our approach is founded on a novel formal operational se-mantics for GPU programming termed synchronous, delayed visibility (SDV) semantics. The SDV semantics provides a precise definition of barrier divergence in GPU kernels and allows kernel verification to be reduced to analysis of a sequential program, thereby completely avoiding the need to reason about thread interleavings, and allowing existing modular techniques for program verification to be leveraged. We describe an efficient encoding for data race detection and propose a method for automatically inferring loop invari-ants required for verification. We have implemented these techniques as a practical verification tool, GPUVerify, which can be applied directly to OpenCL and CUDA source code. We evaluate GPUVerify with respect to a set of 163 kernels drawn from public and commercial sources. Our evaluation demonstrates that GPUVerify is capable of efficient, auto-matic verification of a large number of real-world kernels
Using Colored Stochastic Petri Net (CS-PN) software for protocol specification, validation, and evaluation
The specification, verification, validation, and evaluation, which make up the different steps of the CS-PN software are outlined. The colored stochastic Petri net software is applied to a Wound/Wait protocol decomposable into two principal modules: request or couple (transaction, granule) treatment module and wound treatment module. Each module is specified, verified, validated, and then evaluated separately, to deduce a verification, validation and evaluation of the complete protocol. The colored stochastic Petri nets tool is shown to be a natural extension of the stochastic tool, adapted to distributed systems and protocols, because the color conveniently takes into account the numerous sites, transactions, granules and messages
Using Flow Specifications of Parameterized Cache Coherence Protocols for Verifying Deadlock Freedom
We consider the problem of verifying deadlock freedom for symmetric cache
coherence protocols. In particular, we focus on a specific form of deadlock
which is useful for the cache coherence protocol domain and consistent with the
internal definition of deadlock in the Murphi model checker: we refer to this
deadlock as a system- wide deadlock (s-deadlock). In s-deadlock, the entire
system gets blocked and is unable to make any transition. Cache coherence
protocols consist of N symmetric cache agents, where N is an unbounded
parameter; thus the verification of s-deadlock freedom is naturally a
parameterized verification problem. Parametrized verification techniques work
by using sound abstractions to reduce the unbounded model to a bounded model.
Efficient abstractions which work well for industrial scale protocols typically
bound the model by replacing the state of most of the agents by an abstract
environment, while keeping just one or two agents as is. However, leveraging
such efficient abstractions becomes a challenge for s-deadlock: a violation of
s-deadlock is a state in which the transitions of all of the unbounded number
of agents cannot occur and so a simple abstraction like the one above will not
preserve this violation. In this work we address this challenge by presenting a
technique which leverages high-level information about the protocols, in the
form of message sequence dia- grams referred to as flows, for constructing
invariants that are collectively stronger than s-deadlock. Efficient
abstractions can be constructed to verify these invariants. We successfully
verify the German and Flash protocols using our technique
Automatic visual recognition using parallel machines
Invariant features and quick matching algorithms are two major concerns in the area of automatic visual recognition. The former reduces the size of an established model database, and the latter shortens the computation time. This dissertation, will discussed both line invariants under perspective projection and parallel implementation of a dynamic programming technique for shape recognition. The feasibility of using parallel machines can be demonstrated through the dramatically reduced time complexity.
In this dissertation, our algorithms are implemented on the AP1000 MIMD parallel machines. For processing an object with a features, the time complexity of the proposed parallel algorithm is O(n), while that of a uniprocessor is O(n2). The two applications, one for shape matching and the other for chain-code extraction, are used in order to demonstrate the usefulness of our methods.
Invariants from four general lines under perspective projection are also discussed in here. In contrast to the approach which uses the epipolar geometry, we investigate the invariants under isotropy subgroups. Theoretically speaking, two independent invariants can be found for four general lines in 3D space. In practice, we show how to obtain these two invariants from the projective images of four general lines without the need of camera calibration.
A projective invariant recognition system based on a hypothesis-generation-testing scheme is run on the hypercube parallel architecture. Object recognition is achieved by matching the scene projective invariants to the model projective invariants, called transfer. Then a hypothesis-generation-testing scheme is implemented on the hypercube parallel architecture
- …