Search CORE

1,163 research outputs found

High-level synthesis of fine-grained weakly consistent C concurrency

Author: Ramanathan Nadesh
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 22/05/2019
Field of study

High-level synthesis (HLS) is the process of automatically compiling high-level programs into a netlist (collection of gates). Given an input program, HLS tools exploit its inherent parallelism and pipelining opportunities to generate efficient customised hardware. C-based programs are the most popular input for HLS tools, but these tools historically only synthesise sequential C programs. As the appeal for software concurrency rises, HLS tools are beginning to synthesise concurrent C programs, such as C/C++ pthreads and OpenCL. Although supporting software concurrency leads to better hardware parallelism, shared memory synchronisation is typically serialised to ensure correct memory behaviour, via locks. Locks are safety resources that ensure exclusive access of shared memory, eliminating data races and providing synchronisation guarantees for programmers. As an alternative to lock-based synchronisation, the C memory model also defines the possibility of lock-free synchronisation via fine-grained atomic operations (`atomics'). However, most HLS tools either do not support atomics at all or implement atomics using locks. Instead, we treat the synthesis of atomics as a scheduling problem. We show that we can augment the intra-thread memory constraints during memory scheduling of concurrent programs to support atomics. On average, hardware generated by our method is 7.5x faster than the state-of-the-art, for our set of experiments. Our method of synthesising atomics enables several unique possibilities. Chiefly, we are capable of supporting weakly consistent (`weak') atomics, which necessitate fewer ordering constraints compared to sequentially consistent (SC) atomics. However, implementing weak atomics is complex and error-prone and hence we formally verify our methods via automated model checking to ensure our generated hardware is correct. Furthermore, since the C memory model defines memory behaviour globally, we can globally analyse the entire program to generate its memory constraints. Additionally, we can also support loop pipelining by extending our methods to generate inter-iteration memory constraints. On average, weak atomics, global analysis and loop pipelining improve performance by 1.6x, 3.4x and 1.4x respectively, for our set of experiments. Finally, we present a case study of a real-world example via an HLS-based Google PageRank algorithm, whose performance improves by 4.4x via lock-free streaming and work-stealing.Open Acces

ZENODO

Spiral - Imperial College Digital Repository

Automated Failure Explanation Through Execution Comparison

Author: Sumner William Nicholas
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2013
Field of study

When fixing a bug in software, developers must build an understanding or explanation of the bug and how the bug flows through a program. The effort that developers must put into building this explanation is costly and laborious. Thus, developers need tools that can assist them in explaining the behavior of bugs. Dynamic slicing is one technique that can effectively show how a bug propagates through an execution up to the point where a program fails. However, dynamic slices are large because they do not just explain the bug itself; they include extra information that explains any observed behavior that might be connected to the bug. Thus, the explanation of the bug is hidden within this other tangentially related information. This dissertation addresses the problem and shows how a failing execution and a correct execution may be compared in order to construct explanations that include only information about what caused the bug. As a result, these automated explanations are significantly more concise than those explanations produced by existing dynamic slicing techniques. To enable the comparison of executions, we develop new techniques for dynamic analyses that identify the commonalities and differences between executions. First, we devise and implement the notion of a point within an execution that may exist across multiple executions. We also note that comparing executions involves comparing the state or variables and their values that exist within the executions at different execution points. Thus, we design an approach for identifying the locations of variables in different executions so that their values may be compared. Leveraging these tools, we design a system for identifying the behaviors within an execution that can be blamed for a bug and that together compose an explanation for the bug. These explanations are up to two orders of magnitude smaller than those produced by existing state of the art techniques. We also examine how different choices of a correct execution for comparison can impact the practicality or potential quality of the explanations produced via our system

CiteSeerX

Recommended from our members

Systematic techniques for more effective fault localization and program repair

Author: Gopinath Divya
Publication venue
Publication date: 24/02/2016
Field of study

Debugging faulty code is a tedious process that is often quite expensive and can require much manual effort. Developers typically perform debugging in two key steps: (1) fault localization, i.e., identifying the location of faulty line(s) of code; and (2) program repair, i.e., modifying the code to remove the fault(s). Automating debugging to reduce its cost has been the focus of a number of research projects during the last decade, which have introduced a variety of techniques. However, existing techniques suffer from two basic limitations. One, they lack accuracy to handle real programs. Two, they focus on automating only one of the two key steps, thereby leaving the other key step to the developer. Our thesis is that an approach that integrates systematic search based on state-of-the-art constraint solvers with techniques to analyze artifacts that describe application specific properties and behaviors, provides the basis for developing more effective debugging techniques. We focus on faults in programs that operate on structurally complex inputs, such as heap-allocated data or relational databases. Our approach lays the foundation for a unified framework for localization and repair of faults in programs. We embody our thesis in a suite of integrated techniques based on propositional satisfiability solving, correctness specifications analysis, test-spectra analysis, and rule-learning algorithms from machine learning, implement them as a prototype tool-set, and evaluate them using several subject programs.Electrical and Computer Engineerin

Texas ScholarWorks

Recommended from our members

Software integration testing based on communication coverage criteria and partial model generation

Author: Hierons RM
Liggesmeyer P
Poore J
Robinson-Mallett C
Publication venue
Publication date: 01/01/2008
Field of study

This paper considers the problem of integration testing the components of a timed distributed software system. We assume that communication between the components is specified using timed interface automata and use computational tree logic (CTL) to define communication-based coverage criteria that refer to send- and receive-statements and communication paths. The proposed method enables testers to focus during component integration on such parts of the specification, e.g. behaviour specifications or Markovian usage models, that are involved in the communication between components to be integrated. A more specific application area of this approach is the integration of test-models, e.g. a transmission gear can be tested based on separated models for the driver behaviour, the engine condition, and the mechanical and hydraulical transmission states. Given such a state-based specification of a distributed system and a concrete coverage goal, a model checker is used in order to determine the coverage or generate test sequences that achieve the goal. Given the generated test sequences we derive a partial test-model of the components from which the test sequences are derived. The partial model can be used to drive further testing and can also be used as the basis for producing additional partial models in incremental integration testing. While the process of deriving the test sequences could suffer from a combinatorial explosion, the effort required to generate the partial model is polynomial in the number of test sequences and their length. Thus, where it is not feasible to produce test sequences that achieve a given type of coverage it is still possible to produce a partial model on the basis of test sequences generated to achieve some other criterion. As a result, the process of generating a partial model has the potential to scale to large industrial software systems. While a particular model checker, UPPAAL, was used, it should be relatively straightforward to adapt the approach for use with other CTL based model checkers. A potential additional benefit of the approach is that it provides a visual description of the state-based testing of distributed systems, which may be beneficial in other contexts such as education and comprehension

Brunel University Research Archive